Add support for emitting ~generic vectorized C++ code.
The compiler now supports an --emit-c++ option, which generates generic vector C++ code. To actually compile this code, the user must provide C++ code that implements a variety of types and operations (e.g. adding two floating-point vector values together, comparing them, etc.).

There are two examples of this required code in examples/intrinsics: generic-16.h is a "generic" 16-wide implementation that does all required operations with scalar math; it's useful for demonstrating the requirements of the implementation. sse4.h shows a simple implementation for an SSE4 target that maps the emitted function calls to SSE intrinsics.

When using these example implementations with the ispc test suite, all but one or two tests pass with gcc and clang on Linux and OSX. There are currently ~10 failures with icc on Linux, and ~50 failures with MSVC 2010. (To be fixed in coming days.)

Performance varies: when running the examples through the sse4.h target, some have the same performance as when compiled with --target=sse4 from ispc directly (e.g. the options example), while noise is 12% slower, rt is 26% slower, and aobench is 2.2x slower. The details of this haven't yet been carefully investigated, but will be in coming days as well.

Issue #92.
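To make the requirement above concrete, here is a minimal scalar sketch of the kind of header a user might supply for a 4-wide target. The names `__vec4_i32`, `__vec4_i1`, `__add`, `__signed_less_than`, and `__select` are the ones that appear in the documentation added by this commit, and the body of `foo()` is copied from it. The struct layouts, the argument order of `__select`, and the `lane` helper are assumptions made for illustration; they are not taken from generic-16.h or sse4.h.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: four 32-bit lanes stored in a plain array.
struct __vec4_i32 {
    std::int32_t v[4];
    __vec4_i32() : v{0, 0, 0, 0} {}
    __vec4_i32(std::int32_t a, std::int32_t b, std::int32_t c, std::int32_t d)
        : v{a, b, c, d} {}
};

// Assumed layout: one bool per lane.
struct __vec4_i1 {
    bool v[4];
};

// Lane-wise addition; done in unsigned arithmetic so overflow wraps.
static inline __vec4_i32 __add(__vec4_i32 a, __vec4_i32 b) {
    __vec4_i32 r;
    for (int k = 0; k < 4; ++k)
        r.v[k] = static_cast<std::int32_t>(static_cast<std::uint32_t>(a.v[k]) +
                                           static_cast<std::uint32_t>(b.v[k]));
    return r;
}

// Lane-wise signed comparison a < b.
static inline __vec4_i1 __signed_less_than(__vec4_i32 a, __vec4_i32 b) {
    __vec4_i1 r{};
    for (int k = 0; k < 4; ++k)
        r.v[k] = a.v[k] < b.v[k];
    return r;
}

// Lane-wise select: take a where the mask is on, b otherwise
// (argument order is an assumption).
static inline __vec4_i32 __select(__vec4_i1 mask, __vec4_i32 a, __vec4_i32 b) {
    __vec4_i32 r;
    for (int k = 0; k < 4; ++k)
        r.v[k] = mask.v[k] ? a.v[k] : b.v[k];
    return r;
}

// Hypothetical helper for reading one lane; not part of the emitted code.
static inline std::int32_t lane(const __vec4_i32 &x, int k) {
    assert(k >= 0 && k < 4);
    return x.v[k];
}

// Function body as shown in the documentation added by this commit.
__vec4_i32 foo(__vec4_i32 i_llvm_cbe, __vec4_i32 j_llvm_cbe,
               __vec4_i1 __mask_llvm_cbe) {
    (void)__mask_llvm_cbe;  // unused in this example
    return (__select((__signed_less_than(i_llvm_cbe,
                                         __vec4_i32 (0u, 0u, 0u, 0u))),
                     __vec4_i32 (0u, 0u, 0u, 0u),
                     (__add(i_llvm_cbe, j_llvm_cbe))));
}
```

With these definitions, each lane of the result is 0 where the corresponding lane of `i` is negative and `i + j` elsewhere, which matches the original `(i < 0) ? 0 : i + j` expression.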
10
Makefile
@@ -57,9 +57,9 @@ YACC=bison -d -v -t
 
 ###########################################################################
 
-CXX_SRC=ast.cpp builtins.cpp ctx.cpp decl.cpp expr.cpp func.cpp ispc.cpp \
-	llvmutil.cpp main.cpp module.cpp opt.cpp stmt.cpp sym.cpp type.cpp \
-	util.cpp
+CXX_SRC=ast.cpp builtins.cpp cbackend.cpp ctx.cpp decl.cpp expr.cpp func.cpp \
+	ispc.cpp llvmutil.cpp main.cpp module.cpp opt.cpp stmt.cpp sym.cpp \
+	type.cpp util.cpp
 HEADERS=ast.h builtins.h ctx.h decl.h expr.h func.h ispc.h llvmutil.h module.h \
 	opt.h stmt.h sym.h type.h util.h
 TARGETS=avx avx-x2 sse2 sse2-x2 sse4 sse4-x2 generic-4 generic-8 generic-16
@@ -107,6 +107,10 @@ objs/%.o: %.cpp
 	@echo Compiling $<
 	@$(CXX) $(CXXFLAGS) -o $@ -c $<
 
+objs/cbackend.o: cbackend.cpp
+	@echo Compiling $<
+	@$(CXX) -fno-rtti -fno-exceptions $(CXXFLAGS) -o $@ -c $<
+
 objs/%.o: objs/%.cpp
 	@echo Compiling $<
 	@$(CXX) $(CXXFLAGS) -o $@ -c $<
@@ -12,7 +12,9 @@ length=0
 src=str(sys.argv[1])
 
 target = re.sub("builtins/target-", "", src)
+target = re.sub(r"builtins\\target-", "", target)
 target = re.sub("builtins/", "", target)
+target = re.sub(r"builtins\\", "", target)
 target = re.sub("\.ll$", "", target)
 target = re.sub("\.c$", "", target)
 target = re.sub("-", "_", target)
4342
cbackend.cpp
Normal file
File diff suppressed because it is too large
@@ -56,6 +56,7 @@ Contents:
 
   + `Basic Command-line Options`_
   + `Selecting The Compilation Target`_
+  + `Generating Generic C++ Output`_
   + `Selecting 32 or 64 Bit Addressing`_
   + `The Preprocessor`_
   + `Debugging`_
@@ -432,6 +433,65 @@ Intel® SSE2, use ``--target=sse2``. (As with the other options in this
 section, see the output of ``ispc --help`` for a full list of supported
 targets.)
 
+Generating Generic C++ Output
+-----------------------------
+
+In addition to generating object files or assembly output for specific
+targets like SSE2, SSE4, and AVX, ``ispc`` provides an option to generate
+"generic" C++ output. This
+
+As an example, consider the following simple ``ispc`` program:
+
+::
+
+    int foo(int i, int j) {
+        return (i < 0) ? 0 : i + j;
+    }
+
+If this program is compiled with the following command:
+
+::
+
+    ispc foo.ispc --emit-c++ --target=generic-4 -o foo.cpp
+
+Then ``foo()`` is compiled to the following C++ code (after various
+automatically-generated boilerplate code):
+
+::
+
+    __vec4_i32 foo(__vec4_i32 i_llvm_cbe, __vec4_i32 j_llvm_cbe,
+                   __vec4_i1 __mask_llvm_cbe) {
+        return (__select((__signed_less_than(i_llvm_cbe,
+                                             __vec4_i32 (0u, 0u, 0u, 0u))),
+                         __vec4_i32 (0u, 0u, 0u, 0u),
+                         (__add(i_llvm_cbe, j_llvm_cbe))));
+    }
+
+Note that the original computation has been expressed in terms of a number
+of vector types (e.g. ``__vec4_i32`` for a 4-wide vector of 32-bit integers
+and ``__vec4_i1`` for a 4-wide vector of boolean values) and in terms of
+vector operations on these types like ``__add()`` and ``__select()``).
+
+You are then free to provide your own implementations of these types and
+functions. For example, you might want to target a specific vector ISA, or
+you might want to instrument these functions for performance measurements.
+
+There is an example implementation of 4-wide variants of the required
+functions, suitable for use with the ``generic-4`` target in the file
+``examples/intrinsics/sse4.h``, and there is an example straightforward C
+implementation of the 16-wide variants for the ``generic-16`` target in the
+file ``examples/intrinsics/generic-16.h``. There is not yet comprehensive
+documentation of these types and the functions that must be provided for
+them when the C++ target is used, but a review of those two files should
+provide the basic context.
+
+If you are using C++ source emission, you may also find the
+``--c++-include-file=<filename>`` command line argument useful; it adds an
+``#include`` statement with the given filename at the top of the emitted
+C++ file; this can be used to easily include specific implementations of
+the vector types and functions.
+
+
 Selecting 32 or 64 Bit Addressing
 ---------------------------------
 
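The documentation above mentions instrumenting the vector functions for performance measurements. The following is a minimal sketch of that idea for `__add`, assuming a scalar 4-wide `__vec4_i32`. The `OpCounts` struct, the `g_opCounts` global, and the `lane` helper are hypothetical names introduced here for illustration; nothing like them is shown in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout: four 32-bit lanes stored in a plain array.
struct __vec4_i32 {
    std::int32_t v[4];
};

// Hypothetical counters for calls made by the emitted code.
struct OpCounts {
    std::size_t add = 0;
};
static OpCounts g_opCounts;

// Instrumented lane-wise addition: counts the call, then does the work.
static inline __vec4_i32 __add(__vec4_i32 a, __vec4_i32 b) {
    ++g_opCounts.add;
    __vec4_i32 r{};
    for (int k = 0; k < 4; ++k)
        r.v[k] = static_cast<std::int32_t>(static_cast<std::uint32_t>(a.v[k]) +
                                           static_cast<std::uint32_t>(b.v[k]));
    return r;
}

// Hypothetical helper for reading one lane.
static inline std::int32_t lane(const __vec4_i32 &x, int k) {
    assert(k >= 0 && k < 4);
    return x.v[k];
}
```

Because the emitted C++ only refers to the operations by name, swapping a counting header like this for a plain one should not require regenerating the ispc output.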
1428
examples/intrinsics/generic-16.h
Normal file
File diff suppressed because it is too large
3665
examples/intrinsics/sse4.h
Normal file
File diff suppressed because it is too large
@@ -13,6 +13,7 @@
   <ItemGroup>
     <ClCompile Include="ast.cpp" />
     <ClCompile Include="builtins.cpp" />
+    <ClCompile Include="cbackend.cpp" />
     <ClCompile Include="ctx.cpp" />
     <ClCompile Include="decl.cpp" />
     <ClCompile Include="expr.cpp" />
49
main.cpp
@@ -66,11 +66,15 @@ static void usage(int ret) {
     printf(" \t\ton 64-bit target architectures.)\n");
     printf(" [--arch={%s}]\t\tSelect target architecture\n",
            Target::SupportedTargetArchs());
+    printf(" [--c++-include-file=<name>]\t\tSpecify name of file to emit in #include statement in generated C++ code.\n");
     printf(" [--cpu=<cpu>]\t\t\tSelect target CPU type\n");
     printf(" <cpu>={%s}\n", Target::SupportedTargetCPUs());
     printf(" [-D<foo>]\t\t\t\t#define given value when running preprocessor\n");
     printf(" [--debug]\t\t\t\tPrint information useful for debugging ispc\n");
     printf(" [--emit-asm]\t\t\tGenerate assembly language file as output\n");
+#ifndef LLVM_2_9
+    printf(" [--emit-c++]\t\t\tEmit a C++ source file as output\n");
+#endif // !LLVM_2_9
     printf(" [--emit-llvm]\t\t\tEmit LLVM bitode file as output\n");
     printf(" [--emit-obj]\t\t\tGenerate object file file as output (default)\n");
     printf(" [-g]\t\t\t\tGenerate debugging information\n");
@@ -187,6 +191,7 @@ int main(int Argc, char *Argv[]) {
     char *file = NULL;
     const char *headerFileName = NULL;
     const char *outFileName = NULL;
+    const char *includeFileName = NULL;
 
     // Initiailize globals early so that we can set various option values
     // as we're parsing below
@@ -236,13 +241,20 @@ int main(int Argc, char *Argv[]) {
         }
         else if (!strcmp(argv[i], "--emit-asm"))
             ot = Module::Asm;
+#ifndef LLVM_2_9
+        else if (!strcmp(argv[i], "--emit-c++"))
+            ot = Module::CXX;
+#endif // !LLVM_2_9
         else if (!strcmp(argv[i], "--emit-llvm"))
             ot = Module::Bitcode;
         else if (!strcmp(argv[i], "--emit-obj"))
             ot = Module::Object;
         else if (!strcmp(argv[i], "--target")) {
             // FIXME: should remove this way of specifying the target...
-            if (++i == argc) usage(1);
+            if (++i == argc) {
+                fprintf(stderr, "No target specified after --target option.\n");
+                usage(1);
+            }
             target = argv[i];
         }
         else if (!strncmp(argv[i], "--target=", 9))
@@ -257,8 +269,10 @@ int main(int Argc, char *Argv[]) {
                 g->mathLib = Globals::Math_SVML;
             else if (!strcmp(lib, "system"))
                 g->mathLib = Globals::Math_System;
-            else
+            else {
+                fprintf(stderr, "Unknown --math-lib= option \"%s\".\n", lib);
                 usage(1);
+            }
         }
         else if (!strncmp(argv[i], "--opt=", 6)) {
             const char *opt = argv[i] + 6;
@@ -291,8 +305,10 @@ int main(int Argc, char *Argv[]) {
                 g->opt.disableGatherScatterFlattening = true;
             else if (!strcmp(opt, "disable-uniform-memory-optimizations"))
                 g->opt.disableUniformMemoryOptimizations = true;
-            else
+            else {
+                fprintf(stderr, "Unknown --opt= option \"%s\".\n", opt);
                 usage(1);
+            }
         }
         else if (!strcmp(argv[i], "--woff") || !strcmp(argv[i], "-woff")) {
             g->disableWarnings = true;
@@ -305,18 +321,27 @@ int main(int Argc, char *Argv[]) {
         else if (!strcmp(argv[i], "--wno-perf") || !strcmp(argv[i], "-wno-perf"))
             g->emitPerfWarnings = false;
         else if (!strcmp(argv[i], "-o")) {
-            if (++i == argc) usage(1);
+            if (++i == argc) {
+                fprintf(stderr, "No output file specified after -o option.\n");
+                usage(1);
+            }
             outFileName = argv[i];
         }
         else if (!strcmp(argv[i], "--outfile="))
             outFileName = argv[i] + strlen("--outfile=");
         else if (!strcmp(argv[i], "-h")) {
-            if (++i == argc) usage(1);
+            if (++i == argc) {
+                fprintf(stderr, "No header file name specified after -h option.\n");
+                usage(1);
+            }
             headerFileName = argv[i];
         }
-        else if (!strcmp(argv[i], "--header-outfile=")) {
+        else if (!strncmp(argv[i], "--header-outfile=", 17)) {
             headerFileName = argv[i] + strlen("--header-outfile=");
         }
+        else if (!strncmp(argv[i], "--c++-include-file=", 19)) {
+            includeFileName = argv[i] + strlen("--c++-include-file=");
+        }
         else if (!strcmp(argv[i], "-O0")) {
             g->opt.level = 0;
             optSet = true;
@@ -341,11 +366,16 @@ int main(int Argc, char *Argv[]) {
                    BUILD_DATE, BUILD_VERSION);
             return 0;
         }
-        else if (argv[i][0] == '-')
+        else if (argv[i][0] == '-') {
+            fprintf(stderr, "Unknown option \"%s\".\n", argv[i]);
             usage(1);
+        }
         else {
-            if (file != NULL)
+            if (file != NULL) {
+                fprintf(stderr, "Multiple input files specified on command "
+                        "line: \"%s\" and \"%s\".\n", file, argv[i]);
                 usage(1);
+            }
             else
                 file = argv[i];
         }
@@ -363,5 +393,6 @@ int main(int Argc, char *Argv[]) {
                 "be issued, but no output will be generated.");
 
     return Module::CompileAndOutput(file, arch, cpu, target, generatePIC,
-                                    ot, outFileName, headerFileName);
+                                    ot, outFileName, headerFileName,
+                                    includeFileName);
 }
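One of the hunks above replaces an exact `strcmp` test for `--header-outfile=` with a length-limited `strncmp`. An exact comparison can never succeed once a value follows the `=`, so the option would have been rejected as unknown. The sketch below shows the prefix-matching pattern in isolation; `optionValue` is a hypothetical helper written for this illustration and does not exist in main.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: if arg starts with prefix, return a pointer to the
// text following the prefix; otherwise return nullptr.
static const char *optionValue(const char *arg, const char *prefix) {
    assert(arg != nullptr && prefix != nullptr);
    const std::size_t n = std::strlen(prefix);
    return std::strncmp(arg, prefix, n) == 0 ? arg + n : nullptr;
}
```

Deriving the length from the prefix avoids hand-counted constants such as 17 and 19. Note that the unchanged `--outfile=` branch visible in the same hunk still appears to use `strcmp`.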
41
module.cpp
@@ -76,7 +76,6 @@
 #include <llvm/Target/TargetMachine.h>
 #include <llvm/Target/TargetOptions.h>
 #include <llvm/Target/TargetData.h>
-#include <llvm/PassManager.h>
 #include <llvm/Analysis/Verifier.h>
 #include <llvm/Support/CFG.h>
 #include <clang/Frontend/CompilerInstance.h>
@@ -584,7 +583,8 @@ Module::AddFunctionDefinition(Symbol *sym, const std::vector<Symbol *> &args,
 
 
 bool
-Module::writeOutput(OutputType outputType, const char *outFileName) {
+Module::writeOutput(OutputType outputType, const char *outFileName,
+                    const char *includeFileName) {
 #if defined(LLVM_3_0) || defined(LLVM_3_0svn) || defined(LLVM_3_1svn)
     if (diBuilder != NULL && outputType != Header)
         diBuilder->finalize();
@@ -610,6 +610,14 @@ Module::writeOutput(OutputType outputType, const char *outFileName) {
             if (strcasecmp(suffix, "o") && strcasecmp(suffix, "obj"))
                 fileType = "object";
             break;
+#ifndef LLVM_2_9
+        case CXX:
+            if (strcasecmp(suffix, "c") && strcasecmp(suffix, "cc") &&
+                strcasecmp(suffix, "c++") && strcasecmp(suffix, "cxx") &&
+                strcasecmp(suffix, "cpp"))
+                fileType = "c++";
+            break;
+#endif // !LLVM_2_9
         case Header:
             if (strcasecmp(suffix, "h") && strcasecmp(suffix, "hh") &&
                 strcasecmp(suffix, "hpp"))
@@ -623,12 +631,18 @@ Module::writeOutput(OutputType outputType, const char *outFileName) {
 
     if (outputType == Header)
         return writeHeader(outFileName);
-    else {
-        if (outputType == Bitcode)
-            return writeBitcode(module, outFileName);
-        else
-            return writeObjectFileOrAssembly(outputType, outFileName);
+    else if (outputType == Bitcode)
+        return writeBitcode(module, outFileName);
+#ifndef LLVM_2_9
+    else if (outputType == CXX) {
+        extern bool WriteCXXFile(llvm::Module *module, const char *fn,
+                                 int vectorWidth, const char *includeName);
+        return WriteCXXFile(module, outFileName, g->target.vectorWidth,
+                            includeFileName);
     }
+#endif // !LLVM_2_9
+    else
+        return writeObjectFileOrAssembly(outputType, outFileName);
 }
 
 
@@ -1568,7 +1582,8 @@ lCreateDispatchModule(std::map<std::string, FunctionTargetVariants> &functions)
 int
 Module::CompileAndOutput(const char *srcFile, const char *arch, const char *cpu,
                          const char *target, bool generatePIC, OutputType outputType,
-                         const char *outFileName, const char *headerFileName) {
+                         const char *outFileName, const char *headerFileName,
+                         const char *includeFileName) {
     if (target == NULL || strchr(target, ',') == NULL) {
         // We're only compiling to a single target
         if (!Target::GetTarget(arch, cpu, target, generatePIC, &g->target))
@@ -1577,7 +1592,7 @@ Module::CompileAndOutput(const char *srcFile, const char *arch, const char *cpu,
         m = new Module(srcFile);
         if (m->CompileFile() == 0) {
             if (outFileName != NULL)
-                if (!m->writeOutput(outputType, outFileName))
+                if (!m->writeOutput(outputType, outFileName, includeFileName))
                     return 1;
             if (headerFileName != NULL)
                 if (!m->writeOutput(Module::Header, headerFileName))
@@ -1590,6 +1605,14 @@ Module::CompileAndOutput(const char *srcFile, const char *arch, const char *cpu,
         return errorCount > 0;
     }
     else {
+#ifndef LLVM_2_9
+        if (outputType == CXX) {
+            Error(SourcePos(), "Illegal to specify more then one target when "
+                  "compiling C++ output.");
+            return 1;
+        }
+#endif // !LLVM_2_9
+
         // The user supplied multiple targets
         std::vector<std::string> targets = lExtractTargets(target);
         Assert(targets.size() > 1);
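The new `case CXX:` branch above tests the output file's suffix against a list of common C++ extensions. As a standalone illustration of that check, here is a small sketch. `hasCxxSuffix` and `equalsIgnoreCase` are hypothetical names, and the portable comparison loop stands in for `strcasecmp`. What the compiler does when the suffix is unrecognized is not visible in the hunk, so the sketch only reports whether the suffix is recognized.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Case-insensitive string equality (portable stand-in for strcasecmp() == 0).
static bool equalsIgnoreCase(const char *a, const char *b) {
    for (; *a != '\0' && *b != '\0'; ++a, ++b) {
        if (std::tolower(static_cast<unsigned char>(*a)) !=
            std::tolower(static_cast<unsigned char>(*b)))
            return false;
    }
    return *a == *b;  // both strings must end together
}

// Hypothetical helper: true if fileName ends in one of the suffixes that the
// CXX case in the hunk above accepts.
static bool hasCxxSuffix(const char *fileName) {
    assert(fileName != nullptr);
    const char *dot = std::strrchr(fileName, '.');
    if (dot == nullptr)
        return false;
    static const char *const known[] = {"c", "cc", "c++", "cxx", "cpp"};
    for (const char *k : known) {
        if (equalsIgnoreCase(dot + 1, k))
            return true;
    }
    return false;
}
```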
13
module.h
@@ -80,6 +80,9 @@ public:
     enum OutputType { Asm,      /** Generate text assembly language output */
                       Bitcode,  /** Generate LLVM IR bitcode output */
                       Object,   /** Generate a native object file */
+#ifndef LLVM_2_9
+                      CXX,      /** Generate a C++ file */
+#endif // !LLVM_2_9
                       Header    /** Generate a C/C++ header file with
                                     declarations of 'export'ed functions, global
                                     variables, and the types used by them. */
@@ -108,6 +111,10 @@ public:
                               inclusion from C/C++ code with declarations of
                               types and functions exported from the given ispc
                               source file.
+        @param includeFileName If non-NULL, gives the filename for the C++
+                               backend to emit in an #include statement to
+                               get definitions of the builtins for the generic
+                               target.
         @return               Number of errors encountered when compiling
                               srcFile.
      */
@@ -115,7 +122,8 @@ public:
                                 const char *cpu, const char *targets,
                                 bool generatePIC, OutputType outputType,
                                 const char *outFileName,
-                                const char *headerFileName);
+                                const char *headerFileName,
+                                const char *includeFileName);
 
     /** Total number of errors encountered during compilation. */
     int errorCount;
@@ -138,7 +146,8 @@ private:
         true on success, false if there has been an error.  The given
         filename may be NULL, indicating that output should go to standard
         output. */
-    bool writeOutput(OutputType ot, const char *filename);
+    bool writeOutput(OutputType ot, const char *filename,
+                     const char *includeFileName = NULL);
     bool writeHeader(const char *filename);
     bool writeObjectFileOrAssembly(OutputType outputType, const char *filename);
     static bool writeObjectFileOrAssembly(llvm::TargetMachine *targetMachine,
10
opt.cpp
@@ -184,10 +184,12 @@ Optimize(llvm::Module *module, int optLevel) {
     llvm::PassManager optPM;
     llvm::FunctionPassManager funcPM(module);
 
-    llvm::TargetLibraryInfo *targetLibraryInfo =
-        new llvm::TargetLibraryInfo(llvm::Triple(module->getTargetTriple()));
-    optPM.add(targetLibraryInfo);
-    optPM.add(new llvm::TargetData(module));
+    if (g->target.isa != Target::GENERIC) {
+        llvm::TargetLibraryInfo *targetLibraryInfo =
+            new llvm::TargetLibraryInfo(llvm::Triple(module->getTargetTriple()));
+        optPM.add(targetLibraryInfo);
+        optPM.add(new llvm::TargetData(module));
+    }
 
 #if defined(LLVM_3_0) || defined(LLVM_3_0svn) || defined(LLVM_3_1svn)
     optPM.add(llvm::createIndVarSimplifyPass());