Set the Module's target appropriately when it's first created.
Compile separate 32 and 64 bit versions of the builtins-c bitcocde
and load the appropriate one based on the target we're compiling
for.
Just pulling out the elements and doing a set of scalar equality tests
is the best approach for those (nearly 2x better than the rotate and
vector equality check that we use for 32-bit stuff).
These get slightly wrong results for zero and the denorms and also
don't handle the Inf/NaN stuff correctly, but are much more efficient
than the full versions of these routines.
- Renamed stdlib-sse.ll to builtins-sse.ll (etc.) in an attempt to better indicate
the fact that the stuff in those files has a role beyond implementing stuff for
the standard library.
- Moved declarations of the various __pseudo_* functions from being done with LLVM
API calls in builtins.cpp to just straight up declarations in LLVM assembly
language in builtins.m4. (Much less code to do it this way, and more clear what's
going on.)
Fixes issue #77. Previously, it dumped out the entire module every time
a new function was defined, which got to be quite a lot of output by
the time the stdlib functions were all added!
Fixes issue #73. Previously, if we had e.g. an int16 type that was being shifted
left by 1, then the constant integer 1 would come in as an int32, we'd convert
the int16 to an int32, and then we'd do the shift. Now, for shifts, the type
of the expression is always the same as the type of the value being shifted.
Add optimization patterns to detect and simplify masked loads and stores
with the mask all on / all off.
Enable AVX for LLVM 3.0 builds (still generally hits bugs / unimplemented
stuff on the LLVM side, but it's getting there).
(The string for which c_str() was called was just a temporary, so its destructor ran after funcName was initialized, leading funcName to point at freed memory.)
This commit adds support for swizzles like "foo.zy" (if "foo" is,
for example, a float<3> type) as rvalues. (Still need support for
swizzles as lvalues.)
Don't create ispc-callable symbols for other functions that we find in the LLVM
bitcode files that are loaded up and linked into the module so that they can be
called from ispc stdlib functions. This fixes an issue where we had a clash
between the declared versions of double sin(double) and the corresponding
ispc stdlib routines for uniform doubles, which in turn led to bogus code
being generated for calls to those ispc stdlib functions.
- In the ispc-generated header files, a #define now indicates which compilation target
was used.
- The examples use utility routines from the new file examples/cpuid.h to check the
system's CPU's capabilities to see if it supports the ISA that was used for
compiling the example code and print error messages if things aren't going to
work...
This way, we match C/C++ in that casting a bool to an int gives either the value
zero or the value one. There is a new stdlib function int sign_extend(bool)
that does sign extension for cases where that's desired.