Flag 32-bit vector types as only requiring 32-bit alignment (preemptive
bug fix for 32xi1 vectors).
Force module datalayouts to be the same before linking them to silence
an LLVM warning.
Finishes issue #309.
When "break", "continue", or "return" is used under varying control flow,
we now always check the execution mask to see if all of the program
instances are executing it. (Previously, this was only done with "cbreak",
"ccontinue", and "creturn", which are now deprecated.)
An important effect of this change is that it fixes a family of cases
where we could end up running with an "all off" execution mask, which isn't
supposed to happen, as it leads to all sorts of invalid behavior.
This change does cause the volume rendering example to run 9% slower, but
doesn't affect the other examples.
Issue #257.
This fixes a crash if 'cbreak' was used in a 'switch'. Renamed
FunctionEmitContext::SetLoopMask() to SetBlockEntryMask(), and
similarly the loopMask member variable.
Fixes to __load and __store.
Added __add, __mul, __equal, __not_equal, __extract_elements, __smear_i64, __cast_sext, __cast_zext,
and __scatter_base_offsets32_float.
__rcp_varying_float now has a fast-math and full-precision implementation.
Some modules require an include of unistd.h (e.g. for getcwd and isatty
definitions).
These changes were required to build successfully on a Fedora 17 system,
using GCC 4.7.0 & glibc-headers 2.15.
If --opt=fast-math is used then the generated code contains:
#define ISPC_FAST_MATH 1
Otherwise it contains:
#undef ISPC_FAST_MATH
This allows the generic headers to support the user's request.
This should help with performance of the generated code.
Updated the relevant header files (sse4.h, generic-16.h, generic-32.h, generic-64.h)
Updated generic-32.h and generic-64.h to the new memory API
(Rather than implicitly with a using declaration.) This will
allow for some further changes to ISPC's C backend, without collision
with ISPC's namespace. This change aims to have no effect on the code
generated by the compiler, it should be a big no-op; except for its
side-effects on maintainability.
We need to do this since it's illegal to have nested foreach statements, but
nested foreach_unique, or foreach_unique inside foreach, etc., are all fine.
It's now legal to write:
struct Foo { Foo *next; };
previously, a predeclaration "struct Foo;" was required. This fixes
issue #287.
This change also fixes a bug where multiple forward declarations
"struct Foo; struct Foo;" would incorrectly issue an error on the
second one.
The string to be printed is accumulated into a local buffer before being sent to
puts(). This ensure that if multiple threads are running and printing at the
same time, their output won't be interleaved (across individual print statements--
it still may be interleaved across different print statements, just like in C).
Issue #293.
In particular, this gives us desired behavior for NaNs (all compares
involving a NaN evaluate to true). This in turn allows writing the
canonical isnan() function as "v != v".
Added isnan() to the standard library as well.