(This in particular would happen when there was an error in the body of a struct
definition and we were left with an UndefinedStructType and then later tried to
do loads/stores from/to it.)
Issue #356.
When "break", "continue", or "return" is used under varying control flow,
we now always check the execution mask to see if all of the program
instances are executing it. (Previously, this was only done with "cbreak",
"ccontinue", and "creturn", which are now deprecated.)
An important effect of this change is that it fixes a family of cases
where we could end up running with an "all off" execution mask, which isn't
supposed to happen, as it leads to all sorts of invalid behavior.
This change does cause the volume rendering example to run 9% slower, but
doesn't affect the other examples.
Issue #257.
This fixes a crash if 'cbreak' was used in a 'switch'. Renamed
FunctionEmitContext::SetLoopMask() to SetBlockEntryMask(), and
similarly the loopMask member variable.
We need to do this since it's illegal to have nested foreach statements, but
nested foreach_unique, or foreach_unique inside foreach, etc., are all fine.
Previously, we'd bitcast e.g. a vector of floats to a vector of i32s and then
use the i32 variant of masked_load/masked_store/gather/scatter. Now, we have
separate float/double variants of each of those.
It can sometimes be useful to know the general place we were in the program
when an assertion hit; when the position is available / applicable, this
macro is now used.
Issue #268.
We now have a set of template functions CastType<AtomicType>, etc., that in
turn use a new typeId field in each Type instance, allowing them to be inlined
and to be quite efficient.
This improves front-end performance for a particular large program by 28%.
It's now possible to successfully print out the value of programIndex,
programCount, etc., in the debugger. The issue was that they were
defined as having InternalLinkage, which meant that DCE removed them
at the end of compilation. Now they're declared to have WeakODRLinkage,
which ensures that one copy survives (but there aren't multiply-defined
symbols when compiling multiple files.)
The DIType passed to this method should correspond to the
FunctionType of the function, not its return type.
The first parameter should be the DIScope for the compile unit,
not the DIFile.
We previously had the unmangled function name and the mangled
function name interchanged.
The argument corresponding to "first line number of the function" was
missing, which in turn led to subsequent arguments being off, and thus
providing bogus values vs. what was supposed to be passed.
Rename FunctionEmitContext::diFunction to diSubprogram, to better
reflect its type.
We now try harder to keep the names of instructions related to the
initial names of variables they're derived from and so forth. This
is useful for making both LLVM IR as well as generated C++ code
easier to correlate back to the original ispc source code.
Issue #244.
Now a declaration like 'struct Foo;' can be used to establish the
name of a struct type, without providing a definition. One can
pass pointers to such types around the system, but can't do much
else with them (as in C/C++).
Issue #125.
With AOS data, we can often coalesce the accesses into gathers for the main
part of foreach loops but only fail on the last bits where the mask is not
all on (since the coalescing code doesn't handle mixed masks, yet.) Before,
we'd report success with coalescing and then also report that gathers were needed
for the same accesses that were coalesced, which was a) confusing, and b)
didn't accurately represent what was going on for the majority of the loop
iterations.
There's now a SOA variability class (in addition to uniform,
varying, and unbound variability); the SOA factor must be a
positive power of 2.
When applied to a type, the leaf elements of the type (i.e.
atomic types, pointer types, and enum types) are widened out
into arrays of the given SOA factor. For example, given
struct Point { float x, y, z; };
Then "soa<8> Point" has a memory layout of "float x[8], y[8],
z[8]".
Furthermore, array indexing syntax has been augmented so that
when indexing into arrays of SOA-variability data, the two-stage
indexing (first into the array of soa<> elements and then into
the leaf arrays of SOA data) is performed automatically.
Previously, we uniqued AtomicTypes, so that they could be compared
by pointer equality, but with forthcoming SOA variability changes,
this would become too unwieldy (lacking a more general / ubiquitous
type uniquing implementation.)
Now, if a struct member has an explicit 'uniform' or 'varying'
qualifier, then that member has that variability, regardless of
the variability of the struct's variability. Members without
'uniform' or 'varying' have unbound variability, and in turn
inherit the variability of the struct.
As a result of this, now structs can properly be 'varying' by default,
just like all the other types, while still having sensible semantics.
Previously, when we had a switch statement with a uniform switch condition
but a 'break' statement that was under varying control flow inside the
switch, we'd promote the switch condition to be varying so that the
break would work correctly.
Now, we leave the condition as uniform and are thus able to use the
more-efficient LLVM switch instruction in this case.
Issue #156.
Specialize the code for the innermost loop to not do any masking
computations for the innermost dimension for the iterations where
we are certainly working on a full vector's worth of data.
This fix improves performance/code quality of "foreach" such that
it's essentially the same as the equivalent "for" loop.
Fixes issue #151.