The stuff in decl.h/decl.cpp is messy, largely due to its close mapping
to C-style variable declarations. This checkin updates all of the
declaration statement, variable, and function code to operate on symbols
and types directly. Thus, Decl*-related stuff is now localized
to decl.h/decl.cpp and the parser.
Issue #13.
Specifically, we had been using the full mask for all gathers, rather than
using the internal mask when we were loading from locally-declared arrays.
Thus, given code like:
    uniform float x[programCount] = { ... };
    float xx = x[programIndex];
Previously, when this code was in a function where it wasn't known that the
mask was all on, we didn't generate a plain vector load to initialize xx,
even though we should have. Now we do.
It's not clear that these are actually all that helpful.
This also works around issue #89, wherein code like "int8 = 0" would
give a warning about conversion from int32 to int8.
Generalize the overload resolution code to be based on estimating a
cost for various overload options and picking the one with the
minimal cost.
Add a step that considers type conversions that are guaranteed to
not lose information in function overload resolution.
Print better diagnostics when we can't find an unambiguous match.
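A rough sketch of cost-based overload resolution, written in Python
rather than the compiler's own code. The type names, the LOSSLESS table,
and the cost values 0/1/10 are illustrative assumptions, not the actual
rules or numbers used by the compiler:

```python
# Hypothetical table of conversions that cannot lose information.
LOSSLESS = {
    "int8": {"int16", "int32", "int64"},
    "int16": {"int32", "int64"},
    "int32": {"int64", "double"},
    "float": {"double"},
}

def conversion_cost(arg, param):
    """Assumed cost of passing a value of type `arg` to a `param` parameter."""
    if arg == param:
        return 0                      # exact match
    if param in LOSSLESS.get(arg, ()):
        return 1                      # conversion guaranteed not to lose information
    return 10                         # any other conversion: strongly disfavored

def resolve(arg_types, candidates):
    """Pick the candidate with the minimal total cost.

    `candidates` maps a printable signature to its tuple of parameter types.
    Raises LookupError if nothing matches or if the best cost is tied.
    """
    costs = {}
    for name, params in candidates.items():
        if len(params) == len(arg_types):
            costs[name] = sum(conversion_cost(a, p)
                              for a, p in zip(arg_types, params))
    if not costs:
        raise LookupError("no candidate takes %d argument(s)" % len(arg_types))
    best = min(costs.values())
    winners = sorted(n for n, c in costs.items() if c == best)
    if len(winners) > 1:
        raise LookupError("ambiguous call; equally good candidates: "
                          + ", ".join(winners))
    return winners[0]
```

With candidates f(int32) and f(float), an int8 argument selects f(int32)
under these assumed costs, since that conversion is lossless.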
In particular, this fixes issue #81, where a global variable access was leading to
ConstantExpressions showing up in this code, which it wasn't previously expecting.
Specifically, now we can work through phi nodes in the IR to detect cases
where an index value is actually the same across lanes or is linear across
the lanes. For example, this is a loop that used to require gathers but
is now turned into vector loads:
    for (int i = programIndex; i < 16; i += programCount)
        sum += a[i];
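The idea of working through phi nodes can be sketched as a
guess-and-verify analysis, shown here in Python on a toy expression form.
The tuple encoding and the names lane_stride/phi_stride are hypothetical;
the real pass operates on LLVM IR and may work differently:

```python
def lane_stride(node, defs, assumed=None):
    """Per-lane stride of `node`: 0 = same in all lanes, 1 = consecutive
    across lanes, None = unknown (treat as arbitrary varying)."""
    assumed = {} if assumed is None else assumed
    kind = node[0]
    if kind == "uniform":            # e.g. a constant or programCount
        return 0
    if kind == "program_index":      # lane k holds the value k
        return 1
    if kind == "add":
        a = lane_stride(node[1], defs, assumed)
        b = lane_stride(node[2], defs, assumed)
        return None if a is None or b is None else a + b
    if kind == "ref":                # use of a named phi node
        name = node[1]
        if name in assumed:          # inside the phi's own cycle
            return assumed[name]
        return phi_stride(name, defs, assumed)
    return None

def phi_stride(name, defs, assumed):
    """Guess a stride for the phi, then check that every incoming value
    has that stride under the guess (an inductive argument)."""
    for guess in (0, 1):
        trial = dict(assumed)
        trial[name] = guess
        if all(lane_stride(v, defs, trial) == guess for v in defs[name]):
            return guess
    return None

# i = phi(programIndex, i + programCount), as in the loop above
defs = {"i": [("program_index",),
              ("add", ("ref", "i"), ("uniform",))]}
```

Here lane_stride(("ref", "i"), defs) evaluates to 1, so a[i] touches
consecutive elements and a vector load suffices in this model.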
Fixes issue #107.
We now maintain the distinction between the value of the mask passed into a
function and the "internal" mask within the function that only accounts for
varying control flow within the function.
The full mask (the AND of the function mask and the internal mask) must be used
for assignments to static and global variables and to reference function parameters.
Further, it is the appropriate mask to use for making decisions about varying
control flow. However, we can use the internal mask for assignments to variables
declared in the current function (including the return value and non-reference
parameters to the function). Doing so allows us to catch a few more cases where
the internal mask is all on, even if the mask coming into the function wasn't all
on, and thence use moves rather than blends for those assignments. (Which in
turn can allow additional optimizations to happen.)
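A small Python model of the mask choice described above, with masks as
lists of per-lane booleans. The storage-class strings and function names
are made up for illustration and do not correspond to compiler internals:

```python
LOCAL = "local"  # declared in the current function, including the return
                 # value and non-reference parameters

def full_mask(function_mask, internal_mask):
    """AND of the mask passed into the function and the internal mask."""
    return [f and i for f, i in zip(function_mask, internal_mask)]

def mask_for_assignment(storage, function_mask, internal_mask):
    """Locals may use the internal mask; everything else (static, global,
    reference parameters) needs the full mask."""
    if storage == LOCAL:
        return list(internal_mask)
    return full_mask(function_mask, internal_mask)

def assign(old, new, mask):
    """Return the kind of operation needed and the resulting lane values."""
    if all(mask):
        return "move", list(new)      # mask all on: plain move
    return "blend", [n if m else o for o, n, m in zip(old, new, mask)]
```

For example, with function mask [on, off, on, on] and an internal mask
that is all on, an assignment to a local is a move in this model, while
the same assignment to a global is a blend that leaves lane 1 untouched.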
Fixes issue #23.
Added AST and Function classes.
Now, we parse the whole file and build up the AST for all of the
functions in the Module before we emit IR for them (whereas before,
we generated IR along the way as we parsed the source file).
This fixes an issue with undefined SVML symbols with code that called
transcendental functions in the standard library, even when the SVML
math library hadn't been selected.
Makefile and vcxproj file updates.
Also modified vcxproj files so that the various files ispc generates go into $(TargetDir),
not the current directory.
Modified the ray tracer example to not have uniform short-vector types in its app-visible
datatypes (these are laid out differently on SSE vs AVX); there was an existing lurking
bug in the way this was done before.
If a flag along the lines of "--target=sse4,avx-x2" is provided on the command-line,
then the program will be compiled for each of the given targets, with a separate
output file generated for each one. Further, an output file is also created with
dispatch functions that check the current system's CPU and then choose the best
available variant.
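The overall flow might be pictured with the Python sketch below. The
per-target file naming and the preference order passed to the dispatcher
are assumptions made for the example, and the CPU check is reduced to a
set of target names the caller says are runnable:

```python
def split_targets(flag_value):
    """Split the value of a flag like --target=sse4,avx-x2 into targets."""
    return [t.strip() for t in flag_value.split(",") if t.strip()]

def output_names(base, targets):
    """One output file per target (hypothetical naming scheme)."""
    return {t: "%s_%s.o" % (base, t) for t in targets}

def make_dispatcher(variants, preference):
    """Build a dispatch function over per-target variants.

    `variants` maps target name -> callable; `preference` lists targets
    from best to worst (an assumed ordering).
    """
    def dispatch(supported, *args):
        # `supported` stands in for the runtime CPU check
        for target in preference:
            if target in variants and target in supported:
                return variants[target](*args)
        raise RuntimeError("no compiled variant runs on this CPU")
    return dispatch
```

A dispatcher built with preference ["avx-x2", "sse4"] calls the sse4
variant when only sse4 is reported as supported, and the avx-x2 variant
when both are.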
Issue #11.
Specifically, launch all of the tasks in one statement, rather than
still looping over spans in y and launching a collection of tasks
across x for each span. This seems to give a few percent better
performance.
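The difference between the two launch patterns can be modeled in Python
as below, where each inner list or flat list stands for one launch
statement. The tile indexing (x fastest) is an assumption for the sketch,
not necessarily what the example code does:

```python
def launches_per_span(nx, ny):
    """Old pattern: one launch per y span, each covering all x tiles."""
    return [[(tx, ty) for tx in range(nx)] for ty in range(ny)]

def launch_all(nx, ny):
    """New pattern: a single launch of nx * ny tasks; each task recovers
    its (x, y) tile from its flat task index."""
    return [(i % nx, i // nx) for i in range(nx * ny)]
```

Both cover the same tiles in the same order; the second issues one
launch instead of ny of them.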
Don't print more than 3 lines of source file context with errors.
(Any more than that is almost certainly not the Right Thing to do.)
Make some parsing error messages more clear.