For now this target just uses the same builtins-*.ll files as the
regular AVX1 target. Once the gather intrinsic is available from
LLVM, we'll want to have custom target files that call out to that
for gathers. (The integer min/max intrinsics should be wired up to
the __{min,max}_varying_{int,uint}*() builtins at that point as
well.)
Previously, they all went into one big pile that was never cleaned up;
this was the wrong thing to do in a world where one might have a
function declaration inside another functions, say.
Allow atomic types to be initialized with single-element expression lists:
int x = { 5 };
Issue an error if a storage class is provided with a function parameter.
Issue an error if two members of a struct have the same name.
Issue an error on trying to assign to a struct with a const member, even if
the struct itself isn't const.
Issue an error if a function is redefined.
Issue an error if a function overload is declared that differs only in return
type from a previously-declared function.
Issue an error if "inline" or "task" qualifiers are used outside of function
declarations.
Allow trailing ',' at the end of enumerator lists.
Multiple tests for all of the above.
Pointers can be either uniform or varying, and behave correspondingly.
e.g.: "uniform float * varying" is a varying pointer to uniform float
data in memory, and "float * uniform" is a uniform pointer to varying
data in memory. Like other types, pointers are varying by default.
Pointer-based expressions, & and *, sizeof, ->, pointer arithmetic,
and the array/pointer duality all bahave as in C. Array arguments
to functions are converted to pointers, also like C.
There is a built-in NULL for a null pointer value; conversion from
compile-time constant 0 values to NULL still needs to be implemented.
Other changes:
- Syntax for references has been updated to be C++ style; a useful
warning is now issued if the "reference" keyword is used.
- It is now illegal to pass a varying lvalue as a reference parameter
to a function; references are essentially uniform pointers.
This case had previously been handled via special case call by value
return code. That path has been removed, now that varying pointers
are available to handle this use case (and much more).
- Some stdlib routines have been updated to take pointers as
arguments where appropriate (e.g. prefetch and the atomics).
A number of others still need attention.
- All of the examples have been updated
- Many new tests
TODO: documentation
Added support for resolving dimensions of multi-dimensional unsized arrays
from their initializer exprerssions (previously, only the first dimension
would be resolved.)
Added checks to make sure that no unsized array dimensions remain after
doing this (except for the first dimensision of array parameters to
functions.)
Substantial improvements and generalizations to the parsing and
declaration handling code to properly parse declarations involving
pointers. (No change to user-visible functionality, but this
lays groundwork for supporting a more general pointer model.)
Be better about tracking the full extent of expressions in the parser;
this leads to more intelligible error messages when we indicate where
exactly the error happened.
The Expr::TypeConv() method has been replaced with both a
CanConvertTypes() routine that indicates whether one type
can be converted to another and a TypeConvertExpr()
routine that provides the same functionality as
Expr::TypeConv() used to.
The stuff in decl.h/decl.cpp is messy, largely due to its close mapping
to C-style variable declarations. This checkin has updated code throughout
all of the declaration statement, variable, and function code that operates
on symbols and types directly. Thus, Decl* related stuff is now localized
to decl.h/decl.cpp and the parser.
Issue #13.
Added AST and Function classes.
Now, we parse the whole file and build up the AST for all of the
functions in the Module before we emit IR for the functions (vs. before,
when we generated IR along the way as we parsed the source file.)
If a flag along the lines of "--target=sse4,avx-x2" is provided on the command-line,
then the program will be compiled for each of the given targets, with a separate
output file generated for each one. Further, an output file with dispatch functions
that check the current system's CPU and then chooses the best available variant
is also created.
Issue #11.
Within each function that launches tasks, we now can easily track which
tasks that function launched, so that the sync at the end of the function
can just sync on the tasks launched by that function (not all tasks
launched by all functions.)
Implementing this led to a rework of the task system API that ispc generates
code to call; the example task systems in examples/tasksys.cpp have been
updated to conform to this API. (The updated API is also documented in
the ispc user's guide.)
As part of this, "launch[n]" syntax was added to launch a number of tasks
in a single launch statement, rather than requiring a loop over 'n' to
launch n tasks.
This commit thus fixes issue #84 (enhancement to launch multiple tasks from
a single launch statement) as well as issue #105 (recursive task launches
were broken).
Go back to running both sides of 'if' statements with masking and without
branching if we can determine that the code is relatively simple (as per
the simple cost model), and is safe to run even if the mask is 'all off'.
This gives a bit of a performance improvement for some of the examples
(most notably, the ray tracer), and is the code that one wants generated
in this case anyhow.
This is currently only used to decide whether it's worth doing an
"are all lanes running" check at the start of functions--for small
functions, it's not worth the overhead.
The cost is estimated relatively early in compilation (e.g. before
we know if an array access is a scatter/gather or not, before
constant folding, etc.), so there are many known shortcomings.
If no CPU is specified, use the host CPU type, not just a default of "nehalem".
Provide better features strings to the LLVM target machinery.
-> Thus ensuring that LLVM doesn't generate SSE>2 instructions for the SSE2
target (Fixes issue #82).
-> Slight code improvements from using cmovs in generated code now
Use the llvm popcnt intrinsic for the SSE2 target now (it now generates code
that doesn't call the popcnt instruction now that we properly tell LLVM
which instructions are and aren't available for SSE2.)
Set the Module's target appropriately when it's first created.
Compile separate 32 and 64 bit versions of the builtins-c bitcocde
and load the appropriate one based on the target we're compiling
for.
Fixes issue #77. Previously, it dumped out the entire module every time
a new function was defined, which got to be quite a lot of output by
the time the stdlib functions were all added!
- In the ispc-generated header files, a #define now indicates which compilation target
was used.
- The examples use utility routines from the new file examples/cpuid.h to check the
system's CPU's capabilities to see if it supports the ISA that was used for
compiling the example code and print error messages if things aren't going to
work...