We now have a set of template functions CastType<AtomicType>, etc., that in
turn use a new typeId field in each Type instance, allowing them to be inlined
and to be quite efficient.
This improves front-end performance for a particular large program by 28%.
Before, if the function was declared before being defined, then the symbol's
SourcePos would be left set to the position of the declaration. This ended
up getting the debugging symbols mixed up in this case, which was undesirable.
Debugging information for functions that are inlined or static and
not used still hangs around after compilation; now we go through the
debugging info and remove the entries for any DISubprograms that
don't have their original functions left in the Module after
optimization.
Now a declaration like 'struct Foo;' can be used to establish the
name of a struct type, without providing a definition. One can
pass pointers to such types around the system, but can't do much
else with them (as in C/C++).
Issue #125.
The decl.* code now no longer interacts with Symbols, but just returns
names, types, initializer expressions, etc., as needed. This makes the
code a bit more understandable.
Fixes issues #171 and #130.
When reporting that a function has illegally been overloaded only
by return type, include "task", "export", and "extern "C"", as appropriate
in the error message to make clear what the issue is.
Finishes issue #216.
This actually wasn't a good idea, since we'd like ispc programs to be able to
have varying globals that it uses internally among ispc code, without having
errors about varying globals when generating headers.
Issue #214.
When we have an "extern" global, now we no longer inadvertently define
storage for it. Further, we now successfully do define storage when we
encounter a definition following one or more extern declarations.
Issues #215 and #217.
Rather than explicitly building a DAG and doing a topological sort,
just traverse structs recursively and emit declarations for all of
their dependent structs before emitting the original struct declaration.
Not only is this simpler than the previous implementation, but it
fixes a bug where we'd hit an assert if we had a struct with multiple
contained members of another struct type.
In short, we inadvertently weren't checking whether pointers themselves
were varying, which in turn led to an assertion later if an exported
function did have a varying parameter.
Issue #187.
There's now a SOA variability class (in addition to uniform,
varying, and unbound variability); the SOA factor must be a
positive power of 2.
When applied to a type, the leaf elements of the type (i.e.
atomic types, pointer types, and enum types) are widened out
into arrays of the given SOA factor. For example, given
struct Point { float x, y, z; };
Then "soa<8> Point" has a memory layout of "float x[8], y[8],
z[8]".
Furthermore, array indexing syntax has been augmented so that
when indexing into arrays of SOA-variability data, the two-stage
indexing (first into the array of soa<> elements and then into
the leaf arrays of SOA data) is performed automatically.
Previously, we uniqued AtomicTypes, so that they could be compared
by pointer equality, but with forthcoming SOA variability changes,
this would become too unwieldy (lacking a more general / ubiquitous
type uniquing implementation.)
Now, if a struct member has an explicit 'uniform' or 'varying'
qualifier, then that member has that variability, regardless of
the variability of the struct's variability. Members without
'uniform' or 'varying' have unbound variability, and in turn
inherit the variability of the struct.
As a result of this, now structs can properly be 'varying' by default,
just like all the other types, while still having sensible semantics.
Add a number of additional error cases in the grammar.
Enable bison's extended error reporting, to get better messages about the
context of errors and the expected (but not found) tokens at errors.
Improve the printing of these by providing an implementation of yytnamerr
that rewrites things like "TOKEN_MUL_ASSIGN" to "*=" in error messages.
Print the source location (using Error() when yyerror() is called; wiring
this up seems to require no longer building a 'pure parser' but having
yylloc as a global, which in turn led to having to update all of the uses of
it (which previously accessed it as a pointer).
Updated a number of tests_errors for resulting changesin error text.
Don't let the preprocessor remove comments anymore, so that the rules
in lex.ll can handle them. Fix lCComment() to update the source
position as it eats characters in comments.
The compiler now supports an --emit-c++ option, which generates generic
vector C++ code. To actually compile this code, the user must provide
C++ code that implements a variety of types and operations (e.g. adding
two floating-point vector values together, comparing them, etc).
There are two examples of this required code in examples/intrinsics:
generic-16.h is a "generic" 16-wide implementation that does all required
with scalar math; it's useful for demonstrating the requirements of the
implementation. Then, sse4.h shows a simple implementation of a SSE4
target that maps the emitted function calls to SSE intrinsics.
When using these example implementations with the ispc test suite,
all but one or two tests pass with gcc and clang on Linux and OSX.
There are currently ~10 failures with icc on Linux, and ~50 failures with
MSVC 2010. (To be fixed in coming days.)
Performance varies: when running the examples through the sse4.h
target, some have the same performance as when compiled with --target=sse4
from ispc directly (options), while noise is 12% slower, rt is 26%
slower, and aobench is 2.2x slower. The details of this haven't yet been
carefully investigated, but will be in coming days as well.
Issue #92.
When used, these targets end up with calls to undefined functions for all
of the various special vector stuff ispc needs to compile ispc programs
(masked store, gather, min/max, sqrt, etc.).
These targets are not yet useful for anything, but are a step toward
having an option to C++ code with calls out to intrinsics.
Reorganized the directory structure a bit and put the LLVM bitcode used
to define target-specific stuff (as well as some generic built-ins stuff)
into a builtins/ directory.
Note that for building on Windows, it's now necessary to set a LLVM_VERSION
environment variable (with values like LLVM_2_9, LLVM_3_0, LLVM_3_1svn, etc.)
Specifically, stmts and exprs are no longer responsible for first recursively
optimizing their children before doing their own optimization (this turned
out to be error-prone, with children sometimes being forgotten.) They now
are just responsible for their own optimization, when appropriate.
For now this target just uses the same builtins-*.ll files as the
regular AVX1 target. Once the gather intrinsic is available from
LLVM, we'll want to have custom target files that call out to that
for gathers. (The integer min/max intrinsics should be wired up to
the __{min,max}_varying_{int,uint}*() builtins at that point as
well.)
Previously, they all went into one big pile that was never cleaned up;
this was the wrong thing to do in a world where one might have a
function declaration inside another functions, say.
Allow atomic types to be initialized with single-element expression lists:
int x = { 5 };
Issue an error if a storage class is provided with a function parameter.
Issue an error if two members of a struct have the same name.
Issue an error on trying to assign to a struct with a const member, even if
the struct itself isn't const.
Issue an error if a function is redefined.
Issue an error if a function overload is declared that differs only in return
type from a previously-declared function.
Issue an error if "inline" or "task" qualifiers are used outside of function
declarations.
Allow trailing ',' at the end of enumerator lists.
Multiple tests for all of the above.
Pointers can be either uniform or varying, and behave correspondingly.
e.g.: "uniform float * varying" is a varying pointer to uniform float
data in memory, and "float * uniform" is a uniform pointer to varying
data in memory. Like other types, pointers are varying by default.
Pointer-based expressions, & and *, sizeof, ->, pointer arithmetic,
and the array/pointer duality all bahave as in C. Array arguments
to functions are converted to pointers, also like C.
There is a built-in NULL for a null pointer value; conversion from
compile-time constant 0 values to NULL still needs to be implemented.
Other changes:
- Syntax for references has been updated to be C++ style; a useful
warning is now issued if the "reference" keyword is used.
- It is now illegal to pass a varying lvalue as a reference parameter
to a function; references are essentially uniform pointers.
This case had previously been handled via special case call by value
return code. That path has been removed, now that varying pointers
are available to handle this use case (and much more).
- Some stdlib routines have been updated to take pointers as
arguments where appropriate (e.g. prefetch and the atomics).
A number of others still need attention.
- All of the examples have been updated
- Many new tests
TODO: documentation
Added support for resolving dimensions of multi-dimensional unsized arrays
from their initializer exprerssions (previously, only the first dimension
would be resolved.)
Added checks to make sure that no unsized array dimensions remain after
doing this (except for the first dimensision of array parameters to
functions.)