Commit Graph

54 Commits

Author SHA1 Message Date
Matt Pharr
0432f97555 Fix build with LLVM 3.1 TOT 2012-01-31 14:10:07 -08:00
Matt Pharr
664dc3bdda Add support for "new" and "delete" to the language.
Issue #139.
2012-01-27 14:47:06 -08:00
Matt Pharr
748b292e77 Improve code for uniform switches with a 'break' under varying control flow.
Previously, when we had a switch statement with a uniform switch condition
but a 'break' statement that was under varying control flow inside the
switch, we'd promote the switch condition to be varying so that the
break would work correctly.

Now, we leave the condition as uniform and are thus able to use the
more-efficient LLVM switch instruction in this case.

Issue #156.
2012-01-19 08:41:19 -07:00
Matt Pharr
7045b76f84 Improvements to code generation for "foreach"
Specialize the code for the innermost loop to not do any masking
computations for the innermost dimension for the iterations where
we are certainly working on a full vector's worth of data.

This fix improves performance/code quality of "foreach" such that
it's essentially the same as the equivalent "for" loop.

Fixes issue #151.
2012-01-17 11:34:00 -08:00
Matt Pharr
b67446d998 Add support for "switch" statements.
Switches with both uniform and varying "switch" expressions are
supported.  Switch statements with varying expressions and very
large numbers of labels may not perform well; some issues to be
filed shortly will track opportunities for improving these.
2012-01-11 09:16:31 -08:00
Matt Pharr
78c6d3c02f Add initial support for 'goto' statements.
ispc now supports goto, but only under uniform control flow--i.e.
it must be possible for the compiler to statically determine that
all program instances will follow the goto.  An error is issued at
compile time if a goto is used when this is not the case.
2012-01-05 12:22:36 -08:00
Matt Pharr
4151778f5e Modify SizeOf() and StructOffset() to not compute value based on target for generic targets.
Specifically, we want to be able to late-bind on whether the mask is i32s or i1s, so if there's
any chance of ambiguity, we emit code that does the "GEP from a NULL base pointer" trick to
compute the value later in compilation.
2012-01-04 12:59:03 -08:00
Matt Pharr
848a432640 Fix various small things that were broken with single-bit-per-lane masks.
Also small cleanups to declarations, "no captures" added, etc.
2012-01-04 12:59:03 -08:00
Matt Pharr
1d9201fe3d Add "generic" 4, 8, and 16-wide targets.
When used, these targets end up with calls to undefined functions for all
of the various special vector stuff ispc needs to compile ispc programs
(masked store, gather, min/max, sqrt, etc.).

These targets are not yet useful for anything, but are a step toward
having an option to C++ code with calls out to intrinsics.

Reorganized the directory structure a bit and put the LLVM bitcode used
to define target-specific stuff (as well as some generic built-ins stuff)
into a builtins/ directory.

Note that for building on Windows, it's now necessary to set a LLVM_VERSION
environment variable (with values like LLVM_2_9, LLVM_3_0, LLVM_3_1svn, etc.)
2011-12-19 13:46:50 -08:00
Matt Pharr
8d1b77b235 Have assertion macro and FATAL() text ask user to file a bug, provide URL to do so.
Switch to Assert() from assert() to make it clear it's not the C stdlib one we're
using any more.
2011-12-15 11:11:16 -08:00
Matt Pharr
10ebe88abf Directly emit code for the mask checks at the start of complex functions.
Previously, we used an IfStmt to wrap complex functions with the equivalent
of a "cif" to check to see if the mask was all on, all off, or mixed at the
start of executing non-trivial functions.  This had the unintended side
effect of suggesting to other parts of the compiler that the entire function
was under varying control flow (which in turn led to some small code
quality issues.)

Now, we emit the equivalent code directly.
2011-12-15 06:00:41 -08:00
Matt Pharr
9920b30318 Fix bug that led to incorrect code with return statements.
The conceptual error was the assumption that not being under varying
control flow implied that the mask was all on; this is not the case
if some of the instances have executed a return earlier in the function's
execution.  The error in practice would be that the mask would be
assumed to be all-on for things like memory writes, so there would
be unintended side-effects for the instances that had returned.
2011-12-15 06:00:31 -08:00
Matt Pharr
6f26ae9801 Fix bugs with offsetting for varying values with gathers/scatters.
Fixes issue #134.
2011-12-12 14:13:46 -08:00
Matt Pharr
5b48354d9a Fix crashes from malformed programs. 2011-12-12 13:47:46 -08:00
Matt Pharr
46bfef3fce Add option to turn off codegen improvements when mask 'all on' is statically known. 2011-12-11 16:16:36 -08:00
Matt Pharr
f6605ee465 Small cleanup: allocate storage for the full mask in the FunctionEmitContext constructor 2011-12-10 13:33:28 -08:00
Matt Pharr
198aa9620e Fix bug with mask used for gather/scatter code generation.
We should always use the full mask for this, never the internal mask.
Added tests for this.
2011-12-06 15:51:56 -08:00
Matt Pharr
f95504fb5e Symbol table now properly handles scopes for function declarations.
Previously, they all went into one big pile that was never cleaned up;
this was the wrong thing to do in a world where one might have a 
function declaration inside another functions, say.
2011-12-04 17:37:13 -08:00
Matt Pharr
a1c0b4f95a Allow 'continue' statements in 'foreach' loops. 2011-12-03 09:31:02 -08:00
Matt Pharr
8bc7367109 Add foreach and foreach_tiled looping constructs
These make it easier to iterate over arbitrary amounts of data
elements; specifically, they automatically handle the "ragged
extra bits" that come up when the number of elements to be
processed isn't evenly divided by programCount.

TODO: documentation
2011-11-30 13:17:31 -08:00
Matt Pharr
7a2561c429 Add count_{leading,trailing}_zeros() functions to stdlib.
(Documentation is still yet to be written.)
2011-11-30 10:12:16 -08:00
Matt Pharr
b1ae307163 Fix bug in FunctionEmitContext::SyncInst()
The launch group handle is now reset to NULL after sync is called;
this ensures that if tasks are launched in the same function after
a sync, that the ISPCAlloc() call for the next launch will be
passed a NULL handle (as it should be).
2011-11-29 13:26:48 -08:00
Matt Pharr
e52104ff55 Pointer fixes/improvements.
Allow <, <=, >, >= comparisons of pointers
Allow explicit type-casting of pointers to and from integers
Fix bug in handling expressions of the form "int + ptr" ("ptr + int"
  was fine).
Fix a bug in TypeCastExpr where varying -> uniform typecasts
  would be allowed (leading to a crash later)
2011-11-29 13:22:36 -08:00
Matt Pharr
975db80ef6 Add support for pointers to the language.
Pointers can be either uniform or varying, and behave correspondingly.
e.g.: "uniform float * varying" is a varying pointer to uniform float
data in memory, and "float * uniform" is a uniform pointer to varying
data in memory.  Like other types, pointers are varying by default.

Pointer-based expressions, & and *, sizeof, ->, pointer arithmetic,
and the array/pointer duality all bahave as in C.  Array arguments
to functions are converted to pointers, also like C.

There is a built-in NULL for a null pointer value; conversion from
compile-time constant 0 values to NULL still needs to be implemented.

Other changes:
- Syntax for references has been updated to be C++ style; a useful
  warning is now issued if the "reference" keyword is used.
- It is now illegal to pass a varying lvalue as a reference parameter
  to a function; references are essentially uniform pointers.
  This case had previously been handled via special case call by value
  return code.  That path has been removed, now that varying pointers
  are available to handle this use case (and much more).
- Some stdlib routines have been updated to take pointers as
  arguments where appropriate (e.g. prefetch and the atomics).
  A number of others still need attention.
- All of the examples have been updated
- Many new tests

TODO: documentation
2011-11-27 13:09:59 -08:00
Matt Pharr
068ea3e4c4 Better SourcePos reporting for gathers/scatters 2011-11-21 10:26:53 -08:00
Matt Pharr
f8eb100c60 Use llvm TargetData to find object sizes, offsets.
Previously, to compute the size of objects and the offsets of struct
elements within structs, we were using the trick of using getelementpointer 
with a NULL base pointer and then casting the result to an int32/64.
However, since we actually know the target we're compiling for at
compile time, we can use corresponding methods from TargetData to
get these values directly.

This mostly cleans up code, but may make some of the gather/scatter
lowering to loads/stores optimizations work better in the presence
of structures.
2011-11-06 19:31:19 -08:00
Matt Pharr
b0d476fcdc Stop zero-initializing memory used to store return values.
This seems to have a noticable (small) performance benefit on a
few of the example workloads.
2011-11-05 09:49:44 -07:00
Matt Pharr
afcd42028f Add support for function pointers.
Both uniform and varying function pointers are supported; when a function
is called through a varying function pointer, each unique function pointer
value across the running program instances is called once for the set of
active program instances that want to call it.
2011-11-03 16:14:14 -07:00
Matt Pharr
d528533fba Add FunctionEmitContext::SmearScalar() method (and use it). 2011-11-03 16:14:14 -07:00
Matt Pharr
43a2d510bf Incorporate per-lane offsets for varying data in the front-end.
Previously, it was only in the GatherScatterFlattenOpt optimization pass that
we added the per-lane offsets when we were indexing into varying data.
(Specifically, the case of float foo[]; int index; foo[index], where foo
is an array of varying elements rather than uniform elements.)  Now, this
is done in the front-end as we're first emitting code.

In addition to the basic ugliness of doing this in an optimization pass, 
it was also error-prone to do it there, since we no longer have access
to all of the type information that's around in the front-end.

No functionality or performance change.
2011-11-03 13:15:07 -07:00
Matt Pharr
e009c0a61d Be able to determine if two types can be converted without requiring an Expr *.
The Expr::TypeConv() method has been replaced with both a
CanConvertTypes() routine that indicates whether one type
can be converted to another and a TypeConvertExpr()
routine that provides the same functionality as
Expr::TypeConv() used to.
2011-10-30 14:12:12 -07:00
Matt Pharr
074cbc2716 Fix #ifdefs to catch LLVM 3.1svn now as well 2011-10-19 14:01:19 -07:00
Matt Pharr
290032f4f5 Be more careful about using the right mask when emitting gathers.
Specifically, we had been using the full mask for all gathers, rather than
using the internal mask when we were loading from locally-declared arrays.
Thus, given code like:

  uniform float x[programCount] = { .. . };
  float xx = x[programIndex];

Previously we weren't generating a plain vector load to initialize xx, when
this code was in a function where it wasn't known that the mask was all on,
even though it should have.  Now it does.
2011-10-17 20:25:52 -04:00
Matt Pharr
a89e26d725 Improvements to mask management code; removes a number of unnecessary blends.
We now maintain a the distinction between the value of the mask passed into a
function and the "internal" mask within the function that only accounts for
varying control flow within the function.

The full mask (the AND of the function mask and the internal mask) must be used
for assignments to static and global variables, and reference function parameters.
Further, it is the appropriate mask to use for making decisions about varying
control flow.  However, we can use the internal mask for assignments to variables
declared in the current function (including the return value and non-reference
parameters to the function).  Doing so allows us to catch a few more cases where
the internal mask is all on, even if the mask coming into the function wasn't all
on, and thence use moves rather than blends for those assignments.  (Which in
turn can allow additional optimizations to happen.)

Fixes issue #23.
2011-10-10 11:47:19 -07:00
Matt Pharr
f9c67ff806 Explicit representation of ASTs for all the functions in a compile unit.
Added AST and Function classes.
Now, we parse the whole file and build up the AST for all of the
  functions in the Module before we emit IR for the functions (vs. before,
  when we generated IR along the way as we parsed the source file.)
2011-10-06 15:35:27 -07:00
Matt Pharr
a6fc657b40 Remove 'externGlobals' member from Module; instead find them when needed via new SymbolTable::GetMatchingVariables method. 2011-10-04 06:36:31 -07:00
Matt Pharr
cb7976bbf6 Added updated task launch implementation that now tracks task groups.
Within each function that launches tasks, we now can easily track which
tasks that function launched, so that the sync at the end of the function
can just sync on the tasks launched by that function (not all tasks
launched by all functions.)

Implementing this led to a rework of the task system API that ispc generates
code to call; the example task systems in examples/tasksys.cpp have been
updated to conform to this API.  (The updated API is also documented in
the ispc user's guide.)

As part of this, "launch[n]" syntax was added to launch a number of tasks
in a single launch statement, rather than requiring a loop over 'n' to
launch n tasks.

This commit thus fixes issue #84 (enhancement to launch multiple tasks from
a single launch statement) as well as issue #105 (recursive task launches
were broken).
2011-09-30 11:20:53 -07:00
Matt Pharr
2405dae8e6 Use malloc() to get space for task arguments when compiling to AVX.
This is to work around the LLVM bug/limitation discused in LLVM bug
10841 (http://llvm.org/bugs/show_bug.cgi?id=10841).
2011-09-17 13:38:51 -07:00
Matt Pharr
3607f3e045 Remove support for building with LLVM 2.8. Fixes issue #66.
Both 2.9 and top-of-tree generate substantially better code than
LLVM 2.8 did, so it's not worth fixing the 2.8 build.
2011-09-17 13:18:59 -07:00
Matt Pharr
1dedd88132 Improve implementaton of 'are both masks equal' check for AVX.
Previously, we did a vector equal compare and then a movmsk, the
result of which we checked to see if it was on for all lanes.
Because masks are vectors of i32s, under AVX, the vector equal
compare required two 4-wide SSE compares and some shuffling.
Now, we do a movmsk of both masks first and then a scalar
equality comparison of those two values, which seems to generate
overall better code.
2011-09-15 06:25:02 -07:00
Matt Pharr
922dbdec06 Fixes to build with LLVM top-of-tree 2011-07-26 10:57:49 +01:00
Matt Pharr
bba7211654 Add support for int8/int16 types. Addresses issues #9 and #42. 2011-07-21 06:57:40 +01:00
Matt Pharr
654cfb4b4b Many fixes for recent LLVM dev tree API changes 2011-07-18 15:54:39 +01:00
Matt Pharr
f0f876c3ec Add support for enums. 2011-07-17 16:43:05 +02:00
Matt Pharr
6e4c165c7e Use malloc to allocate storage for task parameters on Windows.
Fixes bug #55.  A number of tests were crashing on Windows due to the task
launch code using alloca to allocate space for the tasks' parameters.  On
Windows, the stack isn't generally big enough for this to be a good idea.
Also added an alignment parmaeter to ISPCMalloc() to pass the alignment
requirement along.
2011-07-06 05:53:25 -07:00
Matt Pharr
c14c3ceba6 Provide both signed and unsigned int variants of bitcode-based builtins.
When creating function Symbols for functions that were defined in LLVM bitcode for the standard library, if any of the function parameters are integer types, create two ispc-side Symbols: one where the integer types are all signed and the other where they are all unsigned.  This allows us to provide, for example, both store_to_int16(reference int a[], uniform int offset, int val) as well as store_to_int16(reference unsigned int a[], uniform int offset, unsigned int val). functions.

Added some additional tests to exercise the new variants of these.

Also fixed some cases where the __{load,store}_int{8,16} builtins would read from/write to memory even if the mask was all off (which could cause crashes in some cases.)
2011-07-04 12:10:26 +01:00
Matt Pharr
eb22fa6173 Generalize FunctionEmitContext::PtrToIntInst and IntToPtrInst to
do the right thing if given a varying lvalue (i.e. an array of
pointers).  Fixes issue #34.
2011-06-29 12:38:12 +01:00
Matt Pharr
6b153566f3 Simplify a bunch of code by using CollectionType to collect struct
codepaths in with array/vector codepaths. (Issue #37).
2011-06-29 07:59:43 +01:00
Matt Pharr
214fb3197a Initial plumbing to add CollectionType base-class as common ancestor
to StructTypes, ArrayTypes, and VectorTypes.  Issue #37.
2011-06-29 07:42:09 +01:00
Matt Pharr
ce7978ae74 Align stack-allocated arrays of uniform types to the target vector alignment (they will often be accessed in programCount-sized chunks and this should make that a bit more efficient in the common case). Fixes issue #15 2011-06-28 20:42:18 -07:00