Commit Graph

193 Commits

Author SHA1 Message Date
Matt Pharr
975db80ef6 Add support for pointers to the language.
Pointers can be either uniform or varying, and behave correspondingly.
e.g.: "uniform float * varying" is a varying pointer to uniform float
data in memory, and "float * uniform" is a uniform pointer to varying float
data in memory.  Like other types, pointers are varying by default.

Pointer-based expressions, & and *, sizeof, ->, pointer arithmetic,
and the array/pointer duality all behave as in C.  Array arguments
to functions are converted to pointers, also like C.
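
For example, a minimal sketch of the new syntax (variable names are
illustrative):

    uniform float * varying pv;  // varying pointer to uniform float data
    float * uniform pu;          // uniform pointer to varying float data
    float * p;                   // pointers are varying by default

    uniform float a[10];
    pv = &a[programIndex];       // & and [] behave as in C
    float v = *pv;               // dereferencing a varying pointer gathers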

There is a built-in NULL for a null pointer value; conversion from
compile-time constant 0 values to NULL still needs to be implemented.

Other changes:
- Syntax for references has been updated to be C++ style; a useful
  warning is now issued if the "reference" keyword is used.
- It is now illegal to pass a varying lvalue as a reference parameter
  to a function; references are essentially uniform pointers.
  This case had previously been handled via special-case
  call-by-value-return code.  That path has been removed, now that varying
  pointers are available to handle this use case (and much more).
- Some stdlib routines have been updated to take pointers as
  arguments where appropriate (e.g. prefetch and the atomics).
  A number of others still need attention.
- All of the examples have been updated.
- Many new tests.

TODO: documentation
2011-11-27 13:09:59 -08:00
Matt Pharr
d3e6879223 Improve error checking for unsized arrays.
Added support for resolving dimensions of multi-dimensional unsized arrays
from their initializer expressions (previously, only the first dimension
would be resolved.)

Added checks to make sure that no unsized array dimensions remain after
doing this (except for the first dimension of array parameters to
functions.)
2011-11-21 10:41:23 -08:00
Matt Pharr
7290f7b16b Generalize/improve parsing of pointer declarations.
Substantial improvements and generalizations to the parsing and
declaration handling code to properly parse declarations involving
pointers.  (No change to user-visible functionality, but this
lays groundwork for supporting a more general pointer model.)
2011-11-14 08:45:55 -08:00
Matt Pharr
ba9bb3338f Add tests for function pointers. 2011-11-03 16:14:15 -07:00
Matt Pharr
43a2d510bf Incorporate per-lane offsets for varying data in the front-end.
Previously, it was only in the GatherScatterFlattenOpt optimization pass that
we added the per-lane offsets when we were indexing into varying data.
(Specifically, the case of float foo[]; int index; foo[index], where foo
is an array of varying elements rather than uniform elements.)  Now, this
is done in the front-end as we're first emitting code.

In addition to the basic ugliness of doing this in an optimization pass, 
it was also error-prone to do it there, since we no longer have access
to all of the type information that's around in the front-end.

No functionality or performance change.
2011-11-03 13:15:07 -07:00
Matt Pharr
422b8268a9 Add assert() statement support. Issue #106. 2011-10-15 13:50:05 -07:00
Matt Pharr
9f2aa8d92a Handle ConstantExpressions when computing address+offset vectors for scatter/gather.
In particular, this fixes issue #81, where a global variable access was leading to
ConstantExpressions showing up in this code, which it wasn't previously expecting.
2011-10-14 11:20:08 -07:00
Matt Pharr
2460fa5c83 Improve gather/scatter optimization passes to handle loops better.
Specifically, now we can work through phi nodes in the IR to detect cases
where an index value is actually the same across lanes or is linear across
the lanes.  For example, this is a loop that used to require gathers but
is now turned into vector loads:

    for (int i = programIndex; i < 16; i += programCount)
        sum += a[i];

Fixes issue #107.
2011-10-13 17:01:25 -07:00
Matt Pharr
88e317f1a9 These tests now pass with LLVM ToT 2011-10-11 16:17:50 -07:00
Matt Pharr
1198520029 Improve gather->vector load optimization to detect <linear sequence>-<uniform> case.
Previously, we didn't handle subtraction ops when deciphering offsets in order to
try to change gathers to vector loads.
2011-10-11 13:24:40 -07:00
Matt Pharr
ecda4561bd Move some tests that now pass with LLVM 3.0 from failing_tests to tests/ 2011-10-10 11:51:47 -07:00
Matt Pharr
3cb0115dce Add routines to standard library to do efficient AOS/SOA conversions.
Currently, we support just 3- and 4-wide variants (i.e. xyzxyz... and xyzwxyzw...),
for int32 and float types.
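
Illustrative usage, assuming a routine name along the lines of aos_to_soa3
(the exact names and signatures are in the stdlib and may differ from this
sketch):

    uniform float pts[3 * programCount];   // xyzxyzxyz... data in memory
    float x, y, z;
    aos_to_soa3(pts, &x, &y, &z);          // hypothetical call form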
2011-10-10 10:56:06 -07:00
Matt Pharr
cb7976bbf6 Added updated task launch implementation that now tracks task groups.
Within each function that launches tasks, we can now easily track which
tasks that function launched, so that the sync at the end of the function
can just sync on the tasks launched by that function (not all tasks
launched by all functions.)

Implementing this led to a rework of the task system API that ispc generates
code to call; the example task systems in examples/tasksys.cpp have been
updated to conform to this API.  (The updated API is also documented in
the ispc user's guide.)

As part of this, "launch[n]" syntax was added to launch a number of tasks
in a single launch statement, rather than requiring a loop over 'n' to
launch n tasks.
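
For example (a sketch; compute_tile is a made-up task name):

    uniform int ntasks = 16;
    launch[ntasks] compute_tile();   // launch 16 instances of the task
    sync;                            // waits only on tasks launched in this function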

This commit thus fixes issue #84 (enhancement to launch multiple tasks from
a single launch statement) as well as issue #105 (recursive task launches
were broken).
2011-09-30 11:20:53 -07:00
Matt Pharr
aad269fdf4 Added support for 'uniform' global atomics.
Issue #93.
2011-09-28 16:06:07 -07:00
Matt Pharr
6734021520 Issue warning when compile-time constant out-of-bounds array index is used.
Issue #98.
Also fixes two examples that had bugs of this type that this warning
  uncovered!
2011-09-13 14:42:20 -07:00
Matt Pharr
dd153d3c5c Handle more instruction types when flattening offset vectors.
Generalize the lScalarizeVector() utility routine (used in determining
when we can change gathers/scatters into vector loads/stores, respectively)
to handle vector shuffles and vector loads.  This fixes issue #79, which
provided a case where a gather was being performed even though a vector
load was possible.
2011-09-13 09:43:56 -07:00
Matt Pharr
4f451bd041 More AVX fixes
Fix RNG state initialization for 16-wide targets
Fix a number of bugs in reduce_add builtin implementations for AVX.
Fix some tests that had incorrect expected results for the 16-wide
  case.
2011-09-06 15:53:11 -07:00
Matt Pharr
9cd92facbd Fix test: was incorrectly failing for 8-wide targets 2011-09-01 05:03:49 -07:00
Matt Pharr
e144724979 Improve performance of global atomics, taking advantage of associativity.
For associative atomic ops (add, and, or, xor), we can take advantage of
their associativity to do just a single hardware atomic instruction, 
rather than one for each of the running program instances (as the previous
implementation did.)

The basic approach is to locally compute a reduction across the active
program instances with the given op and to then issue a single HW atomic
with that reduced value as the operand.  We then take the old value that
was stored in the location that is returned from the HW atomic op and
use that to compute the values to return to each of the program instances
(conceptually representing the cumulative effect of each of the preceding
program instances having performed their atomic operation.)
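
In rough pseudocode for an atomic add (hw_atomic_add is just a stand-in for
the single hardware atomic; masking details are glossed over):

    // 'value' is the per-lane operand, 'counter' the shared memory location
    uniform int delta = reduce_add(value);             // combine operands of active lanes
    uniform int old = hw_atomic_add(&counter, delta);  // stand-in for the single HW atomic
    int result = old + exclusive_scan_add(value);      // value each lane would have seen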

Issue #56.
2011-08-31 05:35:01 -07:00
Matt Pharr
d0db46aac5 Use logical shift right op for shifts of unsigned ints. Fixes issue #88. 2011-08-29 10:32:26 -07:00
Matt Pharr
84e586e767 Commit correct atomics tests 2011-08-26 10:43:30 -07:00
Matt Pharr
606cbab0d4 Performance improvements for global min/max atomics. Issue #57.
Compute a "local" min/max across the active program instances and 
  then do a single atomic memory op.
Added a few tests to exercise global min/max atomics (which were
  previously untested!)
2011-08-26 10:35:24 -07:00
Matt Pharr
8c921544a0 fix broken test 2011-08-18 20:40:50 +01:00
Matt Pharr
f868a63064 Add support for scan operations across program instances (add, and, or). 2011-08-13 20:11:41 +01:00
Matt Pharr
8c534d4d74 Add reduce_equal() function to standard library. 2011-08-10 15:55:55 -07:00
Matt Pharr
a5a133ccce Do more iterations of RNG test to let result converge to bounds. 2011-08-03 13:44:49 -07:00
Matt Pharr
467f1e71d7 Add fast versions of the float<-->half conversion routines in the stdlib.
These get slightly wrong results for zero and the denorms and also
don't handle the Inf/NaN stuff correctly, but are much more efficient
than the full versions of these routines.
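
Hypothetical usage (the names below are assumptions; check the stdlib for
the actual spelling and argument types):

    uniform unsigned int16 h = 0x3c00;        // bit pattern for half-float 1.0
    uniform float f = half_to_float_fast(h);  // fast path; inexact for denorms, Inf/NaN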
2011-08-03 15:58:42 +01:00
Matt Pharr
a2996ed5d9 More efficient implementation of frandom() in stdlib 2011-08-03 14:28:06 +01:00
Matt Pharr
158bd6ef9e Fix bug with initializer expression lists for global/static array-typed variables. 2011-07-28 11:38:56 +01:00
Pete Couperus
59036cdf5b Add support for multi-element vector swizzles. Issue #17.
This commit adds support for swizzles like "foo.zy" (if "foo" is,
for example, a float<3> type) as rvalues.  (Still need support for
swizzles as lvalues.)
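
For example (using the types from the description above):

    float<3> foo = { 1, 2, 3 };
    float<2> zy = foo.zy;        // gives { 3, 2 }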
2011-07-22 13:10:14 +01:00
Matt Pharr
8ef3df57c5 Add support for in-memory half float data. Fixes issue #10 2011-07-21 15:55:45 +01:00
Matt Pharr
bba7211654 Add support for int8/int16 types. Addresses issues #9 and #42. 2011-07-21 06:57:40 +01:00
Matt Pharr
f0f876c3ec Add support for enums. 2011-07-17 16:43:05 +02:00
Matt Pharr
a535aa586b Fix issue #2: use zero extend to convert bool->int, not sign extend.
This way, we match C/C++ in that casting a bool to an int gives either the value
zero or the value one.  There is a new stdlib function int sign_extend(bool)
that does sign extension for cases where that's desired.
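
For example (variable names are illustrative):

    bool b = true;
    int i = (int)b;          // i == 1, matching C/C++
    int m = sign_extend(b);  // m == -1 (all bits set)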
2011-07-12 13:30:05 +01:00
Matt Pharr
5a53a43ed0 Finish support for 64-bit types in stdlib. Fixes issue #14.
Add much more support for doubles and int64 types in the standard library, basically supporting everything for them that is supported for floats and int32s.  (The notable exceptions being the approximate rcp() and rsqrt() functions, which don't really have sensible analogs for doubles (or at least not built-in instructions).)
2011-07-07 13:25:55 +01:00
Pete Couperus
126e065601 Merge from petecoup/shortvec-in-struct branch.
Fixes issue #49: using short vector types in struct declarations
would give a bogus parse error.
2011-07-06 09:07:51 +01:00
Matt Pharr
5bcc611409 Implement global atomics and a memory barrier in the standard library.
This checkin provides the standard set of atomic operations and a memory barrier in the ispc standard library.  Both signed and unsigned 32- and 64-bit integer types are supported.
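
Illustrative usage (a sketch only; see the stdlib for the exact
parameter-passing conventions):

    uniform int counter = 0;
    int old = atomic_add_global(counter, 1);  // returns the previous value
    memory_barrier();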
2011-07-04 17:20:42 +01:00
Matt Pharr
c14c3ceba6 Provide both signed and unsigned int variants of bitcode-based builtins.
When creating function Symbols for functions that were defined in LLVM bitcode for the standard library, if any of the function parameters are integer types, create two ispc-side Symbols: one where the integer types are all signed and the other where they are all unsigned.  This allows us to provide, for example, both store_to_int16(reference int a[], uniform int offset, int val) as well as store_to_int16(reference unsigned int a[], uniform int offset, unsigned int val).

Added some additional tests to exercise the new variants of these.

Also fixed some cases where the __{load,store}_int{8,16} builtins would read from/write to memory even if the mask was all off (which could cause crashes in some cases.)
2011-07-04 12:10:26 +01:00
Matt Pharr
fe7717ab67 Added shuffle() variant to the standard library that takes two
varying values and a permutation index that spans the concatenation
of the two of them (along the lines of SHUFPS...)
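
A sketch of the new variant (values are illustrative):

    float a = programIndex;                  // 0, 1, 2, ...
    float b = programIndex + programCount;   // programCount, programCount+1, ...
    // an index in [0, 2*programCount) selects from the concatenation of a and b
    float r = shuffle(a, b, programIndex + 1);  // shift by one element across a and b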
2011-07-02 08:43:35 +01:00
Matt Pharr
d2d5858be1 It is no longer legal to initialize arrays and structs with single
scalar values (that ispc used to smear across the array/struct
elements).  Now, initializers in variable declarations must be
{ }-delimited lists, with one element per struct member or array
element, respectively.
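
For example:

    uniform float a[3] = { 0, 1, 2 };  // ok: one initializer per element
    uniform float b[3] = 0;            // now an error (previously smeared 0 across b)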

There were a few problems with the previous implementation of the
functionality to initialize from scalars.  First, the expression
would be evaluated once per value initialized, so if it had side-effects,
the wrong thing would happen.  Next, for large multidimensional arrays,
the generated code would be a long series of move instructions, rather
than loops (and this in turn made LLVM take a long time.)

While both of these problems are fixable, it's a non-trivial
amount of re-plumbing for a questionable feature anyway.

Fixes issue #50.
2011-07-01 13:45:58 +01:00
Matt Pharr
2709c354d7 Add support for broadcast(), rotate(), and shuffle() stdlib routines 2011-06-27 17:31:44 -07:00
Matt Pharr
aaafdf80f2 Move two tests that are currently failing into failing_tests/ 2011-06-22 05:28:23 -07:00
Matt Pharr
18af5226ba Initial commit. 2011-06-21 12:48:50 -07:00