Commit Graph

43 Commits

Author SHA1 Message Date
Matt Pharr
b48775a549 Handle global arrays better in varying pointer analysis.
Specifically, indexing into global arrays sometimes shows up as a large
llvm::ConstantVector, so we also need to handle traversing those when we
do the corresponding checks in GatherScatterFlattenOpt, so that we still
detect cases that can be converted into the base pointer + offsets form
used in later analysis.
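
For illustration (not taken from the commit itself; the names here are
made up), this is the kind of global-array access whose per-lane address
computation can show up as a constant vector in the IR:

    uniform float gvals[64];   // a global array

    float first_elements() {
        // The per-lane addresses of gvals[programIndex] are all
        // compile-time constants, so the IR may carry them as an
        // llvm::ConstantVector; the pass still needs to recover the
        // base pointer (gvals) plus per-lane offsets from it.
        return gvals[programIndex];
    }
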
2011-11-30 12:29:49 -08:00
Matt Pharr
e52104ff55 Pointer fixes/improvements.
Allow <, <=, >, >= comparisons of pointers
Allow explicit type-casting of pointers to and from integers
Fix bug in handling expressions of the form "int + ptr" ("ptr + int"
  was fine).
Fix a bug in TypeCastExpr where varying -> uniform typecasts
  would be allowed (leading to a crash later)
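
A small sketch of the cases covered (illustrative only; the names are made up):

    uniform float buf[16];

    void sketch() {
        uniform float * uniform p = &buf[0];
        uniform float * uniform q = &buf[8];
        uniform bool lt = (p < q);                 // ordered pointer comparison
        uniform int64 ai = (uniform int64)p;       // explicit pointer -> integer cast
        uniform float * uniform r = (uniform float * uniform)ai;  // ...and back
        uniform float * uniform s = 4 + p;         // "int + ptr", like "ptr + int"
    }
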
2011-11-29 13:22:36 -08:00
Matt Pharr
2a6e3e5fea Fix bug in ptr+offset decomposition in GatherScatterFlattenOpt
Given IR that encoded a computation like "vec(4) + ptr2int(some pointer)",
we'd report that "int2ptr(4)" was the base pointer and the ptr2int
value was the offset.  We'd then end up with GEP instructions whose first
operand was int2ptr(4) and whose offset was the original pointer value.
Since LLVM's memory read/write analysis assumes that nothing after the
first operand of a GEP is actually a pointer, this sometimes led to
incorrect code and a failure on the tests/gs-double-improve-multidim.ispc
test.
2011-11-28 15:00:41 -08:00
Matt Pharr
975db80ef6 Add support for pointers to the language.
Pointers can be either uniform or varying, and behave correspondingly.
e.g.: "uniform float * varying" is a varying pointer to uniform float
data in memory, and "float * uniform" is a uniform pointer to varying
data in memory.  Like other types, pointers are varying by default.

Pointer-based expressions (& and *, sizeof, ->, pointer arithmetic,
and the array/pointer duality) all behave as in C.  Array arguments
to functions are converted to pointers, also as in C.
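
A minimal sketch of the new pointer types (illustrative only; the names
are made up):

    uniform float data[16];

    void sketch() {
        uniform float * uniform pu = &data[0];             // uniform pointer: one address
        uniform float * varying pv = &data[programIndex];  // varying pointer: one address per lane
        float * pd = NULL;                  // pointers are varying by default
        uniform int64 sz = sizeof(uniform float);
        pv = pv + 1;                        // pointer arithmetic, as in C
    }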

There is a built-in NULL for a null pointer value; conversion from
compile-time constant 0 values to NULL still needs to be implemented.

Other changes:
- Syntax for references has been updated to be C++ style; a useful
  warning is now issued if the "reference" keyword is used.
- It is now illegal to pass a varying lvalue as a reference parameter
  to a function; references are essentially uniform pointers.
  This case had previously been handled via special-case call-by-value-return
  code.  That path has been removed now that varying pointers are available
  to handle this use case (and much more).
- Some stdlib routines have been updated to take pointers as
  arguments where appropriate (e.g. prefetch and the atomics).
  A number of others still need attention.
- All of the examples have been updated
- Many new tests

TODO: documentation
2011-11-27 13:09:59 -08:00
Matt Pharr
068ea3e4c4 Better SourcePos reporting for gathers/scatters 2011-11-21 10:26:53 -08:00
Matt Pharr
f8eb100c60 Use llvm TargetData to find object sizes, offsets.
Previously, to compute the sizes of objects and the offsets of elements
within structs, we used the trick of applying getelementptr to a NULL
base pointer and then casting the result to an int32/64.  However, since
we know the target we're compiling for at compile time, we can use the
corresponding TargetData methods to get these values directly.

This mostly cleans up code, but may make some of the gather/scatter
lowering to loads/stores optimizations work better in the presence
of structures.
2011-11-06 19:31:19 -08:00
Matt Pharr
cabe358c0a Workaround change to linker behavior in LLVM 3.1
Now, the Linker::LinkModules() call doesn't link in any functions
marked as 'internal', which is problematic, since we'd like to have
just about all of the builtins marked as internal so that they can be
eliminated after they've been inlined wherever they are actually used.

This change removes all of the internal qualifiers in the builtins
and adds a lSetInternalFunctions() routine to builtins.cpp that
sets this property on the functions that need it after they've
been linked in by LinkModules().
2011-11-05 16:57:26 -07:00
Matt Pharr
43a2d510bf Incorporate per-lane offsets for varying data in the front-end.
Previously, it was only in the GatherScatterFlattenOpt optimization pass that
we added the per-lane offsets when we were indexing into varying data.
(Specifically, the case of float foo[]; int index; foo[index], where foo
is an array of varying elements rather than uniform elements.)  Now, this
is done in the front-end as we're first emitting code.
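
Spelled out as a small example of the case in question (a sketch with
made-up names):

    float vals[8];   // array of varying elements (varying is the default)

    float sketch(int index) {   // index assumed to be in [0, 8)
        // Each program instance loads its own lane of the element it
        // indexes, so the effective address needs a per-lane offset on
        // top of the element offset; that offset is now added by the
        // front-end rather than by GatherScatterFlattenOpt.
        return vals[index];
    }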

In addition to the basic ugliness of doing this in an optimization pass,
it was also error-prone to do it there, since by that point we no longer
have access to all of the type information that's available in the front-end.

No functionality or performance change.
2011-11-03 13:15:07 -07:00
Matt Pharr
6084d6aeaf Added disable-handle-pseudo-memory-ops option. 2011-10-31 08:29:13 -07:00
Matt Pharr
d224252b5d Fix bug where multiplying varying array offset by zero would cause crash in optimization passes. 2011-10-31 08:28:51 -07:00
Matt Pharr
8b719e4c4e Fix warnings reported by doxygen 2011-10-20 11:49:54 -07:00
Matt Pharr
074cbc2716 Fix #ifdefs to catch LLVM 3.1svn now as well 2011-10-19 14:01:19 -07:00
Matt Pharr
19087e4761 When casting pointers to ints, choose int32/64 based on target pointer size.
Issue #97.
2011-10-17 06:57:04 -04:00
Matt Pharr
c21e704a5c Fix LLVM 2.9 build. Issue #114 2011-10-15 06:48:20 -07:00
Matt Pharr
9f2aa8d92a Handle ConstantExpressions when computing address+offset vectors for scatter/gather.
In particular, this fixes issue #81, where a global variable access was leading to
ConstantExpressions showing up in this code, which it wasn't previously expecting.
2011-10-14 11:20:08 -07:00
Matt Pharr
2460fa5c83 Improve gather/scatter optimization passes to handle loops better.
Specifically, now we can work through phi nodes in the IR to detect cases
where an index value is actually the same across lanes or is linear across
the lanes.  For example, this is a loop that used to require gathers but
is now turned into vector loads:

    for (int i = programIndex; i < 16; i += programCount)
        sum += a[i];

Fixes issue #107.
2011-10-13 17:01:25 -07:00
Matt Pharr
1198520029 Improve gather->vector load optimization to detect <linear sequence>-<uniform> case.
Previously, we didn't handle subtraction ops when deciphering offsets in order
to try to change gathers into vector loads.
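
For example (an illustrative case, not taken from the issue), an index of
the form <linear sequence> - <uniform> should now turn into a vector load:

    uniform float a[64];

    float sketch(uniform int offset) {
        // (programIndex + 32) - offset is still linear across the gang,
        // so this access can be a vector load rather than a gather.
        return a[(programIndex + 32) - offset];
    }
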
2011-10-11 13:24:40 -07:00
Matt Pharr
ec5e627e56 Mark internal stdlib functions as "internal" linkage, not "private".
This fixes print() statements on OSX.
(http://llvm.org/bugs/show_bug.cgi?id=11080)
2011-10-06 13:32:20 -07:00
Matt Pharr
ff2a43ac19 Run the CFG simplification pass even when optimization is disabled.
This fixes an issue with undefined SVML symbols in code that called
transcendental functions in the standard library, even when the SVML
math library hadn't been selected.
2011-10-06 09:20:50 -07:00
Matt Pharr
2df9da2524 Be careful to not inadvertently match NULL functions in optimization passes. 2011-10-01 08:34:11 -07:00
Matt Pharr
6d39d5fc3e Small cleanups.
Add __num_cores() to the list of symbols to remove from the module at the end.
Fix declarations of mask type for 64-bit atomics to silence warnings.
2011-09-28 16:26:35 -07:00
Matt Pharr
2405dae8e6 Use malloc() to get space for task arguments when compiling to AVX.
This is to work around the LLVM bug/limitation discussed in LLVM bug
10841 (http://llvm.org/bugs/show_bug.cgi?id=10841).
2011-09-17 13:38:51 -07:00
Matt Pharr
3607f3e045 Remove support for building with LLVM 2.8. Fixes issue #66.
Both 2.9 and top-of-tree generate substantially better code than
LLVM 2.8 did, so it's not worth fixing the 2.8 build.
2011-09-17 13:18:59 -07:00
Matt Pharr
e2a88d491f Mark the internal __fast_masked_vload function as static 2011-09-13 15:43:48 -07:00
Matt Pharr
30f9dcd4f5 Unroll loops by default, add --opt=disable-loop-unroll to disable.
Issue #78.
2011-09-13 15:37:18 -07:00
Matt Pharr
dd153d3c5c Handle more instruction types when flattening offset vectors.
Generalize the lScalarizeVector() utility routine (used in determining
when we can change gathers/scatters into vector loads/stores, respectively)
to handle vector shuffles and vector loads.  This fixes issue #79, which
provided a case where a gather was being performed even though a vector
load was possible.
2011-09-13 09:43:56 -07:00
Matt Pharr
6375ed9224 AVX: Fix bug with misdeclaration of blend intrinsic.
This was preventing the "convert an all-on blend to one of the
  operand values" optimization from kicking in for AVX.
2011-09-12 06:42:38 -07:00
Matt Pharr
785d8a29d3 Run mem2reg pass even when doing -O0 compiles 2011-09-09 09:24:43 -07:00
Matt Pharr
c86128e8ee AVX: go back to using blend (vs. masked store) when possible.
All of the masked store calls were preventing values from being kept in
registers, which in turn led to a lot of unnecessary stack traffic.
This approach seems to give better code in the end.
2011-09-07 11:26:49 -07:00
Matt Pharr
eb7913f1dd AVX: fix alignment when changing masked load to regular load.
Also added some debugging/tracing code (commented out).
Commented out an iffy assert that was firing for the AVX path.
2011-09-01 15:45:49 -07:00
Matt Pharr
f65a20c700 AVX bugfix: when replacing 'all on' masked store with a store,
the rvalue is operand 2, not operand 1 (which is the mask!)
2011-08-31 18:06:29 -07:00
Matt Pharr
ad9e66650d AVX bugfix with alignment for store instructions.
When replacing an 'all on' masked store with a regular store, set the
alignment to the vector element alignment, not the alignment for a whole
vector (i.e. 4- or 8-byte alignment, not 32 or 64).
2011-08-29 16:58:48 -07:00
Matt Pharr
4ab982bc16 Various AVX fixes (found by inspection).
Emit calls to masked_store, not masked_store_blend, when handling
  masked stores emitted by the frontend.
Fix bug in binary8to16 macro in builtins.m4
Fix bug in 16-wide version of __reduce_add_float
Remove blend function implementations for masked_store_blend for
  AVX; just forward those on to the corresponding real masked store
  functions.
2011-08-26 12:58:02 -07:00
Matt Pharr
fe54f1ad8e Fixes to build with latest LLVM ToT 2011-08-18 08:34:49 +01:00
Matt Pharr
922dbdec06 Fixes to build with LLVM top-of-tree 2011-07-26 10:57:49 +01:00
Matt Pharr
16be1d313e AVX updates / improvements.
Add optimization patterns to detect and simplify masked loads and stores
  with the mask all on / all off.
Enable AVX for LLVM 3.0 builds (still generally hits bugs / unimplemented
  stuff on the LLVM side, but it's getting there).
2011-07-25 07:41:37 +01:00
Matt Pharr
96d40327d0 Fix issue #72: 64-bit gathers/scatters led to undefined symbols 2011-07-21 14:44:55 +01:00
Matt Pharr
bba7211654 Add support for int8/int16 types. Addresses issues #9 and #42. 2011-07-21 06:57:40 +01:00
Matt Pharr
654cfb4b4b Many fixes for recent LLVM dev tree API changes 2011-07-18 15:54:39 +01:00
Matt Pharr
fe7717ab67 Added shuffle() variant to the standard library that takes two
varying values and a permutation index that spans the concatenation
of the two of them (along the lines of SHUFPS...)
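
A small usage sketch (assuming the two-vector signature described above;
the function name is made up):

    float even_elements(float a, float b) {
        // The permutation indices span the concatenation of a and b:
        // values in [0, programCount) select from a, and values in
        // [programCount, 2*programCount) select from b.
        int perm = 2 * programIndex;
        return shuffle(a, b, perm);
    }
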
2011-07-02 08:43:35 +01:00
Matt Pharr
2709c354d7 Add support for broadcast(), rotate(), and shuffle() stdlib routines 2011-06-27 17:31:44 -07:00
Matt Pharr
b84167dddd Fixed a number of issues related to memory alignment; a number of places
were expecting vector-width-aligned pointers when in fact there's no
guarantee of that alignment in general.

Removed the aligned memory allocation routines from some of the examples;
they're no longer needed.

No perf. difference on Core2/Core i5 CPUs; older CPUs may see some
regressions.

Still need to update the documentation for this change and finish reviewing
alignment issues in Load/Store instructions generated by .cpp files.
2011-06-23 18:18:33 -07:00
Matt Pharr
18af5226ba Initial commit. 2011-06-21 12:48:50 -07:00