Commit Graph

140 Commits

Author SHA1 Message Date
Matt Pharr
777343331e Print numeric version number with --version. 2012-03-19 14:41:25 -07:00
Matt Pharr
db5db5aefd Add native support for (AO)SOA data layout.
There's now a SOA variability class (in addition to uniform,
varying, and unbound variability); the SOA factor must be a
positive power of 2.

When applied to a type, the leaf elements of the type (i.e.
atomic types, pointer types, and enum types) are widened out
into arrays of the given SOA factor.  For example, given

struct Point { float x, y, z; };

Then "soa<8> Point" has a memory layout of "float x[8], y[8],
z[8]".

Furthermore, array indexing syntax has been augmented so that
when indexing into arrays of SOA-variability data, the two-stage
indexing (first into the array of soa<> elements and then into
the leaf arrays of SOA data) is performed automatically.
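
For instance, a minimal sketch building on the example above (the
array size and accessor are illustrative):

soa<8> Point pts[16];        // in memory: 16 groups of x[8], y[8], z[8]

float get_x(int i) {
    return pts[i].x;         // compiled as pts[i / 8].x[i % 8]
}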
2012-03-05 09:58:10 -08:00
Matt Pharr
73bf552cd6 Add support for coalescing memory accesses from gathers.
There are two related optimizations that happen now.  (These
currently only apply to gathers where the mask is known to be
all on, and to gathers that are accessing 32-bit sized elements,
but both of these may be generalized in the future.)

First, for any single gather, we are now more flexible in mapping it
to individual memory operations.  Previously, we would only map it
either to a general gather (one scalar load per SIMD lane) or to an
unaligned vector load (if the program instances could be determined
to be accessing a sequential set of locations in memory.)

Now, we are able to break gathers into scalar, 2-wide (i.e. 64-bit),
4-wide, or 8-wide loads.  Further, we now generate code that shuffles
these loads around.  Doing fewer, larger loads in this manner, when
possible, can be more efficient.

Second, we can coalesce memory accesses across multiple gathers. If 
we have a series of gathers without any memory writes in the middle,
then we try to analyze their reads collectively and choose an efficient
set of loads for them.  Not only does this help if different gathers
reuse values from the same location in memory, but it's specifically
helpful when data with AOS layout is being accessed; in this case,
we're often able to generate wide vector loads and appropriate shuffles
automatically.
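
As a hypothetical illustration (the struct and array are not from
this change), AOS data accessed like this:

struct Pt { float x, y, z; };   // AOS layout: x0 y0 z0 x1 y1 z1 ...
uniform Pt pts[64];

float sum_components(int i) {
    // three gathers over adjacent, interleaved locations with no
    // intervening writes; they are candidates for coalescing into
    // a few wide vector loads plus shuffles
    return pts[i].x + pts[i].y + pts[i].z;
}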
2012-02-10 13:10:39 -08:00
Matt Pharr
bb8e13e3c9 Add support for -I command-line argument to specify #include search directories. 2012-02-07 08:39:01 -08:00
Matt Pharr
3efbc71a01 Add fuzz testing of input programs.
When the --fuzz-test command-line option is given, the input program
will be randomly perturbed by the lexer in an effort to trigger
assertions or crashes in the compiler (neither of which should ever
happen, even for malformed programs.)
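
A sketch of the usage (the input file name is illustrative):

ispc --fuzz-test foo.ispc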
2012-02-06 15:34:47 -08:00
Matt Pharr
724a843bbd Add --quiet option to suppress all diagnostic output 2012-02-06 12:39:09 -08:00
Matt Pharr
664dc3bdda Add support for "new" and "delete" to the language.
Issue #139.
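
A minimal sketch, assuming C++-style syntax with uniform/varying
qualifiers (the allocation size is illustrative):

void f() {
    uniform float * uniform buf = uniform new uniform float[16];
    buf[0] = 1.0;
    delete buf;              // assumed C++-like delete form
}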
2012-01-27 14:47:06 -08:00
Matt Pharr
b67446d998 Add support for "switch" statements.
Switches with both uniform and varying "switch" expressions are
supported.  Switch statements with varying expressions and very
large numbers of labels may not perform well; some issues to be
filed shortly will track opportunities for improving these.
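
A sketch of a varying switch (the values are illustrative):

float classify(int key) {    // key is varying by default
    switch (key) {           // instances may take different cases
    case 0:
        return 0.0;
    default:
        return 1.0;
    }
}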
2012-01-11 09:16:31 -08:00
Matt Pharr
78c6d3c02f Add initial support for 'goto' statements.
ispc now supports goto, but only under uniform control flow--i.e.
it must be possible for the compiler to statically determine that
all program instances will follow the goto.  An error is issued at
compile time if a goto is used when this is not the case.
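
A minimal sketch:

float repeat(float x) {
    uniform int i = 0;
again:
    x += 1.0;
    if (++i < 4)             // uniform condition: all instances branch together
        goto again;          // ok; under varying control flow this is an error
    return x;
}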
2012-01-05 12:22:36 -08:00
Matt Pharr
4151778f5e Modify SizeOf() and StructOffset() to not compute value based on target for generic targets.
Specifically, we want to be able to late-bind on whether the mask is i32s or i1s, so if there's
any chance of ambiguity, we emit code that does the "GEP from a NULL base pointer" trick to
compute the value later in compilation.
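
In C-style terms, this is the classic offsetof construction (macro
names hypothetical):

#define STRUCT_OFFSET(T, member) ((int64)&(((T *)NULL)->member))
#define SIZE_OF(T)               ((int64)(((T *)NULL) + 1))

emitted as getelementptr sequences that LLVM can fold to constants
once the mask representation has been chosen.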
2012-01-04 12:59:03 -08:00
Matt Pharr
1d9201fe3d Add "generic" 4, 8, and 16-wide targets.
When used, these targets end up with calls to undefined functions for all
of the various special vector stuff ispc needs to compile ispc programs
(masked store, gather, min/max, sqrt, etc.).

These targets are not yet useful for anything, but are a step toward
having an option to emit C++ code with calls out to intrinsics.

Reorganized the directory structure a bit and put the LLVM bitcode used
to define target-specific stuff (as well as some generic built-ins stuff)
into a builtins/ directory.

Note that for building on Windows, it's now necessary to set an LLVM_VERSION
environment variable (with values like LLVM_2_9, LLVM_3_0, LLVM_3_1svn, etc.)
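
For example, when building against LLVM 3.0 (cmd syntax):

set LLVM_VERSION=LLVM_3_0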
2011-12-19 13:46:50 -08:00
Matt Pharr
f30a5dea79 Linux build fixes 2011-12-15 12:23:26 -08:00
Matt Pharr
8d1b77b235 Have assertion macro and FATAL() text ask user to file a bug, provide URL to do so.
Switch to Assert() from assert() to make it clear it's not the C stdlib one we're
using any more.
2011-12-15 11:11:16 -08:00
Matt Pharr
46bfef3fce Add option to turn off codegen improvements when mask 'all on' is statically known. 2011-12-11 16:16:36 -08:00
Matt Pharr
765d86076f Basic support for AVX2 when building with LLVM 3.1svn
For now this target just uses the same builtins-*.ll files as the
regular AVX1 target.  Once the gather intrinsic is available from
LLVM, we'll want to have custom target files that call out to that
for gathers. (The integer min/max intrinsics should be wired up to
the __{min,max}_varying_{int,uint}*() builtins at that point as
well.)
2011-12-06 08:20:53 -08:00
Matt Pharr
c995902796 Add --werror flag to treat warnings as errors.
The specific need for it was so that tests in tests_errors
can test to see if a desired diagnostic warning is issued
(like ptrcast-lose-info does.)
2011-11-30 05:51:53 -08:00
Matt Pharr
e780662a3f Issue error if unsupported version of LLVM is used. 2011-11-29 17:25:35 -08:00
Matt Pharr
975db80ef6 Add support for pointers to the language.
Pointers can be either uniform or varying, and behave correspondingly.
e.g.: "uniform float * varying" is a varying pointer to uniform float
data in memory, and "float * uniform" is a uniform pointer to varying
data in memory.  Like other types, pointers are varying by default.
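
A sketch of the two pointer shapes (array and index illustrative):

uniform float a[64];

float load(int i) {
    uniform float * uniform pu = &a[0];    // one address shared by all instances
    uniform float * varying pv = &a[i];    // a per-instance address
    return *pu + *pv;                      // *pv may compile to a gather
}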

Pointer-based expressions, & and *, sizeof, ->, pointer arithmetic,
and the array/pointer duality all behave as in C.  Array arguments
to functions are converted to pointers, also like C.

There is a built-in NULL for a null pointer value; conversion from
compile-time constant 0 values to NULL still needs to be implemented.

Other changes:
- Syntax for references has been updated to be C++ style; a useful
  warning is now issued if the "reference" keyword is used.
- It is now illegal to pass a varying lvalue as a reference parameter
  to a function; references are essentially uniform pointers.
  This case had previously been handled via special-case call-by-value-return
  code.  That path has been removed, now that varying pointers
  are available to handle this use case (and much more).
- Some stdlib routines have been updated to take pointers as
  arguments where appropriate (e.g. prefetch and the atomics).
  A number of others still need attention.
- All of the examples have been updated
- Many new tests

TODO: documentation
2011-11-27 13:09:59 -08:00
Matt Pharr
068ea3e4c4 Better SourcePos reporting for gathers/scatters 2011-11-21 10:26:53 -08:00
Matt Pharr
79684a0bed Add support for running tests that are expected to fail
Also add should-fail tests that exercise const and decl
initializers
2011-11-14 08:45:41 -08:00
Matt Pharr
f8eb100c60 Use llvm TargetData to find object sizes, offsets.
Previously, to compute the size of objects and the offsets of
elements within structs, we used the trick of doing a getelementptr
with a NULL base pointer and then casting the result to an int32/64.
However, since we actually know the target we're compiling for at
compile time, we can use corresponding methods from TargetData to
get these values directly.

This mostly cleans up code, but may make some of the gather/scatter
lowering to loads/stores optimizations work better in the presence
of structures.
2011-11-06 19:31:19 -08:00
Matt Pharr
afcd42028f Add support for function pointers.
Both uniform and varying function pointers are supported; when a function
is called through a varying function pointer, each unique function pointer
value across the running program instances is called once for the set of
active program instances that want to call it.
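
A hypothetical sketch (function names and the C-style declarator
are assumed):

static float square(float x) { return x * x; }
static float negate(float x) { return -x; }

float apply(float x) {
    float (*fp)(float) = square;   // varying function pointer
    if ((programIndex & 1) != 0)
        fp = negate;               // odd instances select a different target
    // each unique target runs once, with the matching execution mask
    return fp(x);
}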
2011-11-03 16:14:14 -07:00
Matt Pharr
7d6f89c8d2 Improvements to source file position tracking.
Be better about tracking the full extent of expressions in the parser;
this leads to more intelligible error messages when we indicate where
exactly the error happened.
2011-11-03 16:14:14 -07:00
Matt Pharr
6084d6aeaf Added disable-handle-pseudo-memory-ops option. 2011-10-31 08:29:13 -07:00
Matt Pharr
074cbc2716 Fix #ifdefs to catch LLVM 3.1svn now as well 2011-10-19 14:01:19 -07:00
Matt Pharr
f45ab0744e Significantly reduce the tendrils of DeclSpecs/Declarator/Declaration code
The stuff in decl.h/decl.cpp is messy, largely due to its close mapping
to C-style variable declarations.  This checkin has updated code throughout
all of the declaration statement, variable, and function code that operates
on symbols and types directly.  Thus, Decl* related stuff is now localized
to decl.h/decl.cpp and the parser.

Issue #13.
2011-10-18 15:37:29 -07:00
Matt Pharr
fc2954419d Update cost model to include "if" overhead in "if" statement calculation. 2011-10-15 13:52:08 -07:00
Matt Pharr
422b8268a9 Add assert() statement support. Issue #106. 2011-10-15 13:50:05 -07:00
Matt Pharr
f9c67ff806 Explicit representation of ASTs for all the functions in a compile unit.
Added AST and Function classes.
Now, we parse the whole file and build up the AST for all of the
  functions in the Module before we emit IR for the functions (vs. before,
  when we generated IR along the way as we parsed the source file.)
2011-10-06 15:35:27 -07:00
Matt Pharr
06975bc7ab Add support for compiling to multiple targets.
If a flag along the lines of "--target=sse4,avx-x2" is provided on the command-line,
then the program will be compiled for each of the given targets, with a separate
output file generated for each one.  Further, an output file with dispatch functions
that check the current system's CPU and then choose the best available variant
is also created.
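
A sketch of the usage (file names illustrative):

ispc --target=sse4,avx-x2 foo.ispc -o foo.o

One object file is written per target, plus one containing the
dispatch functions.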

Issue #11.
2011-10-04 16:01:55 -07:00
Matt Pharr
9921b8e530 Predicated 'if' statement performance improvements.
Go back to running both sides of 'if' statements with masking and without
branching if we can determine that the code is relatively simple (as per
the simple cost model), and is safe to run even if the mask is 'all off'.
This gives a bit of a performance improvement for some of the examples
(most notably, the ray tracer), and is the code that one wants generated
in this case anyhow.
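
An illustrative (hypothetical) case:

float absval(float x) {
    float y;
    if (x > 0.0)     // both sides are cheap and safe under an all-off mask,
        y = x;       // so they can run predicated, without branching on
    else             // the mask
        y = -x;
    return y;
}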
2011-09-19 09:54:09 -07:00
Matt Pharr
ca87579f23 Add a very simple cost model to estimate runtime cost of running code.
This is currently only used to decide whether it's worth doing an
"are all lanes running" check at the start of functions--for small
functions, it's not worth the overhead.

The cost is estimated relatively early in compilation (e.g. before
we know if an array access is a scatter/gather or not, before
constant folding, etc.), so there are many known shortcomings.
2011-09-16 15:09:17 -07:00
Matt Pharr
30f9dcd4f5 Unroll loops by default, add --opt=disable-loop-unroll to disable.
Issue #78.
2011-09-13 15:37:18 -07:00
Matt Pharr
83f22f1939 Add experimental --fast-masked-vload flag for SSE. 2011-09-12 12:29:33 -07:00
Matt Pharr
c76ef7b174 Add command-line option to specify position-independent codegen 2011-09-06 11:12:43 -07:00
Matt Pharr
b67498766e Big rewrite / improvement of target handling.
If no CPU is specified, use the host CPU type, not just a default of "nehalem".
Provide better features strings to the LLVM target machinery.
 -> Thus ensuring that LLVM doesn't generate SSE>2 instructions for the SSE2
    target (Fixes issue #82).
 -> Slight code improvements from using cmovs in generated code now
Use the llvm popcnt intrinsic for the SSE2 target (it generates code
  that doesn't use the popcnt instruction now that we properly tell LLVM
  which instructions are and aren't available for SSE2.)
2011-08-26 09:54:45 -07:00
Matt Pharr
04c93043d6 Target handling fixes.
Set the Module's target appropriately when it's first created.
Compile separate 32- and 64-bit versions of the builtins-c bitcode
  and load the appropriate one based on the target we're compiling
  for.
2011-08-15 16:03:50 +01:00
Matt Pharr
f0f876c3ec Add support for enums. 2011-07-17 16:43:05 +02:00
Matt Pharr
17e5c8b7c2 Fix LLVM 2.9 build. 2011-07-13 09:24:02 +01:00
Matt Pharr
18af5226ba Initial commit. 2011-06-21 12:48:50 -07:00