Commit Graph

634 Commits

Author SHA1 Message Date
Jean-Luc Duprat
5d67252ed0 Python scripts now compatible with both 2.x and 3.x releases of python 2012-01-09 13:56:05 -08:00
Matt Pharr
5134de71c0 Fix Windows build (inttypes.h not available) 2012-01-09 09:05:20 -08:00
Matt Pharr
2be1251c70 Fix Makefile on OSX (uname -o not supported) 2012-01-09 07:40:47 -08:00
Matt Pharr
c0161aa17f Merge pull request #154 from palacaze/mingw
Mingw support
2012-01-09 07:37:02 -08:00
Pierre-Antoine Lacaze
b683aa11b1 Fix linking under mingw, libdl is Linux only. 2012-01-09 10:52:46 +01:00
Pierre-Antoine Lacaze
2654bb0112 Handle python installations in non-standards locations. 2012-01-09 10:29:54 +01:00
Pierre-Antoine Lacaze
d8728104b4 Handle the case whereby BUILD_DATE is already defined. 2012-01-09 10:29:16 +01:00
Pierre-Antoine Lacaze
0be1b70fba Mingw has strtoull, make use of it. 2012-01-09 10:28:52 +01:00
Pierre-Antoine Lacaze
a0e9793de3 Shut up warning wrt CONSOLE_SCREEN_BUFFER_INFO initialization 2012-01-09 10:19:46 +01:00
Pierre-Antoine Lacaze
da9200fcee Fix alloca use on mingw. 2012-01-09 10:19:09 +01:00
Pierre-Antoine Lacaze
54e8e8022b suppress warnings about long long arguments 2012-01-09 10:18:39 +01:00
Pierre-Antoine Lacaze
d84cf781da Mingw does not have sysconf, use the msc way of finding processors. 2012-01-09 09:45:40 +01:00
Pierre-Antoine Lacaze
002f27a30f Implement vasprintf and asprintf for platforms lacking them. 2012-01-09 09:44:58 +01:00
Matt Pharr
86d88e9773 run_tests.py: fix to use multiple cores on windows, ignore non ispc inputs 2012-01-08 15:29:20 -08:00
Matt Pharr
fda00afe6e Rename .txt files in docs to .rst (which is what they actually are). 2012-01-08 14:11:04 -08:00
Matt Pharr
be0c77d556 Detect more gather/scatter cases that are actually base+offsets.
We now recognize patterns like (ptr + offset1 + offset2) as being
cases we can handle with the base_offsets variants of the gather/scatter
functions.  (This can come up with multidimensional array indexing,
for example.)

Issue #150.
2012-01-08 14:06:44 -08:00
Matt Pharr
0ed11a7832 Compute SizeOf() and OffsetOf() at compile time in more cases with the generic target.
Really, we only have to be careful about the case where there is a vector of bools 
(i.e. a mask) involved, since the size of that isn't known at compile-time.
(Currently, at least.)
2012-01-08 14:06:44 -08:00
Matt Pharr
ff6971fb15 Use Assert() rather than assert() 2012-01-08 14:06:44 -08:00
Matt Pharr
5b4dbc8167 Fix build of aobench_instrumented example on OSX/Linux 2012-01-08 10:02:43 -08:00
Jean-Luc Duprat
59f4c9985e Python files compatible with python 3 2012-01-06 16:56:09 -08:00
Matt Pharr
8da9be1a09 Add support for 'k', 'M', and 'G' suffixes to integer constants.
(Denoting units of 1024, 1024*1024, and 1024*1024*1024, respectively.)

Issue #128.
2012-01-06 14:47:47 -08:00
Matt Pharr
11033e108e Fix bug that prohibited assignments with pointer expressions on the LHS
Previously, code like "*(ptr+1) = foo" would claim that the LHS was invalid
for an assignment expression.

Issue #138.
2012-01-06 14:21:03 -08:00
Matt Pharr
4f97262cf2 Support function declarations in the definitions of other functions.
As part of this, function declarations are no longer scoped (this is permitted
by the C standard, as it turns out.)  So code like:

   void foo() { void bar(); }
   void bat() { bar(); }

Compiles correctly; the declaration of bar() in foo() is still available in the
definition of bar().

Fixes issue #129.
2012-01-06 13:50:10 -08:00
Matt Pharr
9b68b9087a Fix crash with anonymous function parameters in function definitions.
Issue #135.
2012-01-06 13:28:06 -08:00
Matt Pharr
15cc812e37 Add notion of "unbound" variability to the type system.
Now, when a type is declared without an explicit "uniform" or "varying"
qualifier, its variability is unbound; depending on the context of the
declaration, the variability is later finalized.

Currently, in almost all cases, types with unbound variability are
resolved to varying types; the one exception is typecasts like:
"(int)1"; in this case, the fact that (int) has unbound variability
carries through to the TypeCastExpr, which in turn notices that the
expression being type cast has uniform type and in turn will resolve
(int) to (uniform int).

Fixes issue #127.
2012-01-06 11:52:58 -08:00
Matt Pharr
71317e6aa6 Fix bug in gather/scatter optimization passes.
When flattening chains of insertelement instructions, we didn't
handle the case where the initial insertelement was to a constant
vector (with one value set and the other values undef).

Also generalized the "do all of the instances access the same location"
check to handle the case where some of them are accessing undef
locations; these are ignored in this check, as they should correspond to the
mask being off for that lane anyway.

Fixes issue #149.
2012-01-06 09:19:18 -08:00
Matt Pharr
1abaaee73e Fix bug where we'd sometimes inadvertently lose cv-qualifiers on pointers.
Fixes issue #142.
2012-01-06 08:41:01 -08:00
Matt Pharr
78c6d3c02f Add initial support for 'goto' statements.
ispc now supports goto, but only under uniform control flow--i.e.
it must be possible for the compiler to statically determine that
all program instances will follow the goto.  An error is issued at
compile time if a goto is used when this is not the case.
2012-01-05 12:22:36 -08:00
Matt Pharr
48e9d4af39 Emit code for #includes in emitted C++ code all at the start of the file. 2012-01-05 12:22:35 -08:00
Matt Pharr
cb7ad371c6 Run tests using -O2.
Disconcertingly, this seems to fix some gcc-only crashes with the
generic-16 target (specifically, for half.ispc and for goto-[23].ispc--
those tests run fine with other compilers with generic-16.)
2012-01-05 12:22:35 -08:00
Matt Pharr
2951589825 Redo readme in better-looking rst form 2012-01-04 15:32:29 -08:00
Matt Pharr
f23dc5366a Updates to run_tests.py script.
Add support for the generic targets (using the headers in examples/intrinsics
if none is provided.)

Provide option to run valgrind on the compiled code.

Print a list of all failing tests at the end.
2012-01-04 12:59:03 -08:00
Matt Pharr
e3341176c5 Redo makefiles for the examples.
They're all based off a common examples/common.mk file, so that individual
makefiles are quite simple now.

The common.mk file also provides targets to build the examples using C++
output with the generic-16h or sse4.h files.  These targets don't run by
default, but do run if 'make all' is run.
2012-01-04 12:59:03 -08:00
Matt Pharr
8938e14442 Add support for emitting ~generic vectorized C++ code.
The compiler now supports an --emit-c++ option, which generates generic
vector C++ code.  To actually compile this code, the user must provide
C++ code that implements a variety of types and operations (e.g. adding
two floating-point vector values together, comparing them, etc).

There are two examples of this required code in examples/intrinsics:
generic-16.h is a "generic" 16-wide implementation that does all required
with scalar math; it's useful for demonstrating the requirements of the
implementation.  Then, sse4.h shows a simple implementation of a SSE4
target that maps the emitted function calls to SSE intrinsics.

When using these example implementations with the ispc test suite,
all but one or two tests pass with gcc and clang on Linux and OSX.
There are currently ~10 failures with icc on Linux, and ~50 failures with
MSVC 2010.  (To be fixed in coming days.)

Performance varies: when running the examples through the sse4.h
target, some have the same performance as when compiled with --target=sse4
from ispc directly (options), while noise is 12% slower, rt is 26%
slower, and aobench is 2.2x slower.  The details of this haven't yet been
carefully investigated, but will be in coming days as well.

Issue #92.
2012-01-04 12:59:03 -08:00
Matt Pharr
4151778f5e Modify SizeOf() and StructOffset() to not compute value based on target for generic targets.
Specifically, we want to be able to late-bind on whether the mask is i32s or i1s, so if there's
any chance of ambiguity, we emit code that does the "GEP from a NULL base pointer" trick to
compute the value later in compilation.
2012-01-04 12:59:03 -08:00
Matt Pharr
23b85cd88d Remove broken debugging code. 2012-01-04 12:59:03 -08:00
Matt Pharr
234e5cd3e1 Use vector select for masked store blend if building with LLVM3.1 2012-01-04 12:59:03 -08:00
Matt Pharr
f75c94a8f1 Have aos/soa and broadcast/shuffle/rotate functions provided by the target.
The SSE/AVX targets use the old versions from util.m4, but these functions are
now passed through to the generic targets.
2012-01-04 12:59:03 -08:00
Matt Pharr
848a432640 Fix various small things that were broken with single-bit-per-lane masks.
Also small cleanups to declarations, "no captures" added, etc.
2012-01-04 12:59:03 -08:00
Matt Pharr
dea13979e0 Fix bug in lIs248Splat() in opt.cpp 2012-01-04 11:55:02 -08:00
Matt Pharr
052d34bf5b Various cleanups to optimization code.
Stop using the PassManagerBuilder but add all of the passes directly in code here.
This currently leads to no different behavior, but was useful with experimenting
with disabling the SROA pass when compiling to generic targets.
2012-01-04 11:54:44 -08:00
Matt Pharr
d4c5e82896 Add VSelMovMsk optimization pass.
Various peephole improvements to vector select instructions.
2012-01-04 11:52:27 -08:00
Matt Pharr
562d61caff Added masked load optimization pass.
This pass handles the "all on" and "all off" mask cases appropriately.

Also renamed load_masked stuff in built-ins to masked_load for consistency with
masked_store.
2012-01-04 11:51:26 -08:00
Matt Pharr
75f18c7c66 Add buildispc.bat script for just building the compiler on windows. 2012-01-04 11:44:19 -08:00
Matt Pharr
5d35349dc9 We were (unintentionally) only using structural equivalence to compare struct types.
Now we require that the struct name match for two struct types to be the same.
Added a test to check this.
(Also removed a stale test, movmsk-opt.ispc)
2012-01-04 11:44:00 -08:00
Matt Pharr
1a81173c93 Fix examples/options Makefile to use -O3 for serial builds.
Amazingly, it has been using just -g since the initial commit. :-(
2012-01-03 19:53:45 -08:00
Matt Pharr
1d9201fe3d Add "generic" 4, 8, and 16-wide targets.
When used, these targets end up with calls to undefined functions for all
of the various special vector stuff ispc needs to compile ispc programs
(masked store, gather, min/max, sqrt, etc.).

These targets are not yet useful for anything, but are a step toward
having an option to C++ code with calls out to intrinsics.

Reorganized the directory structure a bit and put the LLVM bitcode used
to define target-specific stuff (as well as some generic built-ins stuff)
into a builtins/ directory.

Note that for building on Windows, it's now necessary to set a LLVM_VERSION
environment variable (with values like LLVM_2_9, LLVM_3_0, LLVM_3_1svn, etc.)
2011-12-19 13:46:50 -08:00
Matt Pharr
6dbb15027a Take advantage of x86's free "scale by 2, 4, or 8" in addressing calculations
When loading from an address that's computed by adding two registers
together, x86 can scale one of them by 2, 4, or 8, for free as part
of the addressing calculation.  This change makes the code generated
for gather and scatter use this.

For the cases where gather/scatter is based on a base pointer and
an integer offset vector, the GSImprovementsPass looks to see if the
integer offsets are being computed as 2/4/8 times some other value.
If so, it extracts the 2x/4x/8x part and leaves the rest there as
the the offsets.  The {gather,scatter}_base_offsets_* functions take
an i32 scale factor, which is passed to them, and then they carefully
generate IR so that it hits LLVM's pattern matching for these scales.

This is particular win on AVX, since it saves us two 4-wide integer
multiplies.

Noise runs 14% faster with this.
Issue #132.
2011-12-16 15:55:44 -08:00
Matt Pharr
f23d030e43 Transition EstimateCost() AST traversal to WalkAST() as well. 2011-12-16 12:24:51 -08:00
Matt Pharr
701334ccf2 Transition type checking to use WalkAST() infrastructure. 2011-12-16 12:24:51 -08:00