362 Commits

Author SHA1 Message Date
Matt Pharr
a322398c62 When emitting header files, put 'extern' declarations of globals used
in ispc code outside of the ispc namespace.  Fixes issue #64.
2011-08-26 10:03:06 -07:00
Matt Pharr
f22b3a25bd Update command-line processing and usage string now that we have a preprocessor on Windows.
We had been prohibiting Windows users from providing #definitions on the command
  line, which is the wrong thing to do ever since we switched to using the
  clang preprocessor.
2011-08-26 09:58:08 -07:00
Matt Pharr
b67498766e Big rewrite / improvement of target handling.
If no CPU is specified, use the host CPU type, not just a default of "nehalem".
Provide better features strings to the LLVM target machinery.
 -> Thus ensuring that LLVM doesn't generate SSE>2 instructions for the SSE2
    target (Fixes issue #82).
 -> Slight code improvements from using cmovs in generated code now
Use the llvm popcnt intrinsic for the SSE2 target (it generates code
  that doesn't call the popcnt instruction, now that we properly tell LLVM
  which instructions are and aren't available for SSE2.)
2011-08-26 09:54:45 -07:00
Matt Pharr
c340ff3893 Fixes to build with LLVM ToT 2011-08-25 08:53:56 +01:00
Matt Pharr
b0f59777d4 Silly bug: don't pass NULL to the print() stmt when we want a llvm::Value * that has the value NULL.
(This was causing crashes with print() statements with no additional values to
  be printed.)
2011-08-25 07:48:13 +01:00
Matt Pharr
e14208f489 Update to call DIBuilder::finalize() with LLVM 3.0 2011-08-24 22:28:20 +01:00
Matt Pharr
7756265503 Add double-pumped AVX target (i.e., run 16-wide). Not yet tested. 2011-08-20 11:28:22 +01:00
Matt Pharr
f841b775c3 Small bugfixes in AVX builtins 2011-08-20 09:09:55 +01:00
Matt Pharr
8c921544a0 fix broken test 2011-08-18 20:40:50 +01:00
Matt Pharr
fe54f1ad8e Fixes to build with latest LLVM ToT 2011-08-18 08:34:49 +01:00
Matt Pharr
74c2c8ae07 Linux build fixes 2011-08-17 07:08:44 -07:00
Matt Pharr
87ec7aa10d release notes, housekeeping for 1.0.6 release v1.0.6 2011-08-17 14:55:21 +01:00
Matt Pharr
206c851146 Various improvements to example task systems in examples/.
- Only have a single copy of all of the tasks_*.cpp sample implementations,
  stored in examples/.
- Reduce dynamic storage allocation and locking in task launch code paths.
- Don't have a hard limit on the number of tasks that can be launched on
  Windows (fix issue #85).
2011-08-17 14:31:45 +01:00
Matt Pharr
60bdf1ef8a Modify rt example to also do a set of runs with tasks + SPMD together. 2011-08-17 13:14:32 +01:00
Matt Pharr
d7662b3eb9 Use reduce_equal() in volume rendering example to avoid some gathers.
Modified this example to use reduce_equal() to see if all of the program
instances want to load the 8 sample values around the same voxel.  When
this is the case, we can just do 8 scalar loads, rather than needing to
do a fully general gather.  Once this check fails, it isn't done again,
since it's not likely to start succeeding in the future.  This gives
a ~10% speedup with the low-res data set, and basically no performance
difference with the high-res one.  (It makes sense that the lower-resolution
the voxel sampling, the longer all of the rays will stay in the same set
of voxels.)
2011-08-17 12:37:07 +01:00
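As a rough illustration of the check described in d7662b3eb9, the following Python sketch models program instances as list elements. It is not the ispc example code: `reduce_equal` here is a plain-Python stand-in, and `load_samples`, `grid`, `base`, and `offsets` are hypothetical names chosen for the sketch. The "stop checking once it fails" heuristic from the commit message is omitted for brevity.

```python
def reduce_equal(values, mask):
    """Scalar model of a cross-lane equality test.

    Returns (True, v) if every active lane holds the same value v,
    and (False, None) if no lane is active (cf. commit ff608eef71).
    """
    active = [v for v, m in zip(values, mask) if m]
    if not active:
        return False, None
    first = active[0]
    return all(v == first for v in active), first


def load_samples(grid, base, offsets, mask):
    """Load len(offsets) samples per active lane from a flat grid.

    Returns (samples, loads), where loads counts the memory reads
    performed, so the two code paths can be compared.
    """
    same, shared = reduce_equal(base, mask)
    if same:
        # All active lanes want the same voxel: a few scalar loads,
        # whose results are then shared by every active lane.
        scalar = [grid[shared + o] for o in offsets]
        samples = [list(scalar) if m else None for m in mask]
        return samples, len(offsets)
    # General case: each active lane gathers its own samples.
    samples = [[grid[b + o] for o in offsets] if m else None
               for b, m in zip(base, mask)]
    return samples, len(offsets) * sum(1 for m in mask if m)
```

With four active lanes and eight corner offsets, the uniform path performs 8 reads while the gather path performs 32, which is the saving the commit message is after.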
Matt Pharr
ecaa57c7c6 Add volume rendering example. (~2.3x speedup from SIMD vs serial code.) 2011-08-17 12:05:37 +01:00
Matt Pharr
fce183c244 Merge branch 'master' of github.com:ispc/ispc 2011-08-17 10:32:49 +01:00
Matt Pharr
7a92f8b3f9 Add MSVC build support for stencil example 2011-08-17 02:28:49 -07:00
Matt Pharr
96af08e789 Print notices about image files being written 2011-08-16 06:31:26 +01:00
Matt Pharr
cb29c10660 Fix tests on Windows: need arch=x86 since ispc_test.exe is a 32-bit app 2011-08-15 08:25:08 -07:00
Matt Pharr
04c93043d6 Target handling fixes.
Set the Module's target appropriately when it's first created.
Compile separate 32 and 64 bit versions of the builtins-c bitcode
  and load the appropriate one based on the target we're compiling
  for.
2011-08-15 16:03:50 +01:00
Matt Pharr
46037c7a11 Merge branch 'master' of github.com:ispc/ispc 2011-08-15 12:44:38 +01:00
Matt Pharr
c570108026 Fix linux build of stencil example 2011-08-15 04:44:17 -07:00
Matt Pharr
230a0fadea Attempt to generate debug info for task parameters. 2011-08-15 12:31:56 +01:00
Matt Pharr
87cf05e0d2 Improve performance of 64-bit reduce_equal implementations.
Just pulling out the elements and doing a set of scalar equality tests
is the best approach for those (nearly 2x better than the rotate and
vector equality check that we use for 32-bit stuff).
2011-08-14 07:39:05 +01:00
Matt Pharr
ff608eef71 Change reduce_equal to return false if no instances are executing 2011-08-14 07:11:45 +01:00
Matt Pharr
f868a63064 Add support for scan operations across program instances (add, and, or). 2011-08-13 20:11:41 +01:00
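For readers unfamiliar with scans across program instances, here is a small scalar model in Python. It assumes exclusive-scan semantics (each lane receives the combination of all lower-numbered lanes); the commit message itself does not say whether the operations are inclusive or exclusive, and the function name `exclusive_scan` is chosen here for illustration only.

```python
from operator import add, and_, or_


def exclusive_scan(values, op, identity):
    """Return a list where element i combines values[0..i-1] with op.

    Element 0 is the identity, since no lanes precede it.
    """
    out = []
    acc = identity
    for v in values:
        out.append(acc)      # result for this lane excludes its own value
        acc = op(acc, v)     # fold this lane's value in for later lanes
    return out


# The identities differ per operation: 0 for add and or, all-ones for and.
sums = exclusive_scan([1, 2, 3, 4], add, 0)
ands = exclusive_scan([0b1111, 0b0111, 0b0011], and_, -1)
ors = exclusive_scan([1, 2, 4], or_, 0)
```

A typical use of the add variant is computing per-lane output offsets when each lane wants to write a different number of items to a packed buffer.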
Matt Pharr
c74116aa24 Fix crasher with malformed program 2011-08-12 07:47:17 +01:00
Matt Pharr
8c534d4d74 Add reduce_equal() function to standard library. 2011-08-10 15:55:55 -07:00
Matt Pharr
d821a11c7c Fix min/max for integer types with AVX. 2011-08-04 06:24:20 -07:00
Matt Pharr
8a138eeb5a vim syntax highlighting for ispc from <andreas.wendleder@googlemail.com> 2011-08-04 05:49:28 -07:00
Matt Pharr
137ea7bde6 Rename semaphore filename to be more generic 2011-08-04 05:28:00 -07:00
Matt Pharr
e05b3981d9 Add stencil example 2011-08-03 13:49:02 -07:00
Matt Pharr
a5a133ccce Do more iterations of RNG test to let result converge to bounds. 2011-08-03 13:44:49 -07:00
Matt Pharr
0ac4f7b620 Add various prefetch functions to the standard library. 2011-08-03 13:31:45 -07:00
Matt Pharr
467f1e71d7 Add fast versions of the float<-->half conversion routines in the stdlib.
These get slightly wrong results for zero and the denorms and also
don't handle the Inf/NaN stuff correctly, but are much more efficient
than the full versions of these routines.
2011-08-03 15:58:42 +01:00
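The trade-off described in 467f1e71d7 can be illustrated with a commonly used shift-and-rebias trick for half-to-float conversion. This Python sketch is an assumption about the general technique, not a copy of the ispc stdlib routine, and `fast_half_to_float` is a hypothetical name. It is exact for normalized halves but, as the commit message warns for the fast versions, it mishandles zero, denormals, and Inf/NaN.

```python
import struct


def fast_half_to_float(h):
    """Convert a 16-bit half (given as an int bit pattern) to a float.

    Moves the sign to bit 31, shifts exponent+mantissa into place, and
    adds the exponent bias difference (127 - 15 = 112). No special
    cases are handled, which is what makes it cheap.
    """
    sign = (h & 0x8000) << 16
    rest = ((h & 0x7FFF) << 13) + (112 << 23)
    return struct.unpack("<f", struct.pack("<I", sign | rest))[0]


def exact_half_to_float(h):
    """Reference conversion using Python's built-in half-float support."""
    return struct.unpack("<e", struct.pack("<H", h))[0]
```

Comparing the two shows where the fast path goes wrong: a zero input comes back as a small nonzero value, because the bias is added unconditionally.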
Matt Pharr
a2996ed5d9 More efficient implementation of frandom() in stdlib 2011-08-03 14:28:06 +01:00
Matt Pharr
7d7dd2b204 Merge branch 'master' of github.com:ispc/ispc 2011-08-01 12:16:33 +01:00
Matt Pharr
172794ba5f Release notes and doxygen update for 1.0.5 release v1.0.5 2011-08-01 12:15:42 +01:00
Matt Pharr
9ee6f86c73 Fix Windows build of ispc_test 2011-08-01 04:05:37 -07:00
Matt Pharr
a4bb6b5520 Add new example with implementation of Perlin Noise
~4.2x speedup versus serial on OSX / gcc.
~2.9x speedup versus serial on Windows / MSVC.
2011-08-01 10:33:18 +01:00
Matt Pharr
a552927a6a Cleanup implementation of target builtins code.
- Renamed stdlib-sse.ll to builtins-sse.ll (etc.) in an attempt to better indicate
the fact that the stuff in those files has a role beyond implementing stuff for
the standard library.
- Moved declarations of the various __pseudo_* functions from being done with LLVM
API calls in builtins.cpp to just straight up declarations in LLVM assembly
language in builtins.m4.  (Much less code to do it this way, and more clear what's
going on.)
2011-08-01 05:58:43 +01:00
Matt Pharr
2d52c732f1 Doc updates for recent new swizzle support 2011-07-31 19:03:55 +02:00
Matt Pharr
25676d5643 When --debug is specified, only print the entire module bitcode twice.
Fixes issue #77.  Previously, it dumped out the entire module every time
a new function was defined, which got to be quite a lot of output by
the time the stdlib functions were all added!
2011-07-29 07:26:37 +02:00
Matt Pharr
158bd6ef9e Fix bug with initializer expression lists for global/static array-typed variables. 2011-07-28 11:38:56 +01:00
Matt Pharr
7f662de6e3 Emit debug declaration of variables before the instructions for their initializers. 2011-07-28 11:05:02 +01:00
Matt Pharr
80ca02af58 Add missing #include, fix Linux build. Fixes issue #75. 2011-07-27 10:51:13 +01:00
Matt Pharr
8aea4a836d Fix crash when trying to generate debug info with program source from stdin 2011-07-27 07:42:47 +01:00
Matt Pharr
922dbdec06 Fixes to build with LLVM top-of-tree 2011-07-26 10:57:49 +01:00
Matt Pharr
e230d2c9ca Make the target argument work in the run_tests.sh script 2011-07-26 10:57:39 +01:00