151 Commits

Author SHA1 Message Date
Matt Pharr
87ec7aa10d release notes, housekeeping for 1.0.6 release v1.0.6 2011-08-17 14:55:21 +01:00
Matt Pharr
206c851146 Various improvements to example task systems in examples/.
- Only have a single copy of all of the tasks_*.cpp sample implementations,
  stored in examples/.
- Reduce dynamic storage allocation and locking in task launch code paths.
- Don't have a hard limit of the number of tasks that can be launched on
  Windows (fix issue #85).
2011-08-17 14:31:45 +01:00
Matt Pharr
60bdf1ef8a Modify rt example to also do a set of runs with tasks + SPMD together. 2011-08-17 13:14:32 +01:00
Matt Pharr
d7662b3eb9 Use reduce_equal() in volume rendering example to avoid some gathers.
Modified this example to use reduce_equal() to see if all of the program
instances want to load the 8 sample values around the same voxel.  When
this is the case, we can just do 8 scalar loads, rather than needing to
do a fully general gather.  Once this check fails, it isn't done again,
since it's not likely to start succeeding in the future.  This gives
a ~10% speedup with the low-res data set, and basically no performance
difference with the high-res one.  (It makes sense that the lower-resolution
the voxel sampling, the longer all of the rays will stay in the same set
of voxels.)
2011-08-17 12:37:07 +01:00
Matt Pharr
ecaa57c7c6 Add volume rendering example. (~2.3x speedup from SIMD vs serial code.) 2011-08-17 12:05:37 +01:00
Matt Pharr
fce183c244 Merge branch 'master' of github.com:ispc/ispc 2011-08-17 10:32:49 +01:00
Matt Pharr
7a92f8b3f9 Add MSVC build support for stencil example 2011-08-17 02:28:49 -07:00
Matt Pharr
96af08e789 Print notices about image files being written 2011-08-16 06:31:26 +01:00
Matt Pharr
cb29c10660 Fix tests on Windows: need arch=x86 since ispc_test.exe is a32-bit app 2011-08-15 08:25:08 -07:00
Matt Pharr
04c93043d6 Target handling fixes.
Set the Module's target appropriately when it's first created.
Compile separate 32 and 64 bit versions of the builtins-c bitcocde
  and load the appropriate one based on the target we're compiling
  for.
2011-08-15 16:03:50 +01:00
Matt Pharr
46037c7a11 Merge branch 'master' of github.com:ispc/ispc 2011-08-15 12:44:38 +01:00
Matt Pharr
c570108026 Fix linux build of stencil example 2011-08-15 04:44:17 -07:00
Matt Pharr
230a0fadea Attempt to generate debug info for task parameters. 2011-08-15 12:31:56 +01:00
Matt Pharr
87cf05e0d2 Improve performance of 64-bit reduce_equal implementations.
Just pulling out the elements and doing a set of scalar equality tests
is the best approach for those (nearly 2x better than the rotate and
vector equality check that we use for 32-bit stuff).
2011-08-14 07:39:05 +01:00
Matt Pharr
ff608eef71 Change reduce_equal to return false if no instances are executing 2011-08-14 07:11:45 +01:00
Matt Pharr
f868a63064 Add support for scan operations across program instances (add, and, or). 2011-08-13 20:11:41 +01:00
Matt Pharr
c74116aa24 Fix crasher with malformed program 2011-08-12 07:47:17 +01:00
Matt Pharr
8c534d4d74 Add reduce_equal() function to standard library. 2011-08-10 15:55:55 -07:00
Matt Pharr
d821a11c7c Fix min/max for integer types with AVX. 2011-08-04 06:24:20 -07:00
Matt Pharr
8a138eeb5a vim syntax highlighting for ispc from <andreas.wendleder@googlemail.com> 2011-08-04 05:49:28 -07:00
Matt Pharr
137ea7bde6 Rename semaphore filename to be more generic 2011-08-04 05:28:00 -07:00
Matt Pharr
e05b3981d9 Add stencil example 2011-08-03 13:49:02 -07:00
Matt Pharr
a5a133ccce Do more iterations of RNG test to let result converge to bounds. 2011-08-03 13:44:49 -07:00
Matt Pharr
0ac4f7b620 Add various prefetch functions to the standard library. 2011-08-03 13:31:45 -07:00
Matt Pharr
467f1e71d7 Add fast versions of the float<-->half conversion routines in the stdlib.
These get slightly wrong results for zero and the denorms and also
don't handle the Inf/NaN stuff correctly, but are much more efficient
than the full versions of these routines.
2011-08-03 15:58:42 +01:00
Matt Pharr
a2996ed5d9 More efficient implementation of frandom() in stdlib 2011-08-03 14:28:06 +01:00
Matt Pharr
7d7dd2b204 Merge branch 'master' of github.com:ispc/ispc 2011-08-01 12:16:33 +01:00
Matt Pharr
172794ba5f Release notes and doxygen update for 1.0.5 release v1.0.5 2011-08-01 12:15:42 +01:00
Matt Pharr
9ee6f86c73 Fix Windows build of ispc_test 2011-08-01 04:05:37 -07:00
Matt Pharr
a4bb6b5520 Add new example with implementation of Perlin Noise
~4.2x speedup versus serial on OSX / gcc.
~2.9x speedup versus serial on Windows / MSVC.
2011-08-01 10:33:18 +01:00
Matt Pharr
a552927a6a Cleanup implementation of target builtins code.
- Renamed stdlib-sse.ll to builtins-sse.ll (etc.) in an attempt to better indicate
the fact that the stuff in those files has a role beyond implementing stuff for
the standard library.
- Moved declarations of the various __pseudo_* functions from being done with LLVM
API calls in builtins.cpp to just straight up declarations in LLVM assembly
language in builtins.m4.  (Much less code to do it this way, and more clear what's
going on.)
2011-08-01 05:58:43 +01:00
Matt Pharr
2d52c732f1 Doc updates for recent new swizzle support 2011-07-31 19:03:55 +02:00
Matt Pharr
25676d5643 When --debug is specified, only print the entire module bitcode twice.
Fixes issue #77.  Previously, it dumped out the entire module every time
a new function was defined, which got to be quite a lot of output by
the time the stdlib functions were all added!
2011-07-29 07:26:37 +02:00
Matt Pharr
158bd6ef9e Fix bug with initializer expression lists for globlal/static array-typed variables. 2011-07-28 11:38:56 +01:00
Matt Pharr
7f662de6e3 Emit debug declaration of variables before the instructions for their initializers. 2011-07-28 11:05:02 +01:00
Matt Pharr
80ca02af58 Add missing #include, fix Linux build. Fixes issue #75. 2011-07-27 10:51:13 +01:00
Matt Pharr
8aea4a836d Fix crash when trying to generate debug info with program source from stdin 2011-07-27 07:42:47 +01:00
Matt Pharr
922dbdec06 Fixes to build with LLVM top-of-tree 2011-07-26 10:57:49 +01:00
Matt Pharr
e230d2c9ca Make the target argument work in the run_tests.sh script 2011-07-26 10:57:39 +01:00
Matt Pharr
d0674b1706 When doing << or >> operators, don't convert the return type to the type of the shift amount.
Fixes issue #73.  Previously, if we had e.g. an int16 type that was being shifted
left by 1, then the constant integer 1 would come in as an int32, we'd convert
the int16 to an int32, and then we'd do the shift.  Now, for shifts, the type
of the expression is always the same as the type of the value being shifted.
2011-07-25 23:36:05 +01:00
Matt Pharr
16be1d313e AVX updates / improvements.
Add optimization patterns to detect and simplify masked loads and stores
  with the mask all on / all off.
Enable AVX for LLVM 3.0 builds (still generally hits bugs / unimplemented
  stuff on the LLVM side, but it's getting there).
2011-07-25 07:41:37 +01:00
Matt Pharr
0932dcd98b Fix build with llvm top-of-tree 2011-07-23 08:35:45 +01:00
Matt Pharr
43a619669f Fix memory bug where we were accessing memory that had been freed.
(The string for which c_str() was called was just a temporary, so its destructor ran after funcName was initialized, leading funcName to point at freed memory.)
2011-07-22 13:15:50 +01:00
Pete Couperus
59036cdf5b Add support for multi-element vector swizzles. Issue #17.
This commit adds support for swizzles like "foo.zy" (if "foo" is,
for example, a float<3> type) as rvalues.  (Still need support for
swizzles as lvalues.)
2011-07-22 13:10:14 +01:00
Matt Pharr
98a2d69e72 Add code to check signatures of LLVM intrinsic declarations in stdlib*.ll files.
Fix a case where we were using the wrong signature for stmxcsr and ldmxcsr.
2011-07-22 12:53:17 +01:00
Matt Pharr
da0fd93315 AVX fixes: add missing 8/16-bit gathers and scatters, set features string appropriately when AVX is enabled. 2011-07-22 12:36:44 +01:00
Matt Pharr
165f90357f Tiny cleanups, doc update re int8/16 performance 2011-07-21 16:04:16 +01:00
Matt Pharr
8ef3df57c5 Add support for in-memory half float data. Fixes issue #10 2011-07-21 15:55:45 +01:00
Matt Pharr
96d40327d0 Fix issue #72: 64 gathers/scatters led to undefined symbols 2011-07-21 14:44:55 +01:00
Matt Pharr
bba7211654 Add support for int8/int16 types. Addresses issues #9 and #42. 2011-07-21 06:57:40 +01:00