Matt Pharr e144724979 Improve performance of global atomics, taking advantage of associativity.
For associative atomic ops (add, and, or, xor), we can take advantage of
their associativity to do just a single hardware atomic instruction, 
rather than one for each of the running program instances (as the previous
implementation did.)

The basic approach is to locally compute a reduction across the active
program instances with the given op and to then issue a single HW atomic
with that reduced value as the operand.  The HW atomic returns the value
previously stored at the location; from it we compute the value to return
to each of the program instances (conceptually, the value each one would
have seen had all of the preceding program instances already performed
their atomic operations).
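
The approach described above can be sketched in plain C++ for the add case.
This is only an illustration under stated assumptions, not the code from the
commit: the gang width `kWidth`, the function name `gang_atomic_add`, and the
use of `std::atomic` in place of the compiler's generated HW atomic are all
hypothetical, and the lanes are processed in a scalar loop rather than with
SIMD instructions.

```cpp
#include <array>
#include <atomic>
#include <cstdint>

// Hypothetical gang size, e.g. one program instance per lane of 4-wide SSE.
constexpr int kWidth = 4;

// Sketch of a gang-wide atomic add that issues a single hardware atomic.
// values[i] is the operand of program instance i; active[i] tells whether
// that instance is currently running. The result for each active instance is
// the value it would have observed if the active instances had performed
// their adds one at a time in lane order. Inactive lanes get 0 here
// (an arbitrary choice for this sketch).
std::array<int32_t, kWidth> gang_atomic_add(
    std::atomic<int32_t>& target,
    const std::array<int32_t, kWidth>& values,
    const std::array<bool, kWidth>& active) {
    // Local reduction: prefix[i] is the sum of the operands of the active
    // instances before lane i; total is the sum over all active instances.
    std::array<int32_t, kWidth> prefix{};
    int32_t total = 0;
    for (int i = 0; i < kWidth; ++i) {
        prefix[i] = total;
        if (active[i]) total += values[i];
    }

    // The single atomic operation, using the reduced value as the operand.
    const int32_t old = target.fetch_add(total);

    // Reconstruct the per-instance return values from the old value.
    std::array<int32_t, kWidth> result{};
    for (int i = 0; i < kWidth; ++i) {
        result[i] = active[i] ? old + prefix[i] : 0;
    }
    return result;
}
```

The same structure would apply to and, or, and xor by swapping the reduction
operator and the matching `std::atomic` member (`fetch_and`, `fetch_or`,
`fetch_xor`), with the prefix starting from that operator's identity value.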

Issue #56.
2011-08-31 05:35:01 -07:00

==============================
Intel(r) SPMD Program Compiler
==============================

Welcome to the Intel(r) SPMD Program Compiler (ispc)!  

ispc is a new compiler for "single program, multiple data" (SPMD)
programs. Under the SPMD model, the programmer writes a program that mostly
appears to be a regular serial program, though the execution model is
actually that a number of program instances execute in parallel on the
hardware. ispc compiles a C-based SPMD programming language to run on the
SIMD units of CPUs; it frequently provides a 3x or more speedup on CPUs
with 4-wide SSE units, without any of the difficulty of writing intrinsics
code.

ispc is an open source compiler under the BSD license; see the file
LICENSE.txt.  ispc supports Windows, Mac, and Linux, with both x86 and
x86-64 targets. It currently supports the SSE2 and SSE4 instruction sets,
though support for AVX should be available soon.

For more information and examples, as well as a wiki and the bug database,
see the ispc distribution site, http://ispc.github.com.