135 lines
6.0 KiB
Plaintext
135 lines
6.0 KiB
Plaintext
=== v1.0.6 === (17 August 2011)
|
|
|
|
Some additional cross-program instance operations have been added to the
|
|
standard library. reduce_equal() checks to see if the given value is the
|
|
same across all running program instances, and exclusive_scan_{and,or,and}()
|
|
computes a scan over the given value in the running program instances.
|
|
See the documentation of these new routines for more information:
|
|
http://ispc.github.com/ispc.html#cross-program-instance-operations.
|
|
|
|
The simple task system implementations used in the examples have been
|
|
improved. The Windows version no nlonger has a hard limit on the number of
|
|
tasks that can be launched, and all versions have less dynamic memory
|
|
allocation and less locking. More of the examples now have paths that also
|
|
measure performance using tasks along with SPMD vectorization.
|
|
|
|
Two new examples have been added: one that shows the implementation of a
|
|
ray-marching volume rendering algorithm, and one that shows a 3D stencil
|
|
computation, as might be done for PDE solutions.
|
|
|
|
Standard library routines to issue prefetches have been added. See the
|
|
documentation for more details: http://ispc.github.com/ispc.html#prefetches.
|
|
|
|
Fast versions of the float to half-precision float conversion routines have
|
|
been added. For more details, see:
|
|
http://ispc.github.com/ispc.html#conversions-to-and-from-half-precision-floats.
|
|
|
|
There is the usual set of small bug fixes. Notably, a number of details
|
|
related to handling 32 versus 64 bit targets have been fixed, which in turn
|
|
has fixed a bug related to tasks having incorrect values for pointers
|
|
passed to them.
|
|
|
|
=== v1.0.5 === (1 August 2011)
|
|
|
|
Multi-element vector swizzles are supported; for example, given a 3-wide
|
|
vector "foo", then expressions like "foo.zyx" and "foo.yz" can be used to
|
|
construct other short vectors. See
|
|
http://ispc.github.com/ispc.html#short-vector-types
|
|
for more details. (Thanks to Pete Couperus for implementing this code!).
|
|
|
|
int8 and int16 datatypes are now supported. It is still generally more
|
|
efficient to use int32 for intermediate computations, even if the in-memory
|
|
format is int8 or int16.
|
|
|
|
There are now standard library routines to convert to and from 'half'-format
|
|
floating-point values (half_to_float() and float_to_half()).
|
|
|
|
There is a new example with an implementation of Perlin's Noise function
|
|
(examples/noise). It shows a speedup of approximately 4.2x versus a C
|
|
implementation on OSX and a 2.9x speedup versus C on Windows.
|
|
|
|
=== v1.0.4 === (18 July 2011)
|
|
|
|
enums are now supported in ispc; see the section on enumeration types in
|
|
the documentation (http://ispc.github.com/ispc.html#enumeration-types) for
|
|
more informaiton.
|
|
|
|
bools are converted to integers with zero extension, not sign extension as
|
|
before (i.e. a 'true' bool converts to the value one, not 'all bits on'.)
|
|
For cases where sign extension is still desired, there is a
|
|
sign_extend(bool) function in the standard library.
|
|
|
|
Support for 64-bit types in the standard library is much more complete than
|
|
before.
|
|
|
|
64-bit integer constants are now supported by the parser.
|
|
|
|
Storage for parameters to tasks is now allocated dynamically on Windows,
|
|
rather than on the stack; with this fix, all tests now run correctly on
|
|
Windows.
|
|
|
|
There is now support for atomic swap and compare/exchange with float and
|
|
double types.
|
|
|
|
A number of additional small bugs have been fixed and a number of cases
|
|
where the compiler would crash given a malformed program have been fixed.
|
|
|
|
=== v1.0.3 === (4 July 2011)
|
|
|
|
ispc now has a bulit-in pre-processor (from LLVM's clang compiler).
|
|
(Thanks to Pete Couperus for this patch!) It is therefore no longer
|
|
necessary to use cl.exe for preprocessing on Windows; the MSVC proejct
|
|
files for the examples have been updated accordingly.
|
|
|
|
There is another variant of the shuffle() function int the standard
|
|
library: "<type> shuffle(<type> v0, <type> v1, int permute)", where the
|
|
permutation vector indexes over the concatenation of the two vectors
|
|
(e.g. the value 0 corresponds to the first element of v0, the value
|
|
2*programCount-1 corresponds to the last element of v1, etc.)
|
|
|
|
ispc now supports the usual range of atomic operations (add, subtract, min,
|
|
max, and, or, and xor) as well as atomic swap and atomic compare and
|
|
exchange. There is also a facility for inserting memory fences. See the
|
|
"Atomic Operations and Memory Fences" section of the user's guide
|
|
(http://ispc.github.com/ispc.html#atomic-operations-and-memory-fences) for
|
|
more information.
|
|
|
|
There are now both 'signed' and 'unsigned' variants of the standard library
|
|
functions like packed_load_active() that take references to arrays of
|
|
signed int32s and unsigned int32s respectively. (The
|
|
{load_from,store_to}_{int8,int16}() functions have similarly been augmented
|
|
to have both 'signed' and 'unsigned' variants.)
|
|
|
|
In initializer expressions with variable declarations, it is no longer
|
|
legal to initialize arrays and structs with single scalar values that then
|
|
initialize their members; they now must be initialized with initializer
|
|
lists in braces (or initialized after of the initializer with a loop over
|
|
array elements, etc.)
|
|
|
|
=== v1.0.2 === (1 July 2011)
|
|
|
|
Floating-point hexidecimal constants are now parsed correctly on Windows
|
|
(fixes issue #16).
|
|
|
|
SSE2 is now the default target if --cpu=atom is given in the command line
|
|
arguments and another target isn't explicitly specified.
|
|
|
|
The standard library now provides broadcast(), rotate(), and shuffle()
|
|
routines for efficient communication between program instances.
|
|
|
|
The MSVC solution files to build the examples on Windows now use
|
|
/fpmath:fast when building.
|
|
|
|
=== v1.0.1 === (24 June 2011)
|
|
|
|
ispc no longer requires that pointers to memory that are passed in to ispc
|
|
have alignment equal to the targets vector width; now alignment just has to
|
|
be the regular element alignment (e.g. 4 bytes for floats, etc.) This
|
|
change also fixed a number of cases where it previously incorrectly
|
|
generated aligned load/store instructions in cases where the address wasn't
|
|
actually aligned (even if the base address passed into ispc code was).
|
|
|
|
=== v1.0 === (21 June 2011)
|
|
|
|
Initial Release
|