diff --git a/docs/ReleaseNotes.txt b/docs/ReleaseNotes.txt
index 50b7b7b4..e9f3df09 100644
--- a/docs/ReleaseNotes.txt
+++ b/docs/ReleaseNotes.txt
@@ -1,3 +1,34 @@
+=== v1.0.6 === (17 August 2011)
+
+Some additional cross-program instance operations have been added to the
+standard library.  reduce_equal() checks to see if the given value is the
+same across all running program instances, and exclusive_scan_{add,and,or}()
+computes a scan over the given value in the running program instances.
+See the documentation of these new routines for more information:
+http://ispc.github.com/ispc.html#cross-program-instance-operations.
+
+The simple task system implementations used in the examples have been
+improved.  The Windows version no longer has a hard limit on the number of
+tasks that can be launched, and all versions have less dynamic memory
+allocation and less locking.  More of the examples now have paths that also
+measure performance using tasks along with SPMD vectorization.
+
+Two new examples have been added: one that shows the implementation of a
+ray-marching volume rendering algorithm, and one that shows a 3D stencil
+computation, as might be done for PDE solutions.
+
+Standard library routines to issue prefetches have been added.  See the
+documentation for more details: http://ispc.github.com/ispc.html#prefetches.
+
+Fast versions of the float to half-precision float conversion routines have
+been added.  For more details, see:
+http://ispc.github.com/ispc.html#conversions-to-and-from-half-precision-floats.
+
+There is the usual set of small bug fixes.  Notably, a number of details
+related to handling 32 versus 64 bit targets have been fixed, which in turn
+has fixed a bug related to tasks having incorrect values for pointers
+passed to them.
+
 === v1.0.5 === (1 August 2011)
 
 Multi-element vector swizzles are supported; for example, given a 3-wide
diff --git a/docs/ispc.txt b/docs/ispc.txt
index 543e6c99..64344405 100644
--- a/docs/ispc.txt
+++ b/docs/ispc.txt
@@ -1988,6 +1988,18 @@ function returns the 16 bits that are the closest match to the given
     int16 float_to_half(float f)
     uniform int16 float_to_half(uniform float f)
 
+There are also faster versions of these functions that don't worry about
+handling floating point infinity, "not a number" and denormalized numbers
+correctly.  These are faster than the above functions, but are less
+precise.
+
+::
+
+    float half_to_float_fast(unsigned int16 h)
+    uniform float half_to_float_fast(uniform unsigned int16 h)
+    int16 float_to_half_fast(float f)
+    uniform int16 float_to_half_fast(uniform float f)
+
 Atomic Operations and Memory Fences
 -----------------------------------
 
diff --git a/doxygen.cfg b/doxygen.cfg
index 093143c9..0048a5e0 100644
--- a/doxygen.cfg
+++ b/doxygen.cfg
@@ -31,7 +31,7 @@ PROJECT_NAME           = "Intel SPMD Program Compiler"
 # This could be handy for archiving the generated documentation or
 # if some version control system is used.
 
-PROJECT_NUMBER         = 1.0.5
+PROJECT_NUMBER         = 1.0.6
 
 # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute)
 # base path where the generated documentation will be put.
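
Note on the new ispc.txt text: the trade-off it describes (faster conversions that skip infinity, NaN and denormal handling) can be illustrated with a small scalar sketch in plain C. This is not ispc's implementation; the function names `half_to_float_fast_sketch` and `float_to_half_fast_sketch` are hypothetical, and the truncating rounding and flush-to-zero behaviour are assumptions chosen to keep the example short.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative only: rebias the exponent and shift the mantissa, with no
 * special cases for infinity, NaN, or denormalized half values. */
float half_to_float_fast_sketch(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;  /* move sign to bit 31 */
    uint32_t exp_mant = (uint32_t)(h & 0x7fffu);    /* 5-bit exp, 10-bit mantissa */
    uint32_t bits = sign;
    if (exp_mant != 0) {
        /* Align to the float layout, then add the bias difference (127 - 15). */
        bits |= (exp_mant << 13) + ((uint32_t)(127 - 15) << 23);
    }
    float f;
    memcpy(&f, &bits, sizeof f);  /* bit-cast without aliasing problems */
    return f;
}

/* Illustrative only: assumes the input is finite and within half range.
 * Values too small for a normalized half are flushed to zero, and the
 * mantissa is truncated rather than rounded. */
uint16_t float_to_half_fast_sketch(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    uint32_t sign = (bits >> 16) & 0x8000u;
    uint32_t mag = bits & 0x7fffffffu;
    if (mag < ((uint32_t)113 << 23)) {
        return (uint16_t)sign;  /* below the smallest normalized half */
    }
    return (uint16_t)(sign | ((mag - ((uint32_t)112 << 23)) >> 13));
}
```

A full conversion adds branches for each of the skipped cases, and that is the cost the `_fast` variants avoid. Inputs outside the stated assumptions, such as a denormalized half or a float larger than 65504, produce meaningless results in this sketch.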