Add support for broadcast(), rotate(), and shuffle() stdlib routines

Matt Pharr
2011-06-27 17:31:44 -07:00
parent 36063bae79
commit 2709c354d7
15 changed files with 329 additions and 55 deletions

@@ -74,7 +74,8 @@ Contents:
+ `Math Functions`_
+ `Output Functions`_
+ `Cross-Lane Operations`_
+ `Cross-Program Instance Operations`_
+ `Packed Load and Store Operations`_
+ `Low-Level Bits`_
* `Interoperability with the Application`_
@@ -1659,14 +1660,14 @@ values for the inactive program instances aren't printed. (In other cases,
they may have garbage values or be otherwise undefined.)
Cross-Lane Operations
---------------------
Cross-Program Instance Operations
---------------------------------
Usually, ``ispc`` code expresses independent computation on separate data
elements. There are, however, a number of cases where it's useful for the
program instances to be able to cooperate in computing results. The
cross-lane operations described in this section provide primitives for
communication between the running program instances.
Usually, ``ispc`` code expresses independent programs performing
computation on separate data elements. There are, however, a number of
cases where it's useful for the program instances to be able to cooperate
in computing results. The cross-program instance operations described in
this section provide primitives for communication between the running
program instances.
A few routines evaluate conditions across the running program
instances. For example, ``any()`` returns ``true`` if the given value
@@ -1678,6 +1679,47 @@ and ``all()`` returns ``true`` if it true for all of them.
uniform bool any(bool v)
uniform bool all(bool v)
To broadcast a value from one program instance to all of the others, a
``broadcast()`` function is available. In every running program instance,
it returns the value that the ``value`` parameter has in the program
instance given by ``index``.
::
float broadcast(float value, uniform int index)
int32 broadcast(int32 value, uniform int index)
double broadcast(double value, uniform int index)
int64 broadcast(int64 value, uniform int index)
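The semantics described above can be sketched with a scalar C model. This is
an illustration only, not the ``ispc`` implementation: ``WIDTH`` (standing in
for ``programCount``) and ``model_broadcast()`` are hypothetical names, each
array element plays the role of one program instance, and the range check on
``index`` is an assumption of the model rather than documented behavior.

```c
#include <assert.h>

#define WIDTH 8  /* hypothetical gang size; stands in for programCount */

/* Scalar model of broadcast(): every lane of `out` receives the value
   that lane `index` holds in `value`. */
void model_broadcast(const float value[WIDTH], int index, float out[WIDTH]) {
    /* Assumption of this model: index must name a valid lane. */
    assert(index >= 0 && index < WIDTH);

    /* Read the source lane first so that `out` may alias `value`. */
    const float picked = value[index];
    for (int lane = 0; lane < WIDTH; ++lane)
        out[lane] = picked;
}
```

With ``value`` holding 1 through 8 and ``index`` equal to 2, every element of
``out`` ends up holding 3.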
The ``rotate()`` function allows each program instance to read the value of
``value`` held by the program instance ``offset`` steps away. For example,
on an 8-wide target, if ``value`` has the values (1, 2, 3, 4, 5, 6, 7, 8)
across the running program instances, then ``rotate(value, -1)`` causes the
first program instance to get the value 8, the second program instance to
get the value 1, the third 2, and so forth. The provided offset value can
be positive or negative, and its magnitude may be greater than
``programCount`` (it is masked to ensure valid offsets).
::
float rotate(float value, uniform int offset)
int32 rotate(int32 value, uniform int offset)
double rotate(double value, uniform int offset)
int64 rotate(int64 value, uniform int offset)
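The 8-wide example in the text (lanes holding 1 through 8, rotated by -1) can
be reproduced with a scalar C model. This is a sketch under stated
assumptions, not the ``ispc`` implementation: ``WIDTH`` and
``model_rotate()`` are hypothetical names, and the masking shown here assumes
that the gang size is a power of two.

```c
#include <assert.h>

#define WIDTH 8  /* hypothetical gang size; stands in for programCount */

/* Scalar model of rotate(): lane i reads the value held by the lane
   `offset` steps away, wrapping around the ends of the gang. */
void model_rotate(const int value[WIDTH], int offset, int out[WIDTH]) {
    /* Assumption of this model: masking only works for power-of-two widths. */
    assert((WIDTH & (WIDTH - 1)) == 0);

    int tmp[WIDTH];  /* temporary so that `out` may alias `value` */
    for (int lane = 0; lane < WIDTH; ++lane) {
        /* Unsigned arithmetic keeps the wrap well defined for negative
           offsets and for offsets larger than WIDTH. */
        unsigned src = ((unsigned)lane + (unsigned)offset) & (WIDTH - 1u);
        tmp[lane] = value[src];
    }
    for (int lane = 0; lane < WIDTH; ++lane)
        out[lane] = tmp[lane];
}
```

Under this model an offset of -1 yields (8, 1, 2, 3, 4, 5, 6, 7), matching
the example, and an offset of 9 behaves the same as an offset of 1.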
Finally, ``shuffle()`` allows fully general shuffling of values among the
program instances. Each program instance's value of ``permutation`` gives
the program instance from which to get the value of ``value``. The provided
values for ``permutation`` must all be between 0 and ``programCount-1``,
inclusive.
::
float shuffle(float value, int permutation)
int32 shuffle(int32 value, int permutation)
double shuffle(double value, int permutation)
int64 shuffle(int64 value, int permutation)
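As a rough illustration of the gather-style behavior described above, the
following scalar C model treats each array element as one program instance.
``WIDTH`` and ``model_shuffle()`` are hypothetical names introduced for this
sketch. The assertion encodes the documented requirement on ``permutation``,
but what ``ispc`` itself does with out-of-range values is not specified here.

```c
#include <assert.h>

#define WIDTH 8  /* hypothetical gang size; stands in for programCount */

/* Scalar model of shuffle(): lane i receives the value held by lane
   permutation[i]. */
void model_shuffle(const int value[WIDTH], const int permutation[WIDTH],
                   int out[WIDTH]) {
    int tmp[WIDTH];  /* temporary so that `out` may alias `value` */
    for (int lane = 0; lane < WIDTH; ++lane) {
        /* Documented precondition: 0 <= permutation < programCount. */
        assert(permutation[lane] >= 0 && permutation[lane] < WIDTH);
        tmp[lane] = value[permutation[lane]];
    }
    for (int lane = 0; lane < WIDTH; ++lane)
        out[lane] = tmp[lane];
}
```

Passing the permutation (7, 6, 5, 4, 3, 2, 1, 0) reverses the values across
the lanes. Several lanes may also name the same source lane, in which case
they all receive the same value.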
The various variants of ``popcnt()`` return the population count--the
number of bits set in the given value.
@@ -1719,8 +1761,12 @@ given value across all of the currently-executing vector lanes.
uniform unsigned int reduce_max(unsigned int a, unsigned int b)
Finally, there are routines for writing out and reading in values from
linear memory locations for the active program instances.
Packed Load and Store Operations
--------------------------------
The standard library also offers routines for writing out and reading in
values from linear memory locations for the active program instances.
``packed_load_active()`` loads consecutive values from the given array,
starting at ``a[offset]``, loading one value for each currently-executing
program instance and storing it into that program instance's ``val``
@@ -2280,21 +2326,11 @@ elements to work with and then proceeds with the computation.
Communicating Between SPMD Program Instances
--------------------------------------------
The ``programIndex`` built-in variable (see `Mapping Data To Program
Instances`_) can be used to communicate between the set of executing
program instances. Consider the following code, which shows all of the
program instances writing into unique locations in an array.
::
float x = ...;
uniform float allX[programCount];
allX[programIndex] = x;
In this code, a program instance that reads ``allX[0]`` finds the value of
``x`` that was computed by the first of the running program instances, and
so forth. Program instances can communicate with their neighbor instances
with indexing like ``allX[(programIndex+1)%programCount]``.
The ``broadcast()``, ``rotate()``, and ``shuffle()`` standard library
routines provide a variety of mechanisms for the running program instances
to communicate values to each other during execution. See the section
`Cross-Program Instance Operations`_ for more information about their
operation.
Gather and Scatter