added description for multi-dimensional tasking
This commit is contained in:
@@ -3012,8 +3012,8 @@ Intel® Cilk(tm), Intel® Thread Building Blocks or another task system), and
|
||||
for tasks to use ``ispc`` for SPMD parallelism across the vector lanes as
|
||||
appropriate. Alternatively, ``ispc`` also has support for launching tasks
|
||||
from ``ispc`` code. The approach is similar to Intel® Cilk's task launch
|
||||
feature. (See the ``examples/mandelbrot_tasks`` example to see it used in
|
||||
a small example.)
|
||||
feature. (See the ``examples/mandelbrot_tasks`` and
|
||||
``examples/mandelbrot_tasks3d`` examples to see it used in a small example.)
|
||||
|
||||
Any function that is launched as a task must be declared with the
|
||||
``task`` qualifier:
|
||||
@@ -3108,6 +3108,38 @@ executing the current task. The ``threadIndex`` can be used for accessing
|
||||
data that is private to the current thread and thus doesn't require
|
||||
synchronization to access under parallel execution.
|
||||
|
||||
The tasking system also supports multi-dimensional partitioning (currently up
|
||||
to three dimensions). To launch a 3D grid of tasks, for example with ``N0``,
|
||||
``N1`` and ``N2`` tasks in x-, y- and z-dimension respectively
|
||||
|
||||
::
|
||||
|
||||
float data[N2][N1][N0]
|
||||
task void foo_task()
|
||||
{
|
||||
data[taskIndex2][taskIndex1][threadIndex0] = taskIndex;
|
||||
}
|
||||
|
||||
we use the following ``launch`` expressions:
|
||||
|
||||
::
|
||||
|
||||
launch [N2][N1][N0] foo_task()
|
||||
|
||||
or
|
||||
|
||||
::
|
||||
|
||||
launch [N0,N1,N2] foo_task()
|
||||
|
||||
Value of ``taskIndex`` is equal to ``taskIndex0 + taskCount0*(taskIndex1 +
|
||||
taskCount1*taskIndex2)`` and it ranges from ``0`` to ``taskCount-1``, where
|
||||
``taskCount = taskCount0*taskCount1*taskCount2``. If ``N1`` or/and ``N2`` are
|
||||
not specified in the ``launch`` expression, a value of ``1`` is assumed.
|
||||
Finally, for an one-dimensional grid of tasks, ``taskIndex`` is equivalent to
|
||||
``taskIndex0`` and ``taskCount`` is equivalent to ``taskCount0``.
|
||||
|
||||
|
||||
Task Parallelism: Runtime Requirements
|
||||
--------------------------------------
|
||||
|
||||
@@ -3138,7 +3170,7 @@ manage tasks in ``ispc``:
|
||||
::
|
||||
|
||||
void *ISPCAlloc(void **handlePtr, int64_t size, int32_t alignment);
|
||||
void ISPCLaunch(void **handlePtr, void *f, void *data, int count);
|
||||
void ISPCLaunch(void **handlePtr, void *f, void *data, int count0, int count1, int count2);
|
||||
void ISPCSync(void *handle);
|
||||
|
||||
All three of these functions take an opaque handle (or a pointer to an
|
||||
@@ -3175,16 +3207,20 @@ tasks. Each ``launch`` statement in ``ispc`` code causes a call to
|
||||
after the handle pointer to the function are relatively straightforward;
|
||||
the ``void *f`` parameter holds a pointer to a function to call to run the
|
||||
work for this task, ``data`` holds a pointer to data to pass to this
|
||||
function, and ``count`` is the number of instances of this function to
|
||||
enqueue for asynchronous execution. (In other words, ``count`` corresponds
|
||||
to the value ``n`` in a multiple-task launch statement like ``launch[n]``.)
|
||||
function, and ``count0``, ``count1`` and ``count2`` are the number of instances
|
||||
of this function to enqueue for asynchronous execution. (In other words,
|
||||
``count0``, ``count1`` and ``count2`` correspond to the value ``n0``, ``n1``
|
||||
and ``n2`` in a multiple-task launch statement like ``launch[n2][n1][n0]`` or
|
||||
``launch [n0,n1,n2]`` respectively.)
|
||||
|
||||
The signature of the provided function pointer ``f`` is
|
||||
|
||||
::
|
||||
|
||||
void (*TaskFuncPtr)(void *data, int threadIndex, int threadCount,
|
||||
int taskIndex, int taskCount)
|
||||
int taskIndex, int taskCount,
|
||||
int taskIndex0, int taskIndex1, int taskIndex2,
|
||||
int taskCount0, int taskCount1, int taskCount2);
|
||||
|
||||
When this function pointer is called by one of the hardware threads managed
|
||||
by the task system, the ``data`` pointer passed to ``ISPCLaunch()`` should
|
||||
@@ -3194,11 +3230,14 @@ number of hardware threads that have been spawned to run tasks and
|
||||
uniquely identifying the hardware thread that is running the task. (These
|
||||
values can be used to index into thread-local storage.)
|
||||
|
||||
The value of ``taskCount`` should be the number of tasks launched in the
|
||||
``launch`` statement that caused the call to ``ISPCLaunch()`` and each of
|
||||
the calls to this function should be given a unique value of ``taskIndex``
|
||||
between zero and ``taskCount``, to distinguish which of the instances
|
||||
of the set of launched tasks is running.
|
||||
The value of ``taskCount`` should be the total number of tasks launched in the
|
||||
``launch`` statement (it must be equal to ``taskCount0*taskCount1*taskCount2``)
|
||||
that caused the call to ``ISPCLaunch()`` and each of the calls to this function
|
||||
should be given a unique value of ``taskIndex``, ``taskIndex0``, ``taskIndex1``
|
||||
and ``taskIndex2`` between zero and ``taskCount``, ``taskCount0``,
|
||||
``taskCount1`` and ``taskCount2`` respectively, with ``taskIndex = taskIndex0
|
||||
+ taskCount0*(taskIndex1 + taskCount1*taskIndex2)``, to distinguish which of
|
||||
the instances of the set of launched tasks is running.
|
||||
|
||||
|
||||
|
||||
|
||||
Reference in New Issue
Block a user