Fix numerous typos in documentation (goodness)
@@ -273,10 +273,10 @@ Then four object files will be generated: ``foo_sse2.o``, ``foo_sse4.o``,
|
|||||||
``foo_avx.o``, and ``foo.o``.[#]_ Link all of these into your executable, and
|
``foo_avx.o``, and ``foo.o``.[#]_ Link all of these into your executable, and
|
||||||
when you call a function in ``foo.ispc`` from your application code,
|
when you call a function in ``foo.ispc`` from your application code,
|
||||||
``ispc`` will determine which instruction sets are supported by the CPU the
|
``ispc`` will determine which instruction sets are supported by the CPU the
|
||||||
code is running on and will call the most appropraite version of the
|
code is running on and will call the most appropriate version of the
|
||||||
function available.
|
function available.
|
||||||
|
|
||||||
.. [#] Similarly, if you choose to generate assembly langauage output or
|
.. [#] Similarly, if you choose to generate assembly language output or
|
||||||
LLVM bitcode output, multiple versions of those files will be created.
|
LLVM bitcode output, multiple versions of those files will be created.
|
||||||
|
|
||||||
In general, the version of the function that runs will be the one in the
|
In general, the version of the function that runs will be the one in the
|
||||||
|
|||||||
@@ -26,9 +26,9 @@ The main goals behind ``ispc`` are to:
|
|||||||
units without the extremely low-programmer-productivity activity of directly
|
units without the extremely low-programmer-productivity activity of directly
|
||||||
writing intrinsics.
|
writing intrinsics.
|
||||||
* Explore opportunities from close-coupling between C/C++ application code
|
* Explore opportunities from close-coupling between C/C++ application code
|
||||||
and SPMD ``ispc`` code running on the same processor--lightweight funcion
|
and SPMD ``ispc`` code running on the same processor--lightweight function
|
||||||
calls betwen the two languages, sharing data directly via pointers without
|
calls between the two languages, sharing data directly via pointers without
|
||||||
copying or reformating, etc.
|
copying or reformatting, etc.
|
||||||
|
|
||||||
**We are very interested in your feedback and comments about ispc and
|
**We are very interested in your feedback and comments about ispc and
|
||||||
in hearing your experiences using the system. We are especially interested
|
in hearing your experiences using the system. We are especially interested
|
||||||
@@ -249,7 +249,7 @@ of the value.
|
|||||||
The first thing to notice in this program is the presence of the ``export``
|
The first thing to notice in this program is the presence of the ``export``
|
||||||
keyword in the function definition; this indicates that the function should
|
keyword in the function definition; this indicates that the function should
|
||||||
be made available to be called from application code. The ``uniform``
|
be made available to be called from application code. The ``uniform``
|
||||||
qualifiers on the parameters to ``simple`` indicate that the correpsonding
|
qualifiers on the parameters to ``simple`` indicate that the corresponding
|
||||||
variables are non-vector quantities--this concept is discussed in detail in the
|
variables are non-vector quantities--this concept is discussed in detail in the
|
||||||
`"uniform" and "varying" Qualifiers`_ section.
|
`"uniform" and "varying" Qualifiers`_ section.
|
||||||
|
|
||||||
@@ -321,7 +321,7 @@ When the executable ``simple`` runs, it generates the expected output:
|
|||||||
...
|
...
|
||||||
|
|
||||||
For a slightly more complex example of using ``ispc``, see the `Mandelbrot
|
For a slightly more complex example of using ``ispc``, see the `Mandelbrot
|
||||||
set example`_ page on the ``ispc`` website for a walkthrough of an ``ispc``
|
set example`_ page on the ``ispc`` website for a walk-through of an ``ispc``
|
||||||
implementation of that algorithm. After reading through that example, you
|
implementation of that algorithm. After reading through that example, you
|
||||||
may want to examine the source code of the various examples in the
|
may want to examine the source code of the various examples in the
|
||||||
``examples/`` directory of the ``ispc`` distribution.
|
``examples/`` directory of the ``ispc`` distribution.
|
||||||
@@ -372,7 +372,7 @@ Optimizations are on by default; they can be turned off with ``-O0``:
|
|||||||
On Mac\* and Linux\*, there is basic support for generating debugging
|
On Mac\* and Linux\*, there is basic support for generating debugging
|
||||||
symbols; this is enabled with the ``-g`` command-line flag. Using ``-g``
|
symbols; this is enabled with the ``-g`` command-line flag. Using ``-g``
|
||||||
causes optimizations to be disabled; to compile with debugging symbols and
|
causes optimizations to be disabled; to compile with debugging symbols and
|
||||||
optimizaion, ``-O1`` should be provided as well as the ``-g`` flag.
|
optimization, ``-O1`` should be provided as well as the ``-g`` flag.
|
||||||
|
|
||||||
The ``-h`` flag can also be used to direct ``ispc`` to generate a C/C++
|
The ``-h`` flag can also be used to direct ``ispc`` to generate a C/C++
|
||||||
header file that includes C/C++ declarations of the C-callable ``ispc``
|
header file that includes C/C++ declarations of the C-callable ``ispc``
|
||||||
@@ -610,7 +610,7 @@ side-effects.
|
|||||||
|
|
||||||
Upon entry to an ``ispc`` function called by the application, the execution
|
Upon entry to an ``ispc`` function called by the application, the execution
|
||||||
mask is "all on" and the program counter points at the first statement in
|
mask is "all on" and the program counter points at the first statement in
|
||||||
the function. The following two statments describe the required behavior
|
the function. The following two statements describe the required behavior
|
||||||
of the program counter and the execution mask over the course of execution
|
of the program counter and the execution mask over the course of execution
|
||||||
of an ``ispc`` function.
|
of an ``ispc`` function.
|
||||||
|
|
||||||
@@ -731,7 +731,7 @@ program instances is *maximally converged*. Maximal convergence means that
|
|||||||
if two program instances follow the same control path, they are guaranteed
|
if two program instances follow the same control path, they are guaranteed
|
||||||
to execute each program statement concurrently. If two program instances
|
to execute each program statement concurrently. If two program instances
|
||||||
follow diverging control paths, it is guaranteed that they will reconverge
|
follow diverging control paths, it is guaranteed that they will reconverge
|
||||||
as soon as possible (if they do later reconverge). [#]_
|
as soon as possible in the function (if they do later reconverge). [#]_
|
||||||
|
|
||||||
.. [#] This is another significant difference between the ``ispc``
|
.. [#] This is another significant difference between the ``ispc``
|
||||||
execution model and the one implemented by OpenCL* and CUDA*, which
|
execution model and the one implemented by OpenCL* and CUDA*, which
|
||||||
@@ -819,7 +819,7 @@ of control flow, will say that control flow based on ``varying``
|
|||||||
expressions is "varying" control flow.)
|
expressions is "varying" control flow.)
|
||||||
|
|
||||||
Consider for example an image filtering operation where the program loops
|
Consider for example an image filtering operation where the program loops
|
||||||
over pixels adjacent to the given (x,y) coordiantes:
|
over pixels adjacent to the given (x,y) coordinates:
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
@@ -919,7 +919,7 @@ for all program instances in the gang, it's possible that the "true" clause
|
|||||||
executed with an "all off" mask and ``b`` was modified there.
|
executed with an "all off" mask and ``b`` was modified there.
|
||||||
|
|
||||||
If it is important that code never be executed with an "all off" execution
|
If it is important that code never be executed with an "all off" execution
|
||||||
mask, then the ``cif`` statment (documented in the `"Coherent" Control Flow
|
mask, then the ``cif`` statement (documented in the `"Coherent" Control Flow
|
||||||
Statements: "cif" and Friends`_ section) can be used in place of a regular
|
Statements: "cif" and Friends`_ section) can be used in place of a regular
|
||||||
``if``, as it guarantees this property.
|
``if``, as it guarantees this property.
|
||||||
|
|
||||||
@@ -1045,7 +1045,7 @@ completed.
|
|||||||
The ISPC Language
|
The ISPC Language
|
||||||
=================
|
=================
|
||||||
|
|
||||||
``ispc`` is an extended verion of the C programming language, providing a
|
``ispc`` is an extended version of the C programming language, providing a
|
||||||
number of new features that make it easy to write high-performance SPMD
|
number of new features that make it easy to write high-performance SPMD
|
||||||
programs for the CPU. Note that between not only the few small syntactic
|
programs for the CPU. Note that between not only the few small syntactic
|
||||||
differences between ``ispc`` and C code but more importantly ``ispc``'s
|
differences between ``ispc`` and C code but more importantly ``ispc``'s
|
||||||
@@ -1066,12 +1066,12 @@ This subsection summarizes the differences between ``ispc`` and C; if you
|
|||||||
are already familiar with C, you may find it most effective to focus on
|
are already familiar with C, you may find it most effective to focus on
|
||||||
this subsection and just focus on the topics in the remainder of section
|
this subsection and just focus on the topics in the remainder of section
|
||||||
that introduce new language features. You may also find it helpful to
|
that introduce new language features. You may also find it helpful to
|
||||||
comapre the ``ispc`` and C++ implementations of various algorithms in the
|
compare the ``ispc`` and C++ implementations of various algorithms in the
|
||||||
``ispc`` ``examples/`` directory to get a sense of the close relationship
|
``ispc`` ``examples/`` directory to get a sense of the close relationship
|
||||||
between ``ispc`` and C.
|
between ``ispc`` and C.
|
||||||
|
|
||||||
Specifically, C89 is used as the baseline for comparison in this subsection
|
Specifically, C89 is used as the baseline for comparison in this subsection
|
||||||
(this is also the verion of C described in the Second Edition of Kernighan
|
(this is also the version of C described in the Second Edition of Kernighan
|
||||||
and Ritchie's book). (``ispc`` adopts some features from C99 and from C++,
|
and Ritchie's book). (``ispc`` adopts some features from C99 and from C++,
|
||||||
which will be highlighted below.)
|
which will be highlighted below.)
|
||||||
|
|
||||||
@@ -1099,7 +1099,7 @@ in C:
|
|||||||
statement itself (e.g. ``for (int i = 0; ...``)
|
statement itself (e.g. ``for (int i = 0; ...``)
|
||||||
* The ``inline`` qualifier to indicate that a function should be inlined
|
* The ``inline`` qualifier to indicate that a function should be inlined
|
||||||
* Function overloading by parameter type
|
* Function overloading by parameter type
|
||||||
* Hexidecimal floating-point constants
|
* Hexadecimal floating-point constants
|
||||||
|
|
||||||
``ispc`` also adds a number of new features that aren't in C89, C99, or
|
``ispc`` also adds a number of new features that aren't in C89, C99, or
|
||||||
C++:
|
C++:
|
||||||
@@ -1158,11 +1158,11 @@ The following reserved words from C89 are also reserved in ``ispc``:
|
|||||||
Lexical Structure
|
Lexical Structure
|
||||||
-----------------
|
-----------------
|
||||||
|
|
||||||
Tokens in ``ispc`` are delimted by white-space and comments. The
|
Tokens in ``ispc`` are delimited by white-space and comments. The
|
||||||
white-space characters are the usual set of spaces, tabs, and carriage
|
white-space characters are the usual set of spaces, tabs, and carriage
|
||||||
returns/line feeds. Comments can be delinated with ``//``, which starts a
|
returns/line feeds. Comments can be delineated with ``//``, which starts a
|
||||||
comment that continues to the end of the line, or the start of a comment
|
comment that continues to the end of the line, or the start of a comment
|
||||||
can be delinated with ``/*`` and the end with ``*/``. Like C/C++,
|
can be delineated with ``/*`` and the end with ``*/``. Like C/C++,
|
||||||
comments can't be nested.
|
comments can't be nested.
|
||||||
|
|
||||||
Identifiers in ``ispc`` are sequences of characters that start with an
|
Identifiers in ``ispc`` are sequences of characters that start with an
|
||||||
@@ -1170,9 +1170,9 @@ underscore or an upper-case or lower-case letter, and then followed by
|
|||||||
zero or more letters, numbers, or underscores. Identifiers that start with
|
zero or more letters, numbers, or underscores. Identifiers that start with
|
||||||
two underscores are reserved for use by the compiler.
|
two underscores are reserved for use by the compiler.
|
||||||
|
|
||||||
Integer numeric constants can be specified in base 10, hexidecimal, or
|
Integer numeric constants can be specified in base 10, hexadecimal, or
|
||||||
binary. (Octal integer constants aren't supported). Base 10 constants are
|
binary. (Octal integer constants aren't supported). Base 10 constants are
|
||||||
given by a sequence of one or more digits from 0 to 9. Hexidecimal
|
given by a sequence of one or more digits from 0 to 9. Hexadecimal
|
||||||
constants are denoted by a leading ``0x`` and then one or more digits from
|
constants are denoted by a leading ``0x`` and then one or more digits from
|
||||||
0-9, a-f, or A-F. Finally, binary constants are denoted by a leading
|
0-9, a-f, or A-F. Finally, binary constants are denoted by a leading
|
||||||
``0b`` and then a sequence of 1s and 0s.
|
``0b`` and then a sequence of 1s and 0s.
|
||||||
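The integer-constant rules in the hunk above can be sketched as a small C routine. This is only an illustration: `parse_int_constant` is a hypothetical helper, not part of ``ispc``, and it makes assumptions the excerpt does not cover (only lowercase ``0x`` and ``0b`` prefixes are accepted, a run of digits with a leading zero is read as base 10, suffixes are not handled, and overflow is not checked).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an ispc API): parses an integer constant
   written in base 10, hexadecimal ("0x" prefix), or binary ("0b" prefix).
   Returns false if the text matches none of the three forms.
   Overflow is not checked in this sketch. */
static bool parse_int_constant(const char *s, uint64_t *out) {
    uint64_t value = 0;
    int base = 10;

    if (s[0] == '0' && s[1] == 'x') { base = 16; s += 2; }
    else if (s[0] == '0' && s[1] == 'b') { base = 2; s += 2; }

    if (*s == '\0')
        return false;                 /* at least one digit is required */

    for (; *s != '\0'; ++s) {
        unsigned char c = (unsigned char)*s;
        int digit;
        if (isdigit(c))
            digit = c - '0';
        else if (isxdigit(c))
            digit = tolower(c) - 'a' + 10;
        else
            return false;             /* not a digit in any supported base */
        if (digit >= base)
            return false;             /* e.g. '2' in binary, 'a' in base 10 */
        value = value * (uint64_t)base + (uint64_t)digit;
    }
    *out = value;
    return true;
}
```

Under these assumptions, `"42"`, `"0x1F"`, and `"0b101"` are all accepted, while `"0b102"` and a bare `"0x"` are rejected.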
@@ -1194,11 +1194,11 @@ The second option is scientific notation, where a base value is specified
|
|||||||
as the first form of a floating-point constant but is then followed by an
|
as the first form of a floating-point constant but is then followed by an
|
||||||
"e" or "E", then a plus sign or a minus sign, and then an exponent.
|
"e" or "E", then a plus sign or a minus sign, and then an exponent.
|
||||||
|
|
||||||
Finally, floating-point constants may be specified as hexidecimal
|
Finally, floating-point constants may be specified as hexadecimal
|
||||||
constants; this form can ensure a perfectly bit-accurate representation of
|
constants; this form can ensure a perfectly bit-accurate representation of
|
||||||
a particular floating-point number. These are specified with an "0x"
|
a particular floating-point number. These are specified with an "0x"
|
||||||
prefix, followed by a zero or a one, a period, and then the remainder of
|
prefix, followed by a zero or a one, a period, and then the remainder of
|
||||||
the mantissa in hexidecimal form, with digits from 0-9, a-f, or A-F. The
|
the mantissa in hexadecimal form, with digits from 0-9, a-f, or A-F. The
|
||||||
start of the exponent is denoted by a "p", which is then followed by an
|
start of the exponent is denoted by a "p", which is then followed by an
|
||||||
optional plus or minus sign and then digits from 0 to 9. For example:
|
optional plus or minus sign and then digits from 0 to 9. For example:
|
||||||
|
|
||||||
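The hunk ends before the document's own example. As an independent sketch, C99 accepts hexadecimal floating-point literals with the same shape (``0x``, mantissa in hex, ``p``, exponent in decimal); the values below assume that ``ispc`` reads such a literal the same way C99 does, which this excerpt does not confirm.

```c
/* C99 hexadecimal floating-point literals: the mantissa is written in
   hexadecimal and the exponent after 'p' is a power of two in decimal.
   Every value here is exactly representable, so no rounding occurs. */
static const double three  = 0x1.8p1;   /* 1.5    * 2^1  = 3.0   */
static const double eighth = 0x1.0p-3;  /* 1.0    * 2^-3 = 0.125 */
static const double big    = 0x1.fp+4;  /* 1.9375 * 2^4  = 31.0  */
```

Because each hex digit of the mantissa maps to exactly four bits, the literal pins down the stored value without any decimal-to-binary rounding.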
@@ -1235,7 +1235,7 @@ to specify special characters. These sequences all start with an initial
|
|||||||
* - ``\n``
|
* - ``\n``
|
||||||
- newline
|
- newline
|
||||||
* - ``\r``
|
* - ``\r``
|
||||||
- carriabe return
|
- carriage return
|
||||||
* - ``\t``
|
* - ``\t``
|
||||||
- horizontal tab
|
- horizontal tab
|
||||||
* - ``\v``
|
* - ``\v``
|
||||||
@@ -1243,7 +1243,7 @@ to specify special characters. These sequences all start with an initial
|
|||||||
* - ``\`` followed by one or more digits from 0-8
|
* - ``\`` followed by one or more digits from 0-8
|
||||||
- ASCII character in octal notation
|
- ASCII character in octal notation
|
||||||
* - ``\x``, followed by one or more digits from 0-9, a-f, A-F
|
* - ``\x``, followed by one or more digits from 0-9, a-f, A-F
|
||||||
- ASCII character in hexidecimal notation
|
- ASCII character in hexadecimal notation
|
||||||
|
|
||||||
``ispc`` doesn't support a string data type; string constants can be passed
|
``ispc`` doesn't support a string data type; string constants can be passed
|
||||||
as the first argument to the ``print()`` statement, however. ``ispc`` also
|
as the first argument to the ``print()`` statement, however. ``ispc`` also
|
||||||
@@ -1398,7 +1398,7 @@ store are:
|
|||||||
uniform float bar[10];
|
uniform float bar[10];
|
||||||
|
|
||||||
The first declaration corresponds to 10 gang-wide ``float`` values in
|
The first declaration corresponds to 10 gang-wide ``float`` values in
|
||||||
memory, while the second declaration corresonds to 10 ``float`` values.
|
memory, while the second declaration corresponds to 10 ``float`` values.
|
||||||
|
|
||||||
|
|
||||||
Defining New Names For Types
|
Defining New Names For Types
|
||||||
@@ -1562,7 +1562,7 @@ instance in the gang has its own unique pointer value)
|
|||||||
|
|
||||||
(The rationale for this limitation is that references must be represented
|
(The rationale for this limitation is that references must be represented
|
||||||
as either a uniform pointer or a varying pointer internally. While
|
as either a uniform pointer or a varying pointer internally. While
|
||||||
choosing a varying pointer would provide maximum flexibilty and eliminate
|
choosing a varying pointer would provide maximum flexibility and eliminate
|
||||||
this restriction, it would reduce performance in the common case where a
|
this restriction, it would reduce performance in the common case where a
|
||||||
uniform pointer is all that's needed. As a work-around, a varying pointer
|
uniform pointer is all that's needed. As a work-around, a varying pointer
|
||||||
can be used in cases where a varying lvalue reference would be desired.)
|
can be used in cases where a varying lvalue reference would be desired.)
|
||||||
@@ -1585,7 +1585,7 @@ and then a brace-delimited list of enumerators with optional values:
|
|||||||
|
|
||||||
Each ``enum`` declaration defines a new type; an attempt to implicitly
|
Each ``enum`` declaration defines a new type; an attempt to implicitly
|
||||||
convert between enumerations of different types gives a compile-time error,
|
convert between enumerations of different types gives a compile-time error,
|
||||||
but enuemrations of different types can be explicitly cast to one another.
|
but enumerations of different types can be explicitly cast to one another.
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
@@ -1595,7 +1595,7 @@ Enumerators are implicitly converted to integer types, however, so they can
|
|||||||
be directly passed to routines that take integer parameters and can be used
|
be directly passed to routines that take integer parameters and can be used
|
||||||
in expressions including integers, for example. However, the integer
|
in expressions including integers, for example. However, the integer
|
||||||
result of such an expression must be explicitly cast back to the enumerant
|
result of such an expression must be explicitly cast back to the enumerant
|
||||||
type if it is to be assigned to a variable with the enuemrant type.
|
type if it is to be assigned to a variable with the enumerant type.
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
@@ -1846,7 +1846,7 @@ Structures can also be initialized by providing element values in braces:
|
|||||||
....
|
....
|
||||||
Color d = { 0.5, .75, 1.0 }; // r = 0.5, ...
|
Color d = { 0.5, .75, 1.0 }; // r = 0.5, ...
|
||||||
|
|
||||||
Arrays of structures and arrays inside structures can be initialzed with
|
Arrays of structures and arrays inside structures can be initialized with
|
||||||
the expected syntax:
|
the expected syntax:
|
||||||
|
|
||||||
::
|
::
|
||||||
@@ -1880,7 +1880,7 @@ Structure member access and array indexing also work as in C.
|
|||||||
return foo.f[4] - foo.i;
|
return foo.f[4] - foo.i;
|
||||||
|
|
||||||
|
|
||||||
The address-of operator, pointer derefernce operator, and pointer member
|
The address-of operator, pointer dereference operator, and pointer member
|
||||||
operator also work as expected.
|
operator also work as expected.
|
||||||
|
|
||||||
::
|
::
|
||||||
@@ -1925,7 +1925,7 @@ Basic Iteration Statements: "for", "while", and "do"
|
|||||||
|
|
||||||
``ispc`` supports ``for``, ``while``, and ``do`` loops, with the same
|
``ispc`` supports ``for``, ``while``, and ``do`` loops, with the same
|
||||||
specification as in C. Like C++, variables can be declared in the ``for``
|
specification as in C. Like C++, variables can be declared in the ``for``
|
||||||
statment itself:
|
statement itself:
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
@@ -2009,7 +2009,7 @@ nested inside a ``foreach`` loop.) ``continue`` statements are legal in
|
|||||||
a program instance that executes a ``continue`` statement effectively
|
a program instance that executes a ``continue`` statement effectively
|
||||||
skips over the rest of the loop body for the current iteration.
|
skips over the rest of the loop body for the current iteration.
|
||||||
|
|
||||||
As a specific example, consdier the following ``foreach`` statement:
|
As a specific example, consider the following ``foreach`` statement:
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
@@ -2107,7 +2107,7 @@ some computation on an array of data.
|
|||||||
}
|
}
|
||||||
|
|
||||||
Here, we've written a loop that explicitly loops over the data in chunks of
|
Here, we've written a loop that explicitly loops over the data in chunks of
|
||||||
``programCount`` elements. In each loop iteraton, the running program
|
``programCount`` elements. In each loop iteration, the running program
|
||||||
instances effectively collude amongst themselves using ``programIndex`` to
|
instances effectively collude amongst themselves using ``programIndex`` to
|
||||||
determine which elements to work on in a way that ensures that all of the
|
determine which elements to work on in a way that ensures that all of the
|
||||||
data elements will be processed. In this particular case, a ``foreach``
|
data elements will be processed. In this particular case, a ``foreach``
|
||||||
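The listing this hunk refers to is not part of the diff. A scalar C emulation of the idea might look like the following sketch, where `PROGRAM_COUNT`, `program_index`, and `scale_chunked` are illustrative names standing in for ``programCount``, ``programIndex``, and the user's function. The inner loop runs serially here, whereas the program instances in ``ispc`` run together as a gang.

```c
#include <stddef.h>

enum { PROGRAM_COUNT = 4 };  /* assumed gang size for this sketch */

/* Scalar emulation of the chunked loop: the outer loop advances by
   PROGRAM_COUNT elements, and the inner loop plays the role of the
   program instances, each identified by program_index. */
static void scale_chunked(float *data, size_t n, float k) {
    for (size_t base = 0; base < n; base += PROGRAM_COUNT) {
        for (size_t program_index = 0; program_index < PROGRAM_COUNT;
             ++program_index) {
            size_t i = base + program_index;
            if (i < n)          /* instances past the end do no work */
                data[i] *= k;
        }
    }
}
```

Each element index is visited exactly once, including the final partial chunk when `n` is not a multiple of the gang size.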
@@ -2313,7 +2313,7 @@ distributions.
|
|||||||
If you are implementing your own task system, the remainder of this section
|
If you are implementing your own task system, the remainder of this section
|
||||||
discusses the requirements for these calls. You will also likely want to
|
discusses the requirements for these calls. You will also likely want to
|
||||||
review the example task systems in ``examples/tasksys.cpp`` for reference.
|
review the example task systems in ``examples/tasksys.cpp`` for reference.
|
||||||
If you are not implmenting your own task system, you can skip reading the
|
If you are not implementing your own task system, you can skip reading the
|
||||||
remainder of this section.
|
remainder of this section.
|
||||||
|
|
||||||
Here are the declarations of the three functions that must be provided to
|
Here are the declarations of the three functions that must be provided to
|
||||||
@@ -2333,7 +2333,7 @@ implementation can efficiently wait for completion on just the tasks
|
|||||||
launched from a single function.
|
launched from a single function.
|
||||||
|
|
||||||
The first time one of ``ISPCLaunch()`` or ``ISPCAlloc()`` is called in an
|
The first time one of ``ISPCLaunch()`` or ``ISPCAlloc()`` is called in an
|
||||||
``ispc`` functon, the ``void *`` pointed to by the ``handlePtr`` parameter
|
``ispc`` function, the ``void *`` pointed to by the ``handlePtr`` parameter
|
||||||
will be ``NULL``. The implementations of these functions should then
|
will be ``NULL``. The implementations of these functions should then
|
||||||
initialize ``*handlePtr`` to a unique handle value of some sort. (For
|
initialize ``*handlePtr`` to a unique handle value of some sort. (For
|
||||||
example, it might allocate a small structure to record which tasks were
|
example, it might allocate a small structure to record which tasks were
|
||||||
@@ -2349,14 +2349,14 @@ than a pointer to it, as in the other functions.
|
|||||||
|
|
||||||
The ``ISPCAlloc()`` function is used to allocate small blocks of memory to
|
The ``ISPCAlloc()`` function is used to allocate small blocks of memory to
|
||||||
store parameters passed to tasks. It should return a pointer to memory
|
store parameters passed to tasks. It should return a pointer to memory
|
||||||
with the given aize and alignment. Note that there is no explicit
|
with the given size and alignment. Note that there is no explicit
|
||||||
``ISPCFree()`` call; instead, all memory allocated within an ``ispc``
|
``ISPCFree()`` call; instead, all memory allocated within an ``ispc``
|
||||||
function should be freed when ``ISPCSync()`` is called.
|
function should be freed when ``ISPCSync()`` is called.
|
||||||
|
|
||||||
``ISPCLaunch()`` is called to launch one or more asynchronous
|
``ISPCLaunch()`` is called to launch one or more asynchronous
|
||||||
tasks. Each ``launch`` statement in ``ispc`` code causes a call to
|
tasks. Each ``launch`` statement in ``ispc`` code causes a call to
|
||||||
``ISPCLaunch()`` to be emitted in the generated code. The three parameters
|
``ISPCLaunch()`` to be emitted in the generated code. The three parameters
|
||||||
after the handle pointer to thie function are relatively straightforward;
|
after the handle pointer to the function are relatively straightforward;
|
||||||
the ``void *f`` parameter holds a pointer to a function to call to run the
|
the ``void *f`` parameter holds a pointer to a function to call to run the
|
||||||
work for this task, ``data`` holds a pointer to data to pass to this
|
work for this task, ``data`` holds a pointer to data to pass to this
|
||||||
function, and ``count`` is the number of instances of this function to
|
function, and ``count`` is the number of instances of this function to
|
||||||
@@ -2371,7 +2371,7 @@ The signature of the provided function pointer ``f`` is
|
|||||||
int taskIndex, int taskCount)
|
int taskIndex, int taskCount)
|
||||||
|
|
||||||
When this function pointer is called by one of the hardware threads managed
|
When this function pointer is called by one of the hardware threads managed
|
||||||
bythe task system, the ``data`` pointer passed to ``ISPCLaunch()`` should
|
by the task system, the ``data`` pointer passed to ``ISPCLaunch()`` should
|
||||||
be passed to it for its first parameter; ``threadCount`` gives the total
|
be passed to it for its first parameter; ``threadCount`` gives the total
|
||||||
number of hardware threads that have been spawned to run tasks and
|
number of hardware threads that have been spawned to run tasks and
|
||||||
``threadIndex`` should be an integer index between zero and ``threadCount``
|
``threadIndex`` should be an integer index between zero and ``threadCount``
|
||||||
@@ -2690,7 +2690,7 @@ generates the following output on a four-wide compilation target:
|
|||||||
When a varying variable is printed, the values for program instances that
|
When a varying variable is printed, the values for program instances that
|
||||||
aren't currently executing are printed inside double parentheses,
|
aren't currently executing are printed inside double parentheses,
|
||||||
indicating inactive program instances. The elements for inactive program
|
indicating inactive program instances. The elements for inactive program
|
||||||
instances may have garabge values, though in some circumstances it can be
|
instances may have garbage values, though in some circumstances it can be
|
||||||
useful to see their values.
|
useful to see their values.
|
||||||
|
|
||||||
Assertions
|
Assertions
|
||||||
@@ -2910,7 +2910,7 @@ If called when none of the program instances are running,
|
|||||||
There are also a number of functions to compute "scan"s of values across
|
There are also a number of functions to compute "scan"s of values across
|
||||||
the program instances. For example, the ``exclusive_scan_add()`` function
|
the program instances. For example, the ``exclusive_scan_add()`` function
|
||||||
computes, for each program instance, the sum of the given value over all of
|
computes, for each program instance, the sum of the given value over all of
|
||||||
the preceeding program instances. (The scans currently available in
|
the preceding program instances. (The scans currently available in
|
||||||
``ispc`` are all so-called "exclusive" scans, meaning that the value
|
``ispc`` are all so-called "exclusive" scans, meaning that the value
|
||||||
computed for a given element does not include the value provided for that
|
computed for a given element does not include the value provided for that
|
||||||
element.) In C code, an exclusive add scan over an array might be
|
element.) In C code, an exclusive add scan over an array might be
|
||||||
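The hunk is cut off before the document's own C listing. A minimal reference sketch of an exclusive add scan, written independently of that listing, could be the following; `exclusive_scan_add_ref` is a hypothetical name chosen for this illustration.

```c
#include <stddef.h>

/* Exclusive add scan: out[i] receives the sum of in[0] .. in[i-1].
   out[0] is 0 because no elements precede it. The in and out arrays
   must not overlap. */
static void exclusive_scan_add_ref(const int *in, int *out, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; ++i) {
        out[i] = sum;   /* written before in[i] is added: "exclusive" */
        sum += in[i];
    }
}
```

For the input `{1, 2, 3, 4}` this produces `{0, 1, 3, 6}`; the last input value never contributes to any output element.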
@@ -3206,7 +3206,7 @@ rather than one per program instance.
|
|||||||
uniform int32 newval)
|
uniform int32 newval)
|
||||||
|
|
||||||
Be careful that you use the atomic function that you mean to; consider the
|
Be careful that you use the atomic function that you mean to; consider the
|
||||||
folloiwng code:
|
following code:
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
@@ -3563,7 +3563,7 @@ Restructuring Existing Programs to Use ISPC
|
|||||||
|
|
||||||
``ispc`` is designed to enable you to incorporate
|
``ispc`` is designed to enable you to incorporate
|
||||||
SPMD parallelism into existing code with minimal modification; features
|
SPMD parallelism into existing code with minimal modification; features
|
||||||
like the ability to share memory and data structures betwen C/C++ and
|
like the ability to share memory and data structures between C/C++ and
|
||||||
``ispc`` code and the ability to directly call back and forth between
|
``ispc`` code and the ability to directly call back and forth between
|
||||||
``ispc`` and C/C++ are motivated by this. These features also make it
|
``ispc`` and C/C++ are motivated by this. These features also make it
|
||||||
easy to incrementally transform a program to use ``ispc``; the most
|
easy to incrementally transform a program to use ``ispc``; the most
|
||||||
|
|||||||
@@ -64,7 +64,7 @@ on each one:
|
|||||||
Depending on the specifics of the computation being performed, the code
|
Depending on the specifics of the computation being performed, the code
|
||||||
generated for this function could likely be improved by modifying the code
|
generated for this function could likely be improved by modifying the code
|
||||||
so that the loop only goes as far through the data as is possible to pack
|
so that the loop only goes as far through the data as is possible to pack
|
||||||
an entire gang of program instances with computation each time thorugh the
|
an entire gang of program instances with computation each time through the
|
||||||
loop. Doing so enables the ``ispc`` compiler to generate more efficient
|
loop. Doing so enables the ``ispc`` compiler to generate more efficient
|
||||||
code for cases where it knows that the execution mask is "all on". Then,
|
code for cases where it knows that the execution mask is "all on". Then,
|
||||||
an ``if`` statement at the end handles processing the ragged extra bits of
|
an ``if`` statement at the end handles processing the ragged extra bits of
|
||||||
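The restructuring described in this hunk, a main loop over full gangs followed by an ``if`` for the leftover elements, can be emulated in scalar C as below. `GANG_WIDTH` and `sum_split` are illustrative names, and the lane loops run serially here; the benefit described in the text comes from the ``ispc`` compiler knowing the mask is "all on" in the main loop, which this emulation cannot show.

```c
#include <stddef.h>

enum { GANG_WIDTH = 4 };  /* assumed gang size for this sketch */

/* Sums n floats in two phases: a main loop that only handles full
   gangs (no per-lane bounds check), then a single `if` that handles
   the ragged remainder. */
static float sum_split(const float *data, size_t n) {
    size_t full = n - n % GANG_WIDTH;   /* largest multiple of GANG_WIDTH <= n */
    float total = 0.0f;
    size_t i = 0;

    for (; i < full; i += GANG_WIDTH)
        for (size_t lane = 0; lane < GANG_WIDTH; ++lane)
            total += data[i + lane];    /* every lane is active here */

    if (i < n)                          /* leftover elements, fewer than GANG_WIDTH */
        for (size_t lane = 0; i + lane < n; ++lane)
            total += data[i + lane];

    return total;
}
```

With ten elements and a gang width of four, the main loop covers indices 0 through 7 and the trailing `if` covers indices 8 and 9.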
@@ -153,7 +153,7 @@ processed, and so forth.
|
|||||||
|
|
||||||
Performance benefit can come from using ``foreach_tiled`` in that it
|
Performance benefit can come from using ``foreach_tiled`` in that it
|
||||||
essentially optimizes for the benefit of iterating over *compact* regions
|
essentially optimizes for the benefit of iterating over *compact* regions
|
||||||
of the domian (while ``foreach`` iterates over the domain in a way that
|
of the domain (while ``foreach`` iterates over the domain in a way that
|
||||||
generally allows linear memory access.) There are two benefits from
|
generally allows linear memory access.) There are two benefits from
|
||||||
processing compact regions of the domain.
|
processing compact regions of the domain.
|
||||||
|
|
||||||
@@ -215,7 +215,7 @@ Use "uniform" Whenever Appropriate
|
|||||||
----------------------------------
|
----------------------------------
|
||||||
|
|
||||||
For any variable that will always have the same value across all of the
|
For any variable that will always have the same value across all of the
|
||||||
program instances in a gang, declare the variable with the ``unfiorm``
|
program instances in a gang, declare the variable with the ``uniform``
|
||||||
qualifier. Doing so enables the ``ispc`` compiler to emit better code in
|
qualifier. Doing so enables the ``ispc`` compiler to emit better code in
|
||||||
many different ways.
|
many different ways.
|
||||||
|
|
||||||
@@ -229,7 +229,7 @@ number of iterations:
|
|||||||
|
|
||||||
If this is written with ``i`` as a ``varying`` variable, as above, there's
|
If this is written with ``i`` as a ``varying`` variable, as above, there's
|
||||||
additional overhead in the code generated for the loop as the compiler
|
additional overhead in the code generated for the loop as the compiler
|
||||||
emits instructions to handle the possibilty of not all program instances
|
emits instructions to handle the possibility of not all program instances
|
||||||
following the same control flow path (as might be the case if the loop
|
following the same control flow path (as might be the case if the loop
|
||||||
limit, 10, was itself a ``varying`` value.)
|
limit, 10, was itself a ``varying`` value.)
|
||||||
|
|
||||||
@@ -568,7 +568,7 @@ mask of all lanes currently executing (assuming a four-wide gang size
|
|||||||
target machine).
|
target machine).
|
||||||
|
|
||||||
For a fuller example of the utility of this functionality, see
|
For a fuller example of the utility of this functionality, see
|
||||||
``examples/aobench_instrumented`` in the ``ispc`` distribution. Ths
|
``examples/aobench_instrumented`` in the ``ispc`` distribution. This
|
||||||
example includes an implementation of the ``ISPCInstrument()`` function
|
example includes an implementation of the ``ISPCInstrument()`` function
|
||||||
that collects aggregate data about the program's execution behavior.
|
that collects aggregate data about the program's execution behavior.
|
||||||
|
|
||||||
|
|||||||