Documentation updates for 1.3.0.
@@ -1,3 +1,63 @@
|
||||
=== v1.3.0 === (29 June 2012)
|
||||
|
||||
This is a major new release of ispc, with support for more compilation
|
||||
targets and a number of additions to the language. As usual, the quality
|
||||
of generated code has also been improved in a number of cases and a number
|
||||
of small bugs have been fixed.
|
||||
|
||||
New targets:
|
||||
|
||||
* This release provides "beta" support for compiling to Intel Xeon Phi (the
"Many Integrated Core" architecture). See
|
||||
http://ispc.github.com/ispc.html#compiling-for-the-intel-xeon-phi-architecture
|
||||
for more details on this support.
|
||||
|
||||
* This release also has an "avx1.1" target, which provides support for the
|
||||
new instructions in the Intel Ivy Bridge microarchitecture.
|
||||
|
||||
New language features:
|
||||
|
||||
* The foreach_active statement allows iteration over the active program
|
||||
instances in a gang. (See
|
||||
http://ispc.github.com/ispc.html#iteration-over-active-program-instances-foreach-active)
|
||||
|
||||
* foreach_unique allows iterating over subsets of program instances in a
|
||||
gang that share the same value of a variable. (See
|
||||
http://ispc.github.com/ispc.html#iteration-over-unique-elements-foreach-unique)
|
||||
|
||||
* An "unmasked" function qualifier and statement in the language allow
|
||||
re-activating execution of all program instances in a gang. (See
|
||||
http://ispc.github.com/ispc.html#re-establishing-the-execution-mask)
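
The behavior of foreach_active and foreach_unique can be modeled in serial C.
The following is only a sketch of the semantics described above, not how ispc
implements them: GANG_SIZE, visit_active() and collect_unique() are
hypothetical names introduced for illustration, and the order in which
instances and values are visited is an assumption.

```c
#include <stdbool.h>

#define GANG_SIZE 8 /* hypothetical gang width, chosen for illustration */

/* Models foreach_active: each active program instance is visited once, one
   at a time. Stores the visited instance indices in order[] and returns how
   many instances were visited. */
int visit_active(const bool mask[GANG_SIZE], int order[GANG_SIZE]) {
    int count = 0;
    for (int i = 0; i < GANG_SIZE; ++i) {
        if (mask[i])
            order[count++] = i;
    }
    return count;
}

/* Models foreach_unique: collects each distinct value held by the active
   program instances. In ispc the loop body would run once per distinct
   value, with only the instances holding that value active. Stores the
   distinct values in unique[] and returns how many there are. */
int collect_unique(const bool mask[GANG_SIZE], const int value[GANG_SIZE],
                   int unique[GANG_SIZE]) {
    int count = 0;
    for (int i = 0; i < GANG_SIZE; ++i) {
        if (!mask[i])
            continue; /* inactive instances contribute no values */
        bool seen = false;
        for (int j = 0; j < count; ++j) {
            if (unique[j] == value[i]) {
                seen = true;
                break;
            }
        }
        if (!seen)
            unique[count++] = value[i];
    }
    return count;
}
```

In this model, a gang in which six of eight instances are active and hold the
values 3, 3, 7, 7, 3 and 5 would run a foreach_active body six times and a
foreach_unique body three times.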
|
||||
|
||||
Standard library updates:
|
||||
|
||||
* The seed_rng() function has been modified to take a "varying" seed value
|
||||
when a varying RNGState is being initialized.
|
||||
|
||||
* An isnan() function has been added, to check for floating-point "not a
|
||||
number" values.
|
||||
|
||||
* The float_to_srgb8() routine performs high-performance conversion of
|
||||
floating-point color values to SRGB8 format.
|
||||
|
||||
Other changes:
|
||||
|
||||
* A number of bugfixes have been made for compiler crashes with malformed
|
||||
programs.
|
||||
|
||||
* Floating-point comparisons are now "unordered", so that any comparison
|
||||
where one of the operands is a "not a number" value returns false. (This
|
||||
matches standard IEEE floating-point behavior.)
|
||||
|
||||
* The code generated for 'break' statements in "varying" loops has been
|
||||
improved for some common cases.
|
||||
|
||||
* Compile time and compiler memory use have both been improved,
|
||||
particularly for large input programs.
|
||||
|
||||
* A number of bugs have been fixed in the debugging information generated
|
||||
by the compiler when the "-g" command-line flag is used.
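
The floating-point comparison behavior noted above can be observed from plain
C, which follows the same IEEE rules for these operators. This is a small
sketch rather than ispc code; count_true_comparisons() and is_not_a_number()
are hypothetical helper names, and isnan() here is the C99 macro from
<math.h>, not the ispc standard library function.

```c
#include <math.h>
#include <stdbool.h>

/* Counts how many of the comparisons <, <=, >, >= and == hold between a
   and b. When either operand is "not a number", none of them hold. */
int count_true_comparisons(float a, float b) {
    return (a < b) + (a <= b) + (a > b) + (a >= b) + (a == b);
}

/* Explicit check for a "not a number" value, using the C99 isnan() macro. */
bool is_not_a_number(float x) {
    return isnan(x) != 0;
}
```

Because all of these comparisons are false for a "not a number" operand, code
that needs to detect such values should test for them explicitly rather than
rely on a comparison.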
|
||||
|
||||
=== v1.2.2 === (20 April 2012)
|
||||
|
||||
This release includes a number of small additions to functionality and a
|
||||
|
||||
297
docs/ispc.rst
@@ -46,6 +46,8 @@ Contents:
|
||||
* `Recent Changes to ISPC`_
|
||||
|
||||
+ `Updating ISPC Programs For Changes In ISPC 1.1`_
|
||||
+ `Updating ISPC Programs For Changes In ISPC 1.2`_
|
||||
+ `Updating ISPC Programs For Changes In ISPC 1.3`_
|
||||
|
||||
* `Getting Started with ISPC`_
|
||||
|
||||
@@ -57,6 +59,7 @@ Contents:
|
||||
+ `Basic Command-line Options`_
|
||||
+ `Selecting The Compilation Target`_
|
||||
+ `Generating Generic C++ Output`_
|
||||
+ `Compiling For The Intel Xeon Phi Architecture`_
|
||||
+ `Selecting 32 or 64 Bit Addressing`_
|
||||
+ `The Preprocessor`_
|
||||
+ `Debugging`_
|
||||
@@ -225,6 +228,48 @@ These are the relevant changes to the language:
|
||||
instances. See the Section `Parallel Iteration Statements: "foreach" and
|
||||
"foreach_tiled"`_ for more information about these.
|
||||
|
||||
Updating ISPC Programs For Changes In ISPC 1.2
|
||||
----------------------------------------------
|
||||
|
||||
The following changes were made to the language syntax and semantics for
|
||||
the ``ispc`` 1.2 release:
|
||||
|
||||
* Syntax for the "launch" keyword has been cleaned up; it's now no longer
|
||||
necessary to bracket the launched function call with angle brackets. (In
|
||||
other words, now use ``launch foo();``, rather than ``launch < foo() >;``.)
|
||||
|
||||
* When using pointers, the pointed-to data type is now "uniform" by
|
||||
default. Use the varying keyword to specify varying pointed-to types
|
||||
when needed. (i.e. ``float *ptr`` is a varying pointer to uniform float
|
||||
data, whereas previously it was a varying pointer to varying float
|
||||
values.) Use ``varying float *`` to specify a varying pointer to varying
|
||||
float data, and so forth.
|
||||
|
||||
* The details of "uniform" and "varying" and how they interact with struct
|
||||
types have been cleaned up. Now, when a struct type is declared, if the
|
||||
struct elements don't have explicit "uniform" or "varying" qualifiers,
|
||||
they are said to have "unbound" variability. When a struct type is
|
||||
instantiated, any unbound variability elements inherit the variability of
|
||||
the parent struct type. See `Struct Types`_ for more details.
|
||||
|
||||
* ``ispc`` has a new language feature that makes it much easier to use the
|
||||
efficient "(array of) structure of arrays" (AoSoA, or SoA) memory layout
|
||||
of data. A new ``soa<n>`` qualifier can be applied to structure types to
|
||||
specify an n-wide SoA version of the corresponding type. Array indexing
|
||||
and pointer operations with arrays of SoA types automatically handle the
|
||||
two-stage indexing calculation to access the data. See `Structure of
|
||||
Array Types`_ for more details.
|
||||
|
||||
|
||||
Updating ISPC Programs For Changes In ISPC 1.3
|
||||
----------------------------------------------
|
||||
|
||||
This release adds a number of new iteration constructs, which in turn use
|
||||
new reserved words: ``unmasked``, ``foreach_unique``, ``foreach_active``,
|
||||
and ``in``. Any program that happens to have a variable or function with
|
||||
one of these names must be modified to rename that symbol.
|
||||
|
||||
|
||||
Getting Started with ISPC
|
||||
=========================
|
||||
|
||||
@@ -441,11 +486,12 @@ CPU.
|
||||
|
||||
ispc foo.ispc -o foo.obj --cpu=corei7-avx
|
||||
|
||||
Finally, ``--target`` selects between the SSE2, SSE4, and AVX instruction
|
||||
sets. (As general context, SSE2 was first introduced in processors that
|
||||
shipped in 2001, SSE4 was introduced in 2007, and processors with AVX
|
||||
were introduced in 2010. Consult your CPU's manual for specifics on which
|
||||
vector instruction set it supports.)
|
||||
Finally, ``--target`` selects between the SSE2, SSE4, AVX, and AVX2
|
||||
instruction sets. (As general context, SSE2 was first introduced in
|
||||
processors that shipped in 2001, SSE4 was introduced in 2007, and
|
||||
processors with AVX were introduced in 2010. AVX2 will be supported on
|
||||
future CPUs based on Intel's "Haswell" architecture. Consult your CPU's
|
||||
manual for specifics on which vector instruction set it supports.)
|
||||
|
||||
By default, the target instruction set is chosen based on the most capable
|
||||
one supported by the system on which you're running ``ispc``. You can
|
||||
@@ -513,6 +559,59 @@ C++ file; this can be used to easily include specific implementations of
|
||||
the vector types and functions.
|
||||
|
||||
|
||||
Compiling For The Intel Xeon Phi Architecture
|
||||
---------------------------------------------
|
||||
|
||||
``ispc`` has beta-level support for compiling for the many-core Intel® Xeon
|
||||
Phi architecture (formerly "Many Integrated Cores", or MIC). This support
|
||||
is based on the "generic" C++ output, described in the previous section.
|
||||
|
||||
To compile for Xeon Phi, first generate intermediate C++ code:
|
||||
|
||||
::
|
||||
|
||||
ispc foo.ispc --emit-c++ --target=generic-16 -o foo.cpp \
|
||||
--c++-include-file=knc.h
|
||||
|
||||
The ``ispc`` distribution now includes a header file,
|
||||
``examples/intrinsics/knc.h``, which maps from the generic C++ output to
|
||||
the corresponding intrinsic operations for Intel Xeon Phi. Thus, to
|
||||
generate an object file, use the Intel C Compiler (``icc``) to compile the C++
|
||||
code generated by ``ispc``, setting the ``#include`` search path so that it
|
||||
can find the ``examples/intrinsics/knc.h`` header file in the ``ispc``
|
||||
distribution.
|
||||
|
||||
With the current beta implementation, complex ``ispc`` programs are able to
|
||||
run on Xeon Phi, though there are a number of known limitations:
|
||||
|
||||
* The ``examples/intrinsics/knc.h`` header file isn’t complete yet; for
|
||||
example, vector operations with ``int8`` and ``int16`` types aren’t yet
|
||||
implemented. Programs that operate on ``varying`` ``int32``, ``float``,
|
||||
and ``double`` data-types (and ``uniform`` variables of any data type,
|
||||
and arrays and structures of these types), should operate correctly.
|
||||
|
||||
* If you use the ``launch`` functionality to launch tasks across cores,
|
||||
note that the pthreads task system implemented in
|
||||
``examples/tasksys.cpp`` hasn’t been tuned for Xeon Phi yet, and has
|
||||
known issues with setting thread affinities optimally.
|
||||
|
||||
* The compiler currently emits unaligned memory accesses in many cases
|
||||
where the memory address is actually aligned. This may unnecessarily
|
||||
impact performance.
|
||||
|
||||
All of these issues are being actively addressed and will be
|
||||
fixed in future releases.
|
||||
|
||||
If you do use the current version of ``ispc`` on Xeon Phi, please let us
|
||||
know of any bugs or unexpected results. (Also, any interesting results!)
|
||||
*Note that access to Xeon Phi and public discussion of Xeon Phi performance
|
||||
is still governed by NDA*, so please send email to "matt dot pharr at intel
|
||||
dot com" for any issues that shouldn't be filed in the `public ispc bug
|
||||
tracker`_.
|
||||
|
||||
.. _public ispc bug tracker: https://github.com/ispc/ispc/issues
|
||||
|
||||
|
||||
Selecting 32 or 64 Bit Addressing
|
||||
---------------------------------
|
||||
|
||||
@@ -559,7 +658,7 @@ preprocessor runs:
|
||||
- 1
|
||||
- Major version of the ``ispc`` compiler/language
|
||||
* - ISPC_MINOR_VERSION
|
||||
- 1
|
||||
- 3
|
||||
- Minor version of the ``ispc`` compiler/language
|
||||
* - PI
|
||||
- 3.1415926535
|
||||
@@ -568,17 +667,31 @@ preprocessor runs:
|
||||
Debugging
|
||||
---------
|
||||
|
||||
Support for debugging in ``ispc`` is in progress. On Linux\* and Mac
|
||||
OS\*, the ``-g`` command-line flag can be supplied to the compiler,
|
||||
which causes it to generate debugging symbols. Running ``ispc`` programs
|
||||
in the debugger, setting breakpoints, printing out variables and the like
|
||||
all generally works, though there is occasional unexpected behavior.
|
||||
On Linux\* and Mac OS\*, the ``-g`` command-line flag can be supplied to
|
||||
the compiler, which causes it to generate debugging symbols. Running
|
||||
``ispc`` programs in the debugger, setting breakpoints, printing out
|
||||
variables all work just as they do when debugging C/C++ programs. Similarly, you can
|
||||
directly step up and down the call stack between ``ispc`` code and C/C++
|
||||
code.
|
||||
|
||||
Another option for debugging (the only current option on Windows\*) is to
|
||||
use the ``print`` statement for ``printf()`` style debugging. (See `Output
|
||||
Functions`_ for more information.) You can also use the ability to call
|
||||
back to application code at particular points in the program, passing a set
|
||||
of variable values to be logged or otherwise analyzed from there.
|
||||
One limitation of the current debugging support is that the debugger
|
||||
provides a window into an entire gang's worth of program instances, rather
|
||||
than just a single program instance. (These concepts will be introduced
|
||||
shortly, in `Basic Concepts: Program Instances and Gangs of Program
|
||||
Instances`_). Thus, when a ``varying`` variable is printed, the values for
|
||||
each of the program instances are displayed. Along similar lines, the path
|
||||
the debugger follows through program source code passes each statement that
|
||||
any program instance wants to execute (see `Control Flow Within A Gang`_
|
||||
for more details on control flow in ``ispc``.)
|
||||
|
||||
While debugging, a variable, ``__mask``, is available to provide the
|
||||
current program execution mask at the current point in the program.
|
||||
|
||||
Another option for debugging (and the only current option on Windows\*) is
|
||||
to use the ``print`` statement for ``printf()`` style debugging. (See
|
||||
`Output Functions`_ for more information.) You can also use the ability to
|
||||
call back to application code at particular points in the program, passing
|
||||
a set of variable values to be logged or otherwise analyzed from there.
|
||||
|
||||
|
||||
The ISPC Parallel Execution Model
|
||||
@@ -643,7 +756,7 @@ current processor, leading to excellent utilization of hardware SIMD units
|
||||
and high performance.
|
||||
|
||||
The number of program instances in a gang is relatively small; in practice,
|
||||
it's no more than twice the native SIMD width of the hardware it is
|
||||
it's no more than 2-4x the native SIMD width of the hardware it is
|
||||
executing on. (Thus, four or eight program instances in a gang on a CPU
|
||||
using the 4-wide SSE instruction set, and eight or sixteen on a CPU
|
||||
using 8-wide AVX.)
|
||||
@@ -671,19 +784,19 @@ program instances in the gang: some of the currently running program
|
||||
instances want to execute the statements for the "true" case and some want
|
||||
to execute the statements for the "false" case.
|
||||
|
||||
Complex control flow in ``ispc`` programs generally "just works", computing
|
||||
the same results for each program instance in a gang as would have been
|
||||
computed if the equivalent code ran serially in C to compute each program
|
||||
instance's result individually. However, here we will more precisely
|
||||
define the execution model for control flow in order to be able to
|
||||
precisely define the language's behavior in specific situations.
|
||||
Complex control flow in ``ispc`` programs generally works as expected,
|
||||
computing the same results for each program instance in a gang as would
|
||||
have been computed if the equivalent code ran serially in C to compute each
|
||||
program instance's result individually. However, here we will more
|
||||
precisely define the execution model for control flow in order to be able
|
||||
to precisely define the language's behavior in specific situations.
|
||||
|
||||
We will specify the notion of a *program counter* and how it is updated to
|
||||
step through the program, and an *execution mask* that indicates which
|
||||
program instances want to execute the instruction at the current program
|
||||
counter. There is a single program counter shared by all of the
|
||||
program instances in the gang; it points to a single instruction to be
|
||||
executed next. The execution mask is a per-program instance boolean value
|
||||
executed next. The execution mask is a per-program-instance boolean value
|
||||
that indicates whether or not side effects from the current instruction
|
||||
should affect each program instance. Thus, for example, if a statement
|
||||
were to be executed with an "all off" mask, there should be no observable
|
||||
@@ -731,45 +844,22 @@ compiler output:
|
||||
bool test = (x < y);
|
||||
mask originalMask = get_current_mask();
|
||||
set_mask(originalMask & test);
|
||||
// true statements
|
||||
if (any_mask_entries_are_enabled()) {
|
||||
// true statements
|
||||
}
|
||||
set_mask(originalMask & ~test);
|
||||
// false statements
|
||||
if (any_mask_entries_are_enabled()) {
|
||||
// false statements
|
||||
}
|
||||
set_mask(originalMask);
|
||||
|
||||
In other words, the program counter steps through the statements for both
|
||||
the "true" case and the "false" case, with the execution mask set so that
|
||||
no side-effects from the true statements affect the program instances that
|
||||
want to run the false statements, and vice versa. the execution mask is
|
||||
then restored to the value it had before the ``if`` statement.
|
||||
|
||||
However, the compiler is free to generate different code for an ``if``
|
||||
test, such as:
|
||||
|
||||
::
|
||||
|
||||
float x = ..., y = ...;
|
||||
bool test = (x < y);
|
||||
mask originalMask = get_current_mask();
|
||||
if (all_off(originalMask & test))
|
||||
goto else_stmts;
|
||||
set_mask(originalMask & test);
|
||||
// true statements
|
||||
else_stmts:
|
||||
if (all_off(originalMask & ~test))
|
||||
goto done;
|
||||
set_mask(originalMask & ~test);
|
||||
// false statements
|
||||
done:
|
||||
set_mask(originalMask);
|
||||
|
||||
Furthermore, the order in which the program counter steps through the
|
||||
code for the "true" and "false" statements is undefined.
|
||||
|
||||
In most cases, there is no programmer-visible difference between these two
|
||||
ways of compiling ``if``, though see the `Uniform Variables and Varying
|
||||
Control Flow`_ section for a case where it causes undefined behavior in one
|
||||
particular situation.
|
||||
|
||||
want to run the false statements, and vice versa. However, a block of
|
||||
statements does not execute if the mask is "all off" upon entry to that
|
||||
block. The execution mask is then restored to the value it had before the
|
||||
``if`` statement.
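
The masked execution of a varying ``if`` described above can be modeled in
serial C. This is only a sketch of the semantics, not the code that ``ispc``
generates: ``GANG_SIZE``, ``any_on()`` and ``masked_if()`` are hypothetical
names, and the body of each clause (storing 1 or -1) is chosen purely for
illustration.

```c
#include <stdbool.h>

#define GANG_SIZE 4 /* hypothetical gang width, chosen for illustration */

/* Returns true if at least one entry of the mask is on. */
static bool any_on(const bool mask[GANG_SIZE]) {
    for (int i = 0; i < GANG_SIZE; ++i) {
        if (mask[i])
            return true;
    }
    return false;
}

/* Models "if (x < y) result = 1; else result = -1;" for one gang.
   Returns the number of clauses that were actually entered. */
int masked_if(const bool mask[GANG_SIZE], const float x[GANG_SIZE],
              const float y[GANG_SIZE], int result[GANG_SIZE]) {
    bool true_mask[GANG_SIZE], false_mask[GANG_SIZE];
    int clauses_entered = 0;

    /* Split the incoming mask according to the per-instance test. */
    for (int i = 0; i < GANG_SIZE; ++i) {
        bool test = x[i] < y[i];
        true_mask[i] = mask[i] && test;
        false_mask[i] = mask[i] && !test;
    }

    /* A clause is skipped entirely when its mask is "all off". */
    if (any_on(true_mask)) {
        ++clauses_entered;
        for (int i = 0; i < GANG_SIZE; ++i) {
            if (true_mask[i])
                result[i] = 1; /* side effect only where the mask is on */
        }
    }
    if (any_on(false_mask)) {
        ++clauses_entered;
        for (int i = 0; i < GANG_SIZE; ++i) {
            if (false_mask[i])
                result[i] = -1;
        }
    }
    return clauses_entered;
}
```

Entries of ``result`` belonging to program instances that were inactive on
entry are left untouched, which corresponds to restoring the original mask
after the ``if`` statement.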
|
||||
|
||||
Control Flow Example: Loops
|
||||
---------------------------
|
||||
@@ -883,8 +973,8 @@ It is an error to try to assign a ``varying`` value to a ``uniform``
|
||||
variable, though ``uniform`` values can be assigned to ``uniform``
|
||||
variables. Assignments to ``uniform`` variables are not affected by the
|
||||
execution mask (there's no unambiguous way that they could be); rather,
|
||||
they always apply if the program pointer executes a statement that is a
|
||||
uniform assignment.
|
||||
they always apply if the program counter passes through a statement
|
||||
that is a ``uniform`` assignment.
|
||||
|
||||
|
||||
Uniform Control Flow
|
||||
@@ -956,11 +1046,10 @@ instances that are supposed to be executing the corresponding clause.
|
||||
Under this model, we must define the effect of modifying ``uniform``
|
||||
variables in the context of varying control flow.
|
||||
|
||||
In most cases, modifying ``uniform`` variables under varying control flow
|
||||
leads to the ``uniform`` variable having an undefined value, except within
|
||||
a block where the ``uniform`` value had a value assigned to it.
|
||||
|
||||
Consider the following example, which illustrates three cases.
|
||||
In general, modifying ``uniform`` variables under varying control flow
|
||||
leads to the ``uniform`` variable having a value that depends on whether
|
||||
any of the program instances in the gang followed a particular execution
|
||||
path. Consider the following example:
|
||||
|
||||
::
|
||||
|
||||
@@ -968,43 +1057,20 @@ Consider the following example, which illustrates three cases.
|
||||
uniform int b = 0;
|
||||
if (a == 0) {
|
||||
++b;
|
||||
// use b: undefined! May be 1 or 11.
|
||||
// b is 1
|
||||
}
|
||||
else {
|
||||
b = 10;
|
||||
// can use b, has value 10
|
||||
// b is 10
|
||||
}
|
||||
// b is undefined: may be 10 or 11
|
||||
// whether b is 1 or 10 depends on whether any of the values
|
||||
// of "a" in the executing gang were non-zero.
|
||||
|
||||
|
||||
There are three principles of ``ispc``'s execution model that have been
|
||||
previously introduced that together explain the results above. They are:
|
||||
|
||||
1. Modifications to ``uniform`` variables aren't affected by the
|
||||
execution mask.
|
||||
2. The "true" and "false" clauses of a varying ``if`` statement may be
|
||||
executed in either order.
|
||||
3. Varying ``if`` statements may in some cases execute the instructions
|
||||
for one of their clauses with the execution mask "all off".
|
||||
|
||||
Thus, within the "true" clause, the value of ``b`` is undefined since the
|
||||
"else" clause may or may not have executed before the clause for the true
|
||||
case.
|
||||
|
||||
Within the "else" clause, the assignment ``b = 10`` applies, giving ``b`` a
|
||||
well-defined value within the "else" clause and ``b`` can validly be used
|
||||
in the remainder of the code in that block.
|
||||
|
||||
Finally, ``b`` is undefined after the end of the "else" clause, since it is
|
||||
possible (but not necessarily the case) that one the clauses may have
|
||||
executed with an "all off" mask. Thus, even if ``a`` had a non-zero value
|
||||
for all program instances in the gang, it's possible that the "true" clause
|
||||
executed with an "all off" mask and ``b`` was modified there.
|
||||
|
||||
If it is important that code never be executed with an "all off" execution
|
||||
mask, then the ``cif`` statement (documented in the `"Coherent" Control Flow
|
||||
Statements: "cif" and Friends`_ section) can be used in place of a regular
|
||||
``if``, as it guarantees this property.
|
||||
Here, if any of the values of ``a`` across the gang was non-zero, then
|
||||
``b`` will have a value of 10 after the ``if`` statement has executed.
|
||||
However, if all of the values of ``a`` in the currently-executing program
|
||||
instances at the start of the ``if`` statement had a value of zero, then
|
||||
``b`` would have a value of 1.
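
The outcome described above can be reproduced with a serial C model of the
gang. This is a sketch under stated assumptions rather than a description of
the compiler's output: ``GANG_SIZE`` and ``uniform_after_if()`` are
hypothetical names, and the model assumes that the "true" clause runs before
the "else" clause and that a clause is skipped when no active program instance
selects it.

```c
#include <stdbool.h>

#define GANG_SIZE 4 /* hypothetical gang width, chosen for illustration */

/* Models
       uniform int b = 0;
       if (a == 0) { ++b; } else { b = 10; }
   for one gang and returns the value of b after the if statement. */
int uniform_after_if(const bool mask[GANG_SIZE], const int a[GANG_SIZE]) {
    int b = 0; /* uniform: a single copy shared by the whole gang */
    bool any_true = false, any_false = false;

    /* Work out which clauses at least one active instance wants to run. */
    for (int i = 0; i < GANG_SIZE; ++i) {
        if (!mask[i])
            continue;
        if (a[i] == 0)
            any_true = true;
        else
            any_false = true;
    }

    /* Uniform assignments ignore the per-instance mask: each one applies
       once if the program counter passes through its clause at all. */
    if (any_true)
        ++b;
    if (any_false)
        b = 10;
    return b;
}
```

With every active value of ``a`` equal to zero the model returns 1; as soon as
one active instance has a non-zero ``a``, the "else" clause runs and the
result is 10.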
|
||||
|
||||
|
||||
Data Races Within a Gang
|
||||
@@ -1191,6 +1257,10 @@ C++:
|
||||
|
||||
* Parallel ``foreach`` and ``foreach_tiled`` iteration constructs (see
|
||||
`Parallel Iteration Statements: "foreach" and "foreach_tiled"`_)
|
||||
* The ``foreach_active`` and ``foreach_unique`` iteration constructs, which
|
||||
provide ways of iterating over subsets of the program instances in the
|
||||
gang. See `Iteration over active program instances: "foreach_active"`_
|
||||
and `Iteration over unique elements: "foreach_unique"`_.
|
||||
* Language support for task parallelism (see `Task Parallel Execution`_)
|
||||
* "Coherent" control flow statements that indicate that control flow is
|
||||
expected to be coherent across the running program instances (see
|
||||
@@ -1233,10 +1303,11 @@ The following reserved words from C89 are also reserved in ``ispc``:
|
||||
|
||||
``ispc`` additionally reserves the following words:
|
||||
|
||||
``bool``, ``export``, ``cdo``, ``cfor``, ``cif``, ``cwhile``, ``false``,
|
||||
``foreach``, ``foreach_tiled``, ``inline``, ``int8``, ``int16``, ``int32``,
|
||||
``int64``, ``launch``, ``print``, ``reference``, ``soa``, ``sync``,
|
||||
``task``, ``true``, ``uniform``, and ``varying``.
|
||||
``bool``, ``delete``, ``export``, ``cdo``, ``cfor``, ``cif``, ``cwhile``,
|
||||
``false``, ``foreach``, ``foreach_active``, ``foreach_tiled``,
|
||||
``foreach_unique``, ``in``, ``inline``, ``int8``, ``int16``, ``int32``,
|
||||
``int64``, ``launch``, ``new``, ``print``, ``soa``, ``sync``, ``task``,
|
||||
``true``, ``uniform``, and ``varying``.
|
||||
|
||||
|
||||
Lexical Structure
|
||||
@@ -1246,8 +1317,8 @@ Tokens in ``ispc`` are delimited by white-space and comments. The
|
||||
white-space characters are the usual set of spaces, tabs, and carriage
|
||||
returns/line feeds. Comments can be delineated with ``//``, which starts a
|
||||
comment that continues to the end of the line, or the start of a comment
|
||||
can be delineated with ``/*`` and the end with ``*/``. Like C/C++,
|
||||
comments can't be nested.
|
||||
can be delineated with ``/*`` at the start and with ``*/`` at the end.
|
||||
Like C/C++, comments can't be nested.
|
||||
|
||||
Identifiers in ``ispc`` are sequences of characters that start with an
|
||||
underscore or an upper-case or lower-case letter, and then followed by
|
||||
@@ -1306,7 +1377,7 @@ optional plus or minus sign and then digits from 0 to 9. For example:
|
||||
|
||||
Floating-point constants can optionally have a "f" or "F" suffix (``ispc``
|
||||
currently treats all floating-point constants as having 32-bit precision,
|
||||
making this suffix unnecessary.)
|
||||
so this suffix has no effect at present.)
|
||||
|
||||
String constants in ``ispc`` are denoted by an opening double quote ``"``
|
||||
followed by any character other than a newline, up to a closing double
|
||||
@@ -1349,11 +1420,12 @@ The following identifiers are reserved as language keywords: ``bool``,
|
||||
``break``, ``case``, ``cdo``, ``cfor``, ``char``, ``cif``, ``cwhile``,
|
||||
``const``, ``continue``, ``default``, ``do``, ``double``, ``else``,
|
||||
``enum``, ``export``, ``extern``, ``false``, ``float``, ``for``,
|
||||
``foreach``, ``foreach_tiled``, ``goto``, ``if``, ``inline``, ``int``,
|
||||
``int8``, ``int16``, ``int32``, ``int64``, ``launch``, ``NULL``, ``print``,
|
||||
``return``, ``signed``, ``sizeof``, ``soa``, ``static``, ``struct``,
|
||||
``switch``, ``sync``, ``task``, ``true``, ``typedef``, ``uniform``,
|
||||
``union``, ``unsigned``, ``varying``, ``void``, ``volatile``, ``while``.
|
||||
``foreach``, ``foreach_active``, ``foreach_tiled``, ``foreach_unique``,
|
||||
``goto``, ``if``, ``in``, ``inline``, ``int``, ``int8``, ``int16``,
|
||||
``int32``, ``int64``, ``launch``, ``NULL``, ``print``, ``return``,
|
||||
``signed``, ``sizeof``, ``soa``, ``static``, ``struct``, ``switch``,
|
||||
``sync``, ``task``, ``true``, ``typedef``, ``uniform``, ``union``,
|
||||
``unsigned``, ``varying``, ``void``, ``volatile``, ``while``.
|
||||
|
||||
``ispc`` defines the following operators and punctuation:
|
||||
|
||||
@@ -2668,17 +2740,12 @@ same as ``if``:
|
||||
|
||||
``cif`` provides a hint to the compiler that you expect that most of the
|
||||
executing SPMD programs will all have the same result for the ``if``
|
||||
condition. Furthermore, it guarantees that the code in the "true" and
|
||||
"false" clauses of the ``if`` statement will never be executed with an "all
|
||||
off" execution mask. (See the `Control Flow Within A Gang`_ section for
|
||||
more details on why regular ``if`` statements may sometimes do this.)
|
||||
condition.
|
||||
|
||||
Along similar lines, ``cfor``, ``cdo``, and ``cwhile`` check to see if all
|
||||
program instances are running at the start of each loop iteration; if so,
|
||||
they can run a specialized code path that has been optimized for the "all
|
||||
on" execution mask case. It is already the case for the regular looping
|
||||
constructs in ``ispc`` that a loop will never be executed with an "all off"
|
||||
execution mask.
|
||||
on" execution mask case.
|
||||
|
||||
|
||||
Functions and Function Calls
|
||||
|
||||