Fix numerous typos in documentation (goodness)

Matt Pharr
2011-12-14 10:26:35 -08:00
parent 533b539780
commit 4adf527a4d
3 changed files with 48 additions and 48 deletions

@@ -273,10 +273,10 @@ Then four object files will be generated: ``foo_sse2.o``, ``foo_sse4.o``,
``foo_avx.o``, and ``foo.o``.[#]_ Link all of these into your executable, and
when you call a function in ``foo.ispc`` from your application code,
``ispc`` will determine which instruction sets are supported by the CPU the
code is running on and will call the most appropraite version of the
code is running on and will call the most appropriate version of the
function available.
.. [#] Similarly, if you choose to generate assembly langauage output or
.. [#] Similarly, if you choose to generate assembly language output or
LLVM bitcode output, multiple versions of those files will be created.
In general, the version of the function that runs will be the one in the

@@ -26,9 +26,9 @@ The main goals behind ``ispc`` are to:
units without the extremely low-programmer-productivity activity of directly
writing intrinsics.
* Explore opportunities from close-coupling between C/C++ application code
and SPMD ``ispc`` code running on the same processor--lightweight funcion
calls betwen the two languages, sharing data directly via pointers without
copying or reformating, etc.
and SPMD ``ispc`` code running on the same processor--lightweight function
calls between the two languages, sharing data directly via pointers without
copying or reformatting, etc.
**We are very interested in your feedback and comments about ispc and
in hearing your experiences using the system. We are especially interested
@@ -249,7 +249,7 @@ of the value.
The first thing to notice in this program is the presence of the ``export``
keyword in the function definition; this indicates that the function should
be made available to be called from application code. The ``uniform``
qualifiers on the parameters to ``simple`` indicate that the correpsonding
qualifiers on the parameters to ``simple`` indicate that the corresponding
variables are non-vector quantities--this concept is discussed in detail in the
`"uniform" and "varying" Qualifiers`_ section.
@@ -321,7 +321,7 @@ When the executable ``simple`` runs, it generates the expected output:
...
For a slightly more complex example of using ``ispc``, see the `Mandelbrot
set example`_ page on the ``ispc`` website for a walkthrough of an ``ispc``
set example`_ page on the ``ispc`` website for a walk-through of an ``ispc``
implementation of that algorithm. After reading through that example, you
may want to examine the source code of the various examples in the
``examples/`` directory of the ``ispc`` distribution.
@@ -372,7 +372,7 @@ Optimizations are on by default; they can be turned off with ``-O0``:
On Mac\* and Linux\*, there is basic support for generating debugging
symbols; this is enabled with the ``-g`` command-line flag. Using ``-g``
causes optimizations to be disabled; to compile with debugging symbols and
optimizaion, ``-O1`` should be provided as well as the ``-g`` flag.
optimization, ``-O1`` should be provided as well as the ``-g`` flag.
The ``-h`` flag can also be used to direct ``ispc`` to generate a C/C++
header file that includes C/C++ declarations of the C-callable ``ispc``
@@ -610,7 +610,7 @@ side-effects.
Upon entry to an ``ispc`` function called by the application, the execution
mask is "all on" and the program counter points at the first statement in
the function. The following two statments describe the required behavior
the function. The following two statements describe the required behavior
of the program counter and the execution mask over the course of execution
of an ``ispc`` function.
@@ -731,7 +731,7 @@ program instances is *maximally converged*. Maximal convergence means that
if two program instances follow the same control path, they are guaranteed
to execute each program statement concurrently. If two program instances
follow diverging control paths, it is guaranteed that they will reconverge
as soon as possible (if they do later reconverge). [#]_
as soon as possible in the function (if they do later reconverge). [#]_
.. [#] This is another significant difference between the ``ispc``
execution model and the one implemented by OpenCL* and CUDA*, which
@@ -819,7 +819,7 @@ of control flow, will say that control flow based on ``varying``
expressions is "varying" control flow.)
Consider for example an image filtering operation where the program loops
over pixels adjacent to the given (x,y) coordiantes:
over pixels adjacent to the given (x,y) coordinates:
::
@@ -919,7 +919,7 @@ for all program instances in the gang, it's possible that the "true" clause
executed with an "all off" mask and ``b`` was modified there.
If it is important that code never be executed with an "all off" execution
mask, then the ``cif`` statment (documented in the `"Coherent" Control Flow
mask, then the ``cif`` statement (documented in the `"Coherent" Control Flow
Statements: "cif" and Friends`_ section) can be used in place of a regular
``if``, as it guarantees this property.
@@ -1045,7 +1045,7 @@ completed.
The ISPC Language
=================
``ispc`` is an extended verion of the C programming language, providing a
``ispc`` is an extended version of the C programming language, providing a
number of new features that make it easy to write high-performance SPMD
programs for the CPU. Note that between not only the few small syntactic
differences between ``ispc`` and C code but more importantly ``ispc``'s
@@ -1066,12 +1066,12 @@ This subsection summarizes the differences between ``ispc`` and C; if you
are already familiar with C, you may find it most effective to focus on
this subsection and just focus on the topics in the remainder of section
that introduce new language features. You may also find it helpful to
comapre the ``ispc`` and C++ implementations of various algorithms in the
compare the ``ispc`` and C++ implementations of various algorithms in the
``ispc`` ``examples/`` directory to get a sense of the close relationship
between ``ispc`` and C.
Specifically, C89 is used as the baseline for comparison in this subsection
(this is also the verion of C described in the Second Edition of Kernighan
(this is also the version of C described in the Second Edition of Kernighan
and Ritchie's book). (``ispc`` adopts some features from C99 and from C++,
which will be highlighted in the below.)
@@ -1099,7 +1099,7 @@ in C:
statement itself (e.g. ``for (int i = 0; ...``)
* The ``inline`` qualifier to indicate that a function should be inlined
* Function overloading by parameter type
* Hexidecimal floating-point constants
* Hexadecimal floating-point constants
``ispc`` also adds a number of new features that aren't in C89, C99, or
C++:
@@ -1158,11 +1158,11 @@ The following reserved words from C89 are also reserved in ``ispc``:
Lexical Structure
-----------------
Tokens in ``ispc`` are delimted by white-space and comments. The
Tokens in ``ispc`` are delimited by white-space and comments. The
white-space characters are the usual set of spaces, tabs, and carriage
returns/line feeds. Comments can be delinated with ``//``, which starts a
returns/line feeds. Comments can be delineated with ``//``, which starts a
comment that continues to the end of the line, or the start of a comment
can be delinated with ``/*`` and the end with ``*/``. Like C/C++,
can be delineated with ``/*`` and the end with ``*/``. Like C/C++,
comments can't be nested.
Identifiers in ``ispc`` are sequences of characters that start with an
@@ -1170,9 +1170,9 @@ underscore or an upper-case or lower-case letter, and then followed by
zero or more letters, numbers, or underscores. Identifiers that start with
two underscores are reserved for use by the compiler.
Integer numeric constants can be specified in base 10, hexidecimal, or
Integer numeric constants can be specified in base 10, hexadecimal, or
binary. (Octal integer constants aren't supported). Base 10 constants are
given by a sequence of one or more digits from 0 to 9. Hexidecimal
given by a sequence of one or more digits from 0 to 9. Hexadecimal
constants are denoted by a leading ``0x`` and then one or more digits from
0-9, a-f, or A-F. Finally, binary constants are denoted by a leading
``0b`` and then a sequence of 1s and 0s.
@@ -1194,11 +1194,11 @@ The second option is scientific notation, where a base value is specified
as the first form of a floating-point constant but is then followed by an
"e" or "E", then a plus sign or a minus sign, and then an exponent.
Finally, floating-point constants may be specified as hexidecimal
Finally, floating-point constants may be specified as hexadecimal
constants; this form can ensure a perfectly bit-accurate representation of
a particular floating-point number. These are specified with an "0x"
prefix, followed by a zero or a one, a period, and then the remainder of
the mantissa in hexidecimal form, with digits from 0-9, a-f, or A-F. The
the mantissa in hexadecimal form, with digits from 0-9, a-f, or A-F. The
start of the exponent is denoted by a "p", which is then followed by an
optional plus or minus sign and then digits from 0 to 9. For example:
@@ -1235,7 +1235,7 @@ to specify special characters. These sequences all start with an initial
* - ``\n``
- newline
* - ``\r``
- carriabe return
- carriage return
* - ``\t``
- horizontal tab
* - ``\v``
@@ -1243,7 +1243,7 @@ to specify special characters. These sequences all start with an initial
* - ``\`` followed by one or more digits from 0-8
- ASCII character in octal notation
* - ``\x``, followed by one or more digits from 0-9, a-f, A-F
- ASCII character in hexidecimal notation
- ASCII character in hexadecimal notation
``ispc`` doesn't support a string data type; string constants can be passed
as the first argument to the ``print()`` statement, however. ``ispc`` also
@@ -1398,7 +1398,7 @@ store are:
uniform float bar[10];
The first declaration corresponds to 10 gang-wide ``float`` values in
memory, while the second declaration corresonds to 10 ``float`` values.
memory, while the second declaration corresponds to 10 ``float`` values.
Defining New Names For Types
@@ -1562,7 +1562,7 @@ instance in the gang has its own unique pointer value)
(The rationale for this limitation is that references must be represented
as either a uniform pointer or a varying pointer internally. While
choosing a varying pointer would provide maximum flexibilty and eliminate
choosing a varying pointer would provide maximum flexibility and eliminate
this restriction, it would reduce performance in the common case where a
uniform pointer is all that's needed. As a work-around, a varying pointer
can be used in cases where a varying lvalue reference would be desired.)
@@ -1585,7 +1585,7 @@ and then a brace-delimited list of enumerators with optional values:
Each ``enum`` declaration defines a new type; an attempt to implicitly
convert between enumerations of different types gives a compile-time error,
but enuemrations of different types can be explicitly cast to one other.
but enumerations of different types can be explicitly cast to one other.
::
@@ -1595,7 +1595,7 @@ Enumerators are implicitly converted to integer types, however, so they can
be directly passed to routines that take integer parameters and can be used
in expressions including integers, for example. However, the integer
result of such an expression must be explicitly cast back to the enumerant
type if it to be assigned to a variable with the enuemrant type.
type if it to be assigned to a variable with the enumerant type.
::
@@ -1846,7 +1846,7 @@ Structures can also be initialized by providing element values in braces:
....
Color d = { 0.5, .75, 1.0 }; // r = 0.5, ...
Arrays of structures and arrays inside structures can be initialzed with
Arrays of structures and arrays inside structures can be initialized with
the expected syntax:
::
@@ -1880,7 +1880,7 @@ Structure member access and array indexing also work as in C.
return foo.f[4] - foo.i;
The address-of operator, pointer derefernce operator, and pointer member
The address-of operator, pointer dereference operator, and pointer member
operator also work as expected.
::
@@ -1925,7 +1925,7 @@ Basic Iteration Statements: "for", "while", and "do"
``ispc`` supports ``for``, ``while``, and ``do`` loops, with the same
specification as in C. Like C++, variables can be declared in the ``for``
statment itself:
statement itself:
::
@@ -2009,7 +2009,7 @@ nested inside a ``foreach`` loop.) ``continue`` statements are legal in
a program instances that executes a ``continue`` statement effectively
skips over the rest of the loop body for the current iteration.
As a specific example, consdier the following ``foreach`` statement:
As a specific example, consider the following ``foreach`` statement:
::
@@ -2107,7 +2107,7 @@ some computation on an array of data.
}
Here, we've written a loop that explicitly loops over the data in chunks of
``programCount`` elements. In each loop iteraton, the running program
``programCount`` elements. In each loop iteration, the running program
instances effectively collude amongst themselves using ``programIndex`` to
determine which elements to work on in a way that ensures that all of the
data elements will be processed. In this particular case, a ``foreach``
@@ -2313,7 +2313,7 @@ distributions.
If you are implementing your own task system, the remainder of this section
discusses the requirements for these calls. You will also likely want to
review the example task systems in ``examples/tasksys.cpp`` for reference.
If you are not implmenting your own task system, you can skip reading the
If you are not implementing your own task system, you can skip reading the
remainder of this section.
Here are the declarations of the three functions that must be provided to
@@ -2333,7 +2333,7 @@ implementation can efficiently wait for completion on just the tasks
launched from a single function.
The first time one of ``ISPCLaunch()`` or ``ISPCAlloc()`` is called in an
``ispc`` functon, the ``void *`` pointed to by the ``handlePtr`` parameter
``ispc`` function, the ``void *`` pointed to by the ``handlePtr`` parameter
will be ``NULL``. The implementations of these function should then
initialize ``*handlePtr`` to a unique handle value of some sort. (For
example, it might allocate a small structure to record which tasks were
@@ -2349,14 +2349,14 @@ than a pointer to it, as in the other functions.
The ``ISPCAlloc()`` function is used to allocate small blocks of memory to
store parameters passed to tasks. It should return a pointer to memory
with the given aize and alignment. Note that there is no explicit
with the given size and alignment. Note that there is no explicit
``ISPCFree()`` call; instead, all memory allocated within an ``ispc``
function should be freed when ``ISPCSync()`` is called.
``ISPCLaunch()`` is called to launch to launch one or more asynchronous
tasks. Each ``launch`` statement in ``ispc`` code causes a call to
``ISPCLaunch()`` to be emitted in the generated code. The three parameters
after the handle pointer to thie function are relatively straightforward;
after the handle pointer to the function are relatively straightforward;
the ``void *f`` parameter holds a pointer to a function to call to run the
work for this task, ``data`` holds a pointer to data to pass to this
function, and ``count`` is the number of instances of this function to
@@ -2371,7 +2371,7 @@ The signature of the provided function pointer ``f`` is
int taskIndex, int taskCount)
When this function pointer is called by one of the hardware threads managed
bythe task system, the ``data`` pointer passed to ``ISPCLaunch()`` should
by the task system, the ``data`` pointer passed to ``ISPCLaunch()`` should
be passed to it for its first parameter; ``threadCount`` gives the total
number of hardware threads that have been spawned to run tasks and
``threadIndex`` should be an integer index between zero and ``threadCount``
@@ -2690,7 +2690,7 @@ generates the following output on a four-wide compilation target:
When a varying variable is printed, the values for program instances that
aren't currently executing are printed inside double parenthesis,
indicating inactive program instances. The elements for inactive program
instances may have garabge values, though in some circumstances it can be
instances may have garbage values, though in some circumstances it can be
useful to see their values.
Assertions
@@ -2910,7 +2910,7 @@ If called when none of the program instances are running,
There are also a number of functions to compute "scan"s of values across
the program instances. For example, the ``exclusive_scan_and()`` function
computes, for each program instance, the sum of the given value over all of
the preceeding program instances. (The scans currently available in
the preceding program instances. (The scans currently available in
``ispc`` are all so-called "exclusive" scans, meaning that the value
computed for a given element does not include the value provided for that
element.) In C code, an exclusive add scan over an array might be
@@ -3206,7 +3206,7 @@ rather than one per program instance.
uniform int32 newval)
Be careful that you use the atomic function that you mean to; consider the
folloiwng code:
following code:
::
@@ -3563,7 +3563,7 @@ Restructuring Existing Programs to Use ISPC
``ispc`` is designed to enable you to incorporate
SPMD parallelism into existing code with minimal modification; features
like the ability to share memory and data structures betwen C/C++ and
like the ability to share memory and data structures between C/C++ and
``ispc`` code and the ability to directly call back and forth between
``ispc`` and C/C++ are motivated by this. These features also make it
easy to incrementally transform a program to use ``ispc``; the most

@@ -64,7 +64,7 @@ on each one:
Depending on the specifics of the computation being performed, the code
generated for this function could likely be improved by modifying the code
so that the loop only goes as far through the data as is possible to pack
an entire gang of program instances with computation each time thorugh the
an entire gang of program instances with computation each time through the
loop. Doing so enables the ``ispc`` compiler to generate more efficient
code for cases where it knows that the execution mask is "all on". Then,
an ``if`` statement at the end handles processing the ragged extra bits of
@@ -153,7 +153,7 @@ processed, and so forth.
Performance benefit can come from using ``foreach_tiled`` in that it
essentially optimizes for the benefit of iterating over *compact* regions
of the domian (while ``foreach`` iterates over the domain in a way that
of the domain (while ``foreach`` iterates over the domain in a way that
generally allows linear memory access.) There are two benefits from
processing compact regions of the domain.
@@ -215,7 +215,7 @@ Use "uniform" Whenever Appropriate
----------------------------------
For any variable that will always have the same value across all of the
program instances in a gang, declare the variable with the ``unfiorm``
program instances in a gang, declare the variable with the ``uniform``
qualifier. Doing so enables the ``ispc`` compiler to emit better code in
many different ways.
@@ -229,7 +229,7 @@ number of iterations:
If this is written with ``i`` as a ``varying`` variable, as above, there's
additional overhead in the code generated for the loop as the compiler
emits instructions to handle the possibilty of not all program instances
emits instructions to handle the possibility of not all program instances
following the same control flow path (as might be the case if the loop
limit, 10, was itself a ``varying`` value.)
@@ -568,7 +568,7 @@ mask of all lanes currently executing (assuming a four-wide gang size
target machine).
For a fuller example of the utility of this functionality, see
``examples/aobench_instrumented`` in the ``ispc`` distribution. Ths
``examples/aobench_instrumented`` in the ``ispc`` distribution. This
example includes an implementation of the ``ISPCInstrument()`` function
that collects aggregate data about the program's execution behavior.