Documentation updates for NEON

Matt Pharr
2013-07-31 20:06:04 -07:00
parent d9c38b5c1f
commit 4f48d3258a
3 changed files with 56 additions and 17 deletions

@@ -1,14 +1,16 @@
#!/bin/bash #!/bin/bash
rst2html=rst2html.py
for i in ispc perfguide faq; do for i in ispc perfguide faq; do
rst2html --template=template.txt --link-stylesheet \ $rst2html --template=template.txt --link-stylesheet \
--stylesheet-path=css/style.css $i.rst > $i.html --stylesheet-path=css/style.css $i.rst > $i.html
done done
rst2html --template=template-news.txt --link-stylesheet \ $rst2html --template=template-news.txt --link-stylesheet \
--stylesheet-path=css/style.css news.rst > news.html --stylesheet-path=css/style.css news.rst > news.html
rst2html --template=template-perf.txt --link-stylesheet \ $rst2html --template=template-perf.txt --link-stylesheet \
--stylesheet-path=css/style.css perf.rst > perf.html --stylesheet-path=css/style.css perf.rst > perf.html
#rst2latex --section-numbering --documentclass=article --documentoptions=DIV=9,10pt,letterpaper ispc.txt > ispc.tex #rst2latex --section-numbering --documentclass=article --documentoptions=DIV=9,10pt,letterpaper ispc.txt > ispc.tex

@@ -467,31 +467,68 @@ There are three options that affect the compilation target: ``--arch``,
which sets the target architecture, ``--cpu``, which sets the target CPU, which sets the target architecture, ``--cpu``, which sets the target CPU,
and ``--target``, which sets the target instruction set. and ``--target``, which sets the target instruction set.
By default, the ``ispc`` compiler generates code for the 64-bit x86-64 If none of these options is specified, ``ispc`` generates code for the
architecture (i.e. ``--arch=x86-64``.) To compile to a 32-bit x86 target, architecture of the system the compiler is running on (i.e. 64-bit x86-64
supply ``--arch=x86`` on the command line: (``--arch=x86-64``) on x86 systems and ARM NEON on ARM systems).
To compile to a 32-bit x86 target, for example, supply ``--arch=x86`` on
the command line:
:: ::
ispc foo.ispc -o foo.obj --arch=x86 ispc foo.ispc -o foo.obj --arch=x86
No other architectures are currently supported. Currently supported architectures are ``x86-64``, ``x86``, and ``arm``.
The target CPU determines both the default instruction set used as well as The target CPU determines both the default instruction set used as well as
which CPU architecture the code is tuned for. ``ispc --help`` provides a which CPU architecture the code is tuned for. ``ispc --help`` provides a
list of a number of the supported CPUs. By default, the CPU type of the list of all of the supported CPUs. By default, the CPU type of the system
system on which you're running ``ispc`` is used to determine the target on which you're running ``ispc`` is used to determine the target CPU.
CPU.
:: ::
ispc foo.ispc -o foo.obj --cpu=corei7-avx ispc foo.ispc -o foo.obj --cpu=corei7-avx
Finally, ``--target`` selects between the SSE2, SSE4, and AVX, and AVX2 Finally, ``--target`` selects the target instruction set. The following
targets are currently supported:
=========== ========= =======================================
Target      Gang Size Description
----------- --------- ---------------------------------------
avx         8         AVX (2010-2011 era Intel CPUs)
avx-x2      16        "Double-pumped" AVX target, running
                      twice as many program instances as the
                      native vector width.
avx1.1      8         AVX 1.1 target (2012 era "Ivybridge"
                      Intel CPUs).
avx1.1-x2   16        Double-pumped AVX 1.1 target.
avx2        8         AVX 2 target (2013- Intel "Haswell"
                      CPUs.)
avx2-x2     16        Double-pumped AVX 2 target.
neon-8      16        ARM NEON target, targeting computation
                      on 8-bit data types.
neon-16     8         ARM NEON target, targeting computation
                      on 16-bit data types.
neon-32     4         ARM NEON target, targeting computation
                      on 32-bit data types.
sse2        4         SSE2 (early 2000s era x86 CPUs).
sse2-x2     8         Double-pumped SSE2.
sse4        4         SSE4 (generally 2008-2010 Intel CPUs).
sse4-x2     8         Double-pumped SSE4.
sse4-8      16        SSE4 target targeting computation on
                      8-bit data types.
sse4-16     8         SSE4 target targeting computation on
                      16-bit data types.
=========== ========= =======================================
See `Basic Concepts: Program Instances and Gangs of Program Instances`_ for
more discussion of the "gang size" and its implications for program
execution.
instruction sets. (As general context, SSE2 was first introduced in instruction sets. (As general context, SSE2 was first introduced in
processors that shipped in 2001, SSE4 was introduced in 2007, and processors that shipped in 2001, SSE4 was introduced in 2007, and
processors with AVX were introduced in 2010. AVX2 will be supported on processors with AVX were introduced in 2010, and AVX2 arrived in 2013.
future CPUs based on Intel's "Haswell" architecture. Consult your CPU's Consult your CPU's
manual for specifics on which vector instruction set it supports.) manual for specifics on which vector instruction set it supports.)
By default, the target instruction set is chosen based on the most capable By default, the target instruction set is chosen based on the most capable
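
The gang sizes in the target table of this hunk appear to follow a simple rule: the number of elements of the targeted width that fit in one vector register, doubled for the "double-pumped" ``-x2`` variants. The sketch below is one reading of the table, not code from ``ispc``; the helper name ``GangSize`` is hypothetical, and the register widths (128 bits for SSE and NEON, 256 bits for AVX) are assumptions brought in from the hardware, not stated in the diff.

```cpp
// Hypothetical helper, not part of ispc: reproduces the "Gang Size" column
// of the documented target table from three inputs.
//   vectorBits  - native register width (assumed: 128 for SSE/NEON, 256 for AVX)
//   elementBits - data width the target is tuned for (8, 16, or 32)
//   pumpFactor  - 2 for the "double-pumped" -x2 targets, otherwise 1
constexpr int GangSize(int vectorBits, int elementBits, int pumpFactor = 1) {
    return (vectorBits / elementBits) * pumpFactor;
}
```

For instance, ``GangSize(128, 8)`` gives the 16 listed for ``neon-8`` and ``sse4-8``, and ``GangSize(256, 32, 2)`` gives the 16 listed for ``avx-x2``.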
@@ -505,7 +542,7 @@ Generating Generic C++ Output
----------------------------- -----------------------------
In addition to generating object files or assembly output for specific In addition to generating object files or assembly output for specific
targets like SSE2, SSE4, and AVX, ``ispc`` provides an option to generate targets like NEON, SSE2, SSE4, and AVX, ``ispc`` provides an option to generate
"generic" C++ output. This "generic" C++ output. This
As an example, consider the following simple ``ispc`` program: As an example, consider the following simple ``ispc`` program:
@@ -659,7 +696,7 @@ preprocessor runs:
* - ISPC * - ISPC
- 1 - 1
- Detecting that the ``ispc`` compiler is processing the file - Detecting that the ``ispc`` compiler is processing the file
* - ISPC_TARGET_{SSE2,SSE4,AVX,AVX2} * - ISPC_TARGET_{NEON_8,NEON_16,NEON_32,SSE2,SSE4,AVX,AVX11,AVX2,GENERIC}
- 1 - 1
- One of these will be set, depending on the compilation target. - One of these will be set, depending on the compilation target.
* - ISPC_POINTER_SIZE * - ISPC_POINTER_SIZE
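
As an illustration of how the expanded ``ISPC_TARGET_*`` list might be used, the sketch below groups the macros named in this hunk into target families with ordinary preprocessor conditionals. The ``EXAMPLE_TARGET_FAMILY`` macro and the ``kExampleTargetFamily`` constant are hypothetical names chosen for this example. The conditional part is meant to carry over to an ``.ispc`` source file; the final constant declaration is written as plain C++ so the fragment also compiles outside ``ispc``, where none of the macros are set and the fallback branch is taken.

```cpp
// Exactly one ISPC_TARGET_* macro is documented as being set per compilation.
// EXAMPLE_TARGET_FAMILY is a hypothetical name used only in this sketch.
#if defined(ISPC_TARGET_NEON_8) || defined(ISPC_TARGET_NEON_16) || \
    defined(ISPC_TARGET_NEON_32)
#define EXAMPLE_TARGET_FAMILY 1  // ARM NEON targets
#elif defined(ISPC_TARGET_SSE2) || defined(ISPC_TARGET_SSE4)
#define EXAMPLE_TARGET_FAMILY 2  // SSE targets
#elif defined(ISPC_TARGET_AVX) || defined(ISPC_TARGET_AVX11) || \
    defined(ISPC_TARGET_AVX2)
#define EXAMPLE_TARGET_FAMILY 3  // AVX targets
#elif defined(ISPC_TARGET_GENERIC)
#define EXAMPLE_TARGET_FAMILY 4  // "generic" C++ output
#else
#define EXAMPLE_TARGET_FAMILY 0  // none set, e.g. not compiled by ispc
#endif

// Plain C++ declaration so the value can be inspected at run time.
static const int kExampleTargetFamily = EXAMPLE_TARGET_FAMILY;
```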

@@ -558,8 +558,8 @@ Target::SupportedTargetArchs() {
const char * const char *
Target::SupportedTargetISAs() { Target::SupportedTargetISAs() {
return "neon, sse2, sse2-x2, sse4, sse4-8, sse4-16, sse4-x2, " return "neon-8, neon-16, neon-32, sse2, sse2-x2, sse4, sse4-8, sse4-16, sse4-x2, "
"avx, avx-x2, avx1.1, avx1.1-x2, avx2, avx2-x2," "avx, avx-x2, avx1.1, avx1.1-x2, avx2, avx2-x2, "
"generic-1, generic-4, generic-8, generic-16, generic-32"; "generic-1, generic-4, generic-8, generic-16, generic-32";
} }
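
The separator fix in this hunk (``"avx2-x2,"`` becoming ``"avx2-x2, "``) matters to anything that splits the list on ``", "``. The sketch below shows this with a copy of the new string and a hypothetical helper, ``SplitTargetList``, which is not part of ``ispc``; whether any real caller splits the string this way is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Copy of the string returned by Target::SupportedTargetISAs() after this change.
static const char *kSupportedTargetISAs =
    "neon-8, neon-16, neon-32, sse2, sse2-x2, sse4, sse4-8, sse4-16, sse4-x2, "
    "avx, avx-x2, avx1.1, avx1.1-x2, avx2, avx2-x2, "
    "generic-1, generic-4, generic-8, generic-16, generic-32";

// Hypothetical helper (not part of ispc): split a list on the ", " separator.
std::vector<std::string> SplitTargetList(const std::string &list) {
    std::vector<std::string> names;
    const std::string sep = ", ";
    std::size_t start = 0;
    while (true) {
        std::size_t end = list.find(sep, start);
        if (end == std::string::npos) {
            // Last entry: everything after the final separator.
            names.push_back(list.substr(start));
            break;
        }
        names.push_back(list.substr(start, end - start));
        start = end + sep.size();
    }
    return names;
}
```

With the previous string, the same split would have produced a single fused entry ``avx2-x2,generic-1`` because the space after the comma was missing.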