Documentation updates for NEON
@@ -1,14 +1,16 @@
|
|||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
|
|
||||||
|
rst2html=rst2html.py
|
||||||
|
|
||||||
for i in ispc perfguide faq; do
|
for i in ispc perfguide faq; do
|
||||||
rst2html --template=template.txt --link-stylesheet \
|
$rst2html --template=template.txt --link-stylesheet \
|
||||||
--stylesheet-path=css/style.css $i.rst > $i.html
|
--stylesheet-path=css/style.css $i.rst > $i.html
|
||||||
done
|
done
|
||||||
|
|
||||||
rst2html --template=template-news.txt --link-stylesheet \
|
$rst2html --template=template-news.txt --link-stylesheet \
|
||||||
--stylesheet-path=css/style.css news.rst > news.html
|
--stylesheet-path=css/style.css news.rst > news.html
|
||||||
|
|
||||||
rst2html --template=template-perf.txt --link-stylesheet \
|
$rst2html --template=template-perf.txt --link-stylesheet \
|
||||||
--stylesheet-path=css/style.css perf.rst > perf.html
|
--stylesheet-path=css/style.css perf.rst > perf.html
|
||||||
|
|
||||||
#rst2latex --section-numbering --documentclass=article --documentoptions=DIV=9,10pt,letterpaper ispc.txt > ispc.tex
|
#rst2latex --section-numbering --documentclass=article --documentoptions=DIV=9,10pt,letterpaper ispc.txt > ispc.tex
|
||||||
|
|||||||
@@ -467,31 +467,68 @@ There are three options that affect the compilation target: ``--arch``,
|
|||||||
which sets the target architecture, ``--cpu``, which sets the target CPU,
|
which sets the target architecture, ``--cpu``, which sets the target CPU,
|
||||||
and ``--target``, which sets the target instruction set.
|
and ``--target``, which sets the target instruction set.
|
||||||
|
|
||||||
By default, the ``ispc`` compiler generates code for the 64-bit x86-64
|
If none of these options is specified, ``ispc`` generates code for the
|
||||||
architecture (i.e. ``--arch=x86-64``.) To compile to a 32-bit x86 target,
|
architecture of the system the compiler is running on (i.e. 64-bit x86-64
|
||||||
supply ``--arch=x86`` on the command line:
|
(``--arch=x86-64``) on x86 systems and ARM NEON on ARM systems).
|
||||||
|
|
||||||
|
To compile to a 32-bit x86 target, for example, supply ``--arch=x86`` on
|
||||||
|
the command line:
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
ispc foo.ispc -o foo.obj --arch=x86
|
ispc foo.ispc -o foo.obj --arch=x86
|
||||||
|
|
||||||
No other architectures are currently supported.
|
Currently-supported architectures are ``x86-64``, ``x86``, and ``arm``.
|
||||||
|
|
||||||
The target CPU determines both the default instruction set used as well as
|
The target CPU determines both the default instruction set used as well as
|
||||||
which CPU architecture the code is tuned for. ``ispc --help`` provides a
|
which CPU architecture the code is tuned for. ``ispc --help`` provides a
|
||||||
list of a number of the supported CPUs. By default, the CPU type of the
|
list of all of the supported CPUs. By default, the CPU type of the system
|
||||||
system on which you're running ``ispc`` is used to determine the target
|
on which you're running ``ispc`` is used to determine the target CPU.
|
||||||
CPU.
|
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
ispc foo.ispc -o foo.obj --cpu=corei7-avx
|
ispc foo.ispc -o foo.obj --cpu=corei7-avx
|
||||||
|
|
||||||
Finally, ``--target`` selects between the SSE2, SSE4, and AVX, and AVX2
|
Finally, ``--target`` selects the target instruction set. The following
|
||||||
|
targets are currently supported:
|
||||||
|
|
||||||
|
=========== ========= =======================================
Target      Gang Size Description
----------- --------- ---------------------------------------
avx         8         AVX (2010-2011 era Intel CPUs)
avx-x2      16        "Double-pumped" AVX target, running
                      twice as many program instances as the
                      native vector width.
avx1.1      8         AVX 1.1 target (2012 era "Ivybridge"
                      Intel CPUs).
avx1.1-x2   16        Double-pumped AVX 1.1 target.
avx2        8         AVX 2 target (2013- Intel "Haswell"
                      CPUs.)
avx2-x2     16        Double-pumped AVX 2 target.
neon-8      16        ARM NEON target, targeting computation
                      on 8-bit data types.
neon-16     8         ARM NEON target, targeting computation
                      on 16-bit data types.
neon-32     4         ARM NEON target, targeting computation
                      on 32-bit data types.
sse2        4         SSE2 (early 2000s era x86 CPUs).
sse2-x2     8         Double-pumped SSE2.
sse4        4         SSE4 (generally 2008-2010 Intel CPUs).
sse4-x2     8         Double-pumped SSE4.
sse4-8      16        SSE4 target targeting computation on
                      8-bit data types.
sse4-16     8         SSE4 target targeting computation on
                      16-bit data types.
=========== ========= =======================================
|
||||||
|
|
||||||
|
See `Basic Concepts: Program Instances and Gangs of Program Instances`_ for
|
||||||
|
more discussion of the "gang size" and its implications for program
|
||||||
|
execution.
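The gang sizes listed for each target follow from the vector register width divided by the element size the target is tuned for. The sketch below is only an illustration: it assumes 128-bit registers for SSE and NEON and 256-bit registers for AVX (the standard widths for those instruction sets, not something this commit states), and ``gangSize`` is a hypothetical helper that is not part of ``ispc``.

```cpp
#include <cassert>

// Hypothetical helper, not part of ispc: the number of program instances is
// the register width divided by the element width, doubled for the
// "double-pumped" (-x2) variants.
constexpr int gangSize(int registerBits, int elementBits,
                       bool doublePumped = false) {
    return (registerBits / elementBits) * (doublePumped ? 2 : 1);
}

// Assumed register widths: 128 bits for SSE and NEON, 256 bits for AVX.
static_assert(gangSize(128, 8) == 16, "neon-8 and sse4-8");
static_assert(gangSize(128, 16) == 8, "neon-16 and sse4-16");
static_assert(gangSize(128, 32) == 4, "neon-32, sse2 and sse4");
static_assert(gangSize(256, 32) == 8, "avx, avx1.1 and avx2");
static_assert(gangSize(256, 32, true) == 16, "avx-x2 and friends");
```

Under these assumptions every row of the target list is reproduced, which is why the 8-bit and 16-bit targets have larger gangs than their 32-bit counterparts.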
|
||||||
|
|
||||||
instruction sets. (As general context, SSE2 was first introduced in
|
(As general context, SSE2 was first introduced in
|
||||||
processors that shipped in 2001, SSE4 was introduced in 2007, and
|
processors that shipped in 2001, SSE4 was introduced in 2007, and
|
||||||
processors with AVX were introduced in 2010. AVX2 will be supported on
|
processors with AVX were introduced in 2010, and AVX2 arrived in 2013.
|
||||||
future CPUs based on Intel's "Haswell" architecture. Consult your CPU's
|
Consult your CPU's
|
||||||
manual for specifics on which vector instruction set it supports.)
|
manual for specifics on which vector instruction set it supports.)
|
||||||
|
|
||||||
By default, the target instruction set is chosen based on the most capable
|
By default, the target instruction set is chosen based on the most capable
|
||||||
@@ -505,7 +542,7 @@ Generating Generic C++ Output
|
|||||||
-----------------------------
|
-----------------------------
|
||||||
|
|
||||||
In addition to generating object files or assembly output for specific
|
In addition to generating object files or assembly output for specific
|
||||||
targets like SSE2, SSE4, and AVX, ``ispc`` provides an option to generate
|
targets like NEON, SSE2, SSE4, and AVX, ``ispc`` provides an option to generate
|
||||||
"generic" C++ output. This
|
"generic" C++ output. This
|
||||||
|
|
||||||
As an example, consider the following simple ``ispc`` program:
|
As an example, consider the following simple ``ispc`` program:
|
||||||
@@ -659,7 +696,7 @@ preprocessor runs:
|
|||||||
* - ISPC
|
* - ISPC
|
||||||
- 1
|
- 1
|
||||||
- Detecting that the ``ispc`` compiler is processing the file
|
- Detecting that the ``ispc`` compiler is processing the file
|
||||||
* - ISPC_TARGET_{SSE2,SSE4,AVX,AVX2}
|
* - ISPC_TARGET_{NEON_8,NEON_16,NEON_32,SSE2,SSE4,AVX,AVX11,AVX2,GENERIC}
|
||||||
- 1
|
- 1
|
||||||
- One of these will be set, depending on the compilation target.
|
- One of these will be set, depending on the compilation target.
|
||||||
* - ISPC_POINTER_SIZE
|
* - ISPC_POINTER_SIZE
|
||||||
|
|||||||
4
ispc.cpp
@@ -558,8 +558,8 @@ Target::SupportedTargetArchs() {
|
|||||||
|
|
||||||
const char *
|
const char *
|
||||||
Target::SupportedTargetISAs() {
|
Target::SupportedTargetISAs() {
|
||||||
return "neon, sse2, sse2-x2, sse4, sse4-8, sse4-16, sse4-x2, "
|
return "neon-8, neon-16, neon-32, sse2, sse2-x2, sse4, sse4-8, sse4-16, sse4-x2, "
|
||||||
"avx, avx-x2, avx1.1, avx1.1-x2, avx2, avx2-x2,"
|
"avx, avx-x2, avx1.1, avx1.1-x2, avx2, avx2-x2, "
|
||||||
"generic-1, generic-4, generic-8, generic-16, generic-32";
|
"generic-1, generic-4, generic-8, generic-16, generic-32";
|
||||||
}
|
}
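The trailing space added to the second literal matters because adjacent C++ string literals are concatenated with nothing inserted between them. A minimal sketch of the effect follows; the names ``kOldISAs``, ``kNewISAs`` and ``uniformlySeparated`` are hypothetical, and the literals are abbreviated from the two sides of this hunk.

```cpp
#include <cassert>
#include <string>

// Hypothetical names; the literal text is abbreviated from the diff.
const std::string kOldISAs = "avx2, avx2-x2,"    // no trailing space
                             "generic-1, generic-4";
const std::string kNewISAs = "avx2, avx2-x2, "   // trailing space added
                             "generic-1, generic-4";

// Returns true if every comma in the list is followed by a space.
bool uniformlySeparated(const std::string &list) {
    for (std::string::size_type i = 0; i < list.size(); ++i) {
        if (list[i] == ',' && (i + 1 >= list.size() || list[i + 1] != ' '))
            return false;
    }
    return true;
}
```

With the old literals the joined string contains ``avx2-x2,generic-1``, so the list is not uniformly separated; with the new ones it is.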
|
||||||
|
|
||||||
|
|||||||