Implement new naming scheme for --target.

Now targets are named like "<isa>-i<mask size>x<gang size>", e.g.
"sse4-i8x16", or "avx2-i32x16".

The old target names are still supported.
Matt Pharr
2013-08-08 19:23:44 -07:00
parent 1d76f74b16
commit 0c5742b6f8
7 changed files with 163 additions and 100 deletions

@@ -489,54 +489,72 @@ on which you're running ``ispc`` is used to determine the target CPU.
ispc foo.ispc -o foo.obj --cpu=corei7-avx
Finally, ``--target`` selects the target instruction set. The following
targets are currently supported:
Finally, ``--target`` selects the target instruction set. The target
string is of the form ``[ISA]-i[mask size]x[gang size]``. For example,
``--target=avx2-i32x16`` specifies a target with the AVX2 instruction set,
a mask size of 32 bits, and a gang size of 16.
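As an illustration of the ``[ISA]-i[mask size]x[gang size]`` form, the sketch below splits a new-style target string into its three parts. It is not taken from the ``ispc`` sources: the ``Target`` type, the ``parse_target`` helper, and the regular expression are hypothetical, and the helper checks only the shape of the string, not whether ``ispc`` supports that combination.

```python
import re
from typing import NamedTuple


class Target(NamedTuple):
    """Hypothetical container for the three parts of a new-style target name."""
    isa: str
    mask_bits: int
    gang_size: int


# Assumed shape: <isa>-i<mask size>x<gang size>, e.g. "avx2-i32x16".
_TARGET_RE = re.compile(r"^(?P<isa>[a-z0-9.]+)-i(?P<mask>\d+)x(?P<gang>\d+)$")


def parse_target(name: str) -> Target:
    """Split a new-style target string; raise ValueError if it does not match."""
    match = _TARGET_RE.match(name)
    if match is None:
        raise ValueError(f"not a new-style target name: {name!r}")
    return Target(match["isa"], int(match["mask"]), int(match["gang"]))
```

Old-style names such as ``sse4-x2`` do not match this pattern, so the helper rejects them even though ``ispc`` itself still accepts them.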
=========== ========= =======================================
Target Gang Size Description
----------- --------- ---------------------------------------
avx 8 AVX (2010-2011 era Intel CPUs)
avx-x2 16 "Double-pumped" AVX target, running
twice as many program instances as the
native vector width.
avx1.1 8 AVX 1.1 target (2012 era "Ivybridge"
Intel CPUs).
avx1.1-x2 16 Double-pumped AVX 1.1 target.
avx2 8 AVX 2 target (2013- Intel "Haswell"
CPUs.)
avx2-x2 16 Double-pumped AVX 2 target.
neon-8 16 ARM NEON target, targeting computation
on 8-bit data types.
neon-16 8 ARM NEON target, targeting computation
on 16-bit data types.
neon-32 4 ARM NEON target, targeting computation
on 32-bit data types.
sse2 4 SSE2 (early 2000s era x86 CPUs).
sse2-x2 8 Double-pumped SSE2.
sse4 4 SSE4 (generally 2008-2010 Intel CPUs).
sse4-x2 8 Double-pumped SSE4.
sse4-8 16 SSE4 target targeting computation on
8-bit data types.
sse4-16 8 SSE4 target targeting computation on
16-bit data types.
=========== ========= =======================================
The following target ISAs are supported:
============ ==========================================
Target Description
------------ ------------------------------------------
avx, avx1 AVX (2010-2011 era Intel CPUs)
avx1.1 AVX 1.1 (2012 era "Ivybridge" Intel CPUs)
avx2 AVX 2 target (2013- Intel "Haswell" CPUs)
neon ARM NEON
sse2 SSE2 (early 2000s era x86 CPUs)
sse4 SSE4 (generally 2008-2010 Intel CPUs)
============ ==========================================
Consult your CPU's manual for specifics on which vector instruction set it
supports.
The mask size may be 8, 16, or 32 bits, though not all combinations of ISAs
and mask sizes are supported. As a general rule, performance is best when
the mask size equals the size of the most common datatype in your programs.
For example, if most of your computation is on 32-bit floating-point values,
an ``i32`` target is appropriate. However, if you're mostly doing
computation on 8-bit images, ``i8`` is a better choice.
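The rule of thumb for mask size can be sketched as a small helper that picks the most frequent element width from a list. This is an illustration only: ``suggest_mask_prefix`` and ``SUPPORTED_MASK_BITS`` are hypothetical names, the input list is something you would compile by hand for your own program, and the result still has to be checked against the targets your ``ispc`` binary actually supports.

```python
from collections import Counter

# Mask sizes named in the documentation text; not every ISA supports all of them.
SUPPORTED_MASK_BITS = (8, 16, 32)


def suggest_mask_prefix(element_bits):
    """Return "i8", "i16", or "i32" for the most common element width.

    element_bits is an iterable of data widths in bits, one entry per
    frequently used variable or operation (a hand-made estimate).
    """
    counts = Counter(element_bits)
    if not counts:
        raise ValueError("need at least one element width")
    width, _ = counts.most_common(1)[0]
    if width not in SUPPORTED_MASK_BITS:
        raise ValueError(f"no mask size matches {width}-bit data")
    return f"i{width}"
```

For a program dominated by 32-bit floats, a call like ``suggest_mask_prefix([32, 32, 32, 8])`` selects the ``i32`` family of targets.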
See `Basic Concepts: Program Instances and Gangs of Program Instances`_ for
more discussion of the "gang size" and its implications for program
execution.
instruction sets. (As general context, SSE2 was first introduced in
processors that shipped in 2001, SSE4 was introduced in 2007, and
processors with AVX were introduced in 2010, and AVX2 arrived in 2013.
Consult your CPU's
manual for specifics on which vector instruction set it supports.)
Running ``ispc --help`` and looking at the output for the ``--target``
option gives the most up-to-date documentation about which targets your
compiler binary supports.
The naming scheme for compilation targets changed in August 2013; the
following table shows the relationship between names in the old scheme and
in the new scheme:
============= ===========
Target Former Name
------------- -----------
avx1-i32x8 avx, avx1
avx1-i32x16 avx-x2
avx1.1-i32x8 avx1.1
avx1.1-i32x16 avx1.1-x2
avx2-i32x8 avx2
avx2-i32x16 avx2-x2
neon-8 n/a
neon-16 n/a
neon-32 n/a
sse2-i32x4 sse2
sse2-i32x8 sse2-x2
sse4-i32x4 sse4
sse4-i32x8 sse4-x2
sse4-i8x16 n/a
sse4-i16x8 n/a
============= ===========
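For build scripts that still pass old names, the table above can be written as a lookup. The sketch below is a hypothetical helper, not part of ``ispc``; its mapping is copied from the table, and it leaves unchanged any name that has no former equivalent.

```python
# Former target name -> new target name, transcribed from the table above.
LEGACY_TO_NEW = {
    "avx": "avx1-i32x8",
    "avx1": "avx1-i32x8",
    "avx-x2": "avx1-i32x16",
    "avx1.1": "avx1.1-i32x8",
    "avx1.1-x2": "avx1.1-i32x16",
    "avx2": "avx2-i32x8",
    "avx2-x2": "avx2-i32x16",
    "sse2": "sse2-i32x4",
    "sse2-x2": "sse2-i32x8",
    "sse4": "sse4-i32x4",
    "sse4-x2": "sse4-i32x8",
}


def modernize_target(name: str) -> str:
    """Return the new-style name for an old one; pass other names through."""
    return LEGACY_TO_NEW.get(name, name)
```

Because the old names are still supported, a conversion like this is optional and mainly useful for keeping scripts consistent with the new scheme.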
By default, the target instruction set is chosen based on the most capable
one supported by the system on which you're running ``ispc``. You can
override this choice with the ``--target`` flag; for example, to select
Intel® SSE2, use ``--target=sse2``. (As with the other options in this
section, see the output of ``ispc --help`` for a full list of supported
targets.)
Intel® SSE2 with a 32-bit mask and 4 program instances in a gang, use
``--target=sse2-i32x4``. (As with the other options in this section, see
the output of ``ispc --help`` for a full list of supported targets.)
Generating Generic C++ Output
-----------------------------