Add "double-wide" sse2-x2 target.

i.e. run 8 program instances together, along the lines of the double-pumped
sse4-x2 target.
Matt Pharr
2011-10-11 15:17:31 -07:00
parent 1198520029
commit 286c23426e
14 changed files with 1543 additions and 806 deletions

@@ -3213,9 +3213,10 @@ instances. For other workloads, it may lead to a slowdown due to higher
register pressure; trying both approaches for key kernels may be
worthwhile.
-This option is currently only available for the SSE4 and AVX targets, and
-is selected with the ``--target=sse4-x2`` and ``--target=avx-x2`` options,
-respectively.
+This option is only available for each of the SSE2, SSE4 and AVX targets.
+It is selected with the ``--target=sse2-x2``, ``--target=sse4-x2`` and
+``--target=avx-x2`` options, respectively.
Compiling With Support For Multiple Instruction Sets
----------------------------------------------------