6dbb15027af1d347bd0b812eed871d766f475a0f
When loading from an address that's computed by adding two registers
together, x86 can scale one of them by 2, 4, or 8 for free as part
of the addressing calculation. This change makes the code generated
for gather and scatter take advantage of that scaling.
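
As a rough illustration of the address form in question, here is a
minimal C++ sketch. The helper name loadScaled is hypothetical and is
not part of ispc; it only models a load from base + index * scale, the
shape an x86 load can typically encode directly when scale is 2, 4,
or 8.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from ispc): load one float from
// base + index * scale, where scale is a byte multiplier.
// With scale equal to 2, 4, or 8, an x86 compiler can usually fold
// the multiply into the load's addressing mode instead of emitting
// a separate multiply instruction.
inline float loadScaled(const unsigned char *base, std::int64_t index,
                        int scale) {
    float value;
    std::memcpy(&value, base + index * scale, sizeof value);
    return value;
}
```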
For the cases where gather/scatter is based on a base pointer and
an integer offset vector, the GSImprovementsPass looks to see if the
integer offsets are being computed as 2/4/8 times some other value.
If so, it extracts the 2x/4x/8x part and leaves the rest in place as
the offsets. The {gather,scatter}_base_offsets_* functions take the
extracted factor as an i32 scale argument and carefully generate IR
so that it hits LLVM's pattern matching for these scales.
This is a particular win on AVX, since it saves us two 4-wide integer
multiplies.
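
The extraction step can be modeled on plain integers as in the sketch
below. This is a toy stand-in under stated assumptions, not the
GSImprovementsPass itself: the real pass works on LLVM IR, and the
names ScaledMultiplier, extractScale, and gatherBytes are hypothetical.
Preferring the largest factor is also an assumption made here for
simplicity.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <initializer_list>

// Hypothetical result type: multiplier == scale * rest,
// with scale in {1, 2, 4, 8}.
struct ScaledMultiplier {
    int scale;
    std::int32_t rest;
};

// Toy model of the extraction: pull the largest factor of 8, 4, or 2
// out of a constant multiplier and leave the remainder behind.
inline ScaledMultiplier extractScale(std::int32_t multiplier) {
    for (int s : {8, 4, 2}) {
        if (multiplier % s == 0)
            return {s, multiplier / s};
    }
    return {1, multiplier};
}

// Scalar model of a gather from a base pointer plus integer offsets:
// lane i loads a float from base + scale * offsets[i].
template <std::size_t N>
std::array<float, N> gatherBytes(const unsigned char *base,
                                 const std::array<std::int32_t, N> &offsets,
                                 int scale) {
    std::array<float, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        std::memcpy(&out[i],
                    base + static_cast<std::int64_t>(scale) * offsets[i],
                    sizeof(float));
    }
    return out;
}
```

In this model, a gather whose byte offsets are 4 * index gives the same
result whether the offsets are multiplied up front and passed with
scale 1, or passed as the raw indices with scale 4. Only the second
form leaves the multiply to the addressing calculation.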
Noise runs 14% faster with this.
Issue #132.
==============================
Intel(r) SPMD Program Compiler
==============================

Welcome to the Intel(r) SPMD Program Compiler (ispc)!

ispc is a new compiler for "single program, multiple data" (SPMD)
programs. Under the SPMD model, the programmer writes a program that
mostly appears to be a regular serial program, though the execution
model is actually that a number of program instances execute in
parallel on the hardware. ispc compiles a C-based SPMD programming
language to run on the SIMD units of CPUs; it frequently provides a 3x
or more speedup on CPUs with 4-wide SSE units, without any of the
difficulty of writing intrinsics code.

ispc is an open source compiler under the BSD license; see the file
LICENSE.txt. ispc supports Windows, Mac, and Linux, with both x86 and
x86-64 targets. It currently supports the SSE2, SSE4, and AVX
instruction sets.

For more information and examples, as well as a wiki and the bug
database, see the ispc distribution site, http://ispc.github.com.

Languages: C++ 63.5%, LLVM 19.1%, M4 11.6%, Python 4.5%, Makefile 0.5%, Other 0.6%