9 Commits

Author SHA1 Message Date
Dmitry Babokin
1415f5b0b1 Adding avx512knl-32x16 target to examples Makefiles 2016-02-25 14:16:31 +03:00
Ilia Filippov
4d05ec0e1e supporting VS2012 for all examples 2014-03-04 18:19:25 +04:00
Dmitry Babokin
9ef9f0bf32 Migrating VS solution files to VS2012 2014-02-28 20:01:34 +04:00
james.brodman
bdfcc615ea change if/for order. 2014-01-13 11:40:58 -05:00
james.brodman
34b412bdf8 Add docs and example 2014-01-10 13:55:32 -05:00
Dmitry Babokin
017e7890f7 Examples makefiles to support setting single target via ISPC_IA_TARGETS 2013-11-14 15:34:30 +04:00
Ilia Filippov
a910bfb539 Windows support 2013-11-05 16:31:01 +04:00
Matt Pharr
d7b0c5794e Add support for ARM NEON targets.
Initial support for ARM NEON on Cortex-A9 and A15 CPUs.  All but ~10 tests
pass, and all examples compile and run correctly.  Most of the examples
show a ~2x speedup on a single A15 core versus scalar code.

Current open issues/TODOs
- Code quality looks decent, but hasn't been carefully examined.  Known
  issues/opportunities for improvement include:
  - fp32 vector divide is done as a series of scalar divides rather than
    a vector divide (which I believe exists, but I may be mistaken.)
    This is particularly harmful to examples/rt, which only runs ~1.5x
    faster with ispc, likely due to long chains of scalar divides.
  - The compiler isn't generating a vmin.f32 for e.g. the final scalar
    min in reduce_min(); instead it's generating a compare and then a
    select instruction (and similarly elsewhere).
  - There are some additional FIXMEs in builtins/target-neon.ll that
    include both a few pieces of missing functionality (e.g. rounding
    doubles) as well as places that deserve attention for possible
    code quality improvements.

- Currently only the "cortex-a9" and "cortex-a15" CPU targets are
  supported; LLVM supports many other ARM CPUs and ispc should provide
  access to all of the ones that have NEON support (and aren't too
  obscure.)

- ~5 of the reduce-* tests hit an assertion inside LLVM (unfortunately
  only when the compiler runs on an ARM host, though).

- The Windows build hasn't been tested (though I've tried to update
  ispc.vcxproj appropriately).  It may just work, but will more likely
  have various small issues.

- Anything related to 64-bit ARM has seen no attention.
2013-07-19 23:07:24 -07:00
Matt Pharr
fe2d9aa600 Add perfbench to examples: a few small microbenchmarks. 2012-02-10 12:27:13 -08:00