Commit Graph

130 Commits

Author SHA1 Message Date
Andrey Guskov
ae8b724d92 Added LLVM 3.7 support 2015-01-19 17:30:59 +03:00
Dmitry Babokin
9c0b4cb8b3 Bumping version to 1.8.2dev 2015-01-12 14:49:48 +03:00
Dmitry Babokin
a095fd622f Bumping version to 1.8.1 2014-12-31 14:20:38 +03:00
Dmitry Babokin
aeaef1bedf Fixing problem with autodispatch when compiled with NVPTX target 2014-12-31 12:34:12 +03:00
Dmitry Babokin
ffa4b9e65c Bumping version to 1.8.1dev 2014-10-17 03:25:14 +04:00
Dmitry Babokin
156f4e0fd8 Bumping version to 1.8.0 2014-10-16 23:34:38 +04:00
evghenii
9238c72e08 Merge branch 'master' into nvptx_clean_master 2014-10-14 14:27:00 +02:00
Vsevolod Livinskiy
0a6eb61ad0 Extend gather-scatter optimization with prefetch optimization 2014-10-02 15:21:43 +04:00
evghenii
8745888ce9 merged with master 2014-08-11 10:04:54 +02:00
Anton Mitrokhin
d0c9b7c9b5 wiped out all LLVM 3.1 support 2014-08-01 14:54:08 +04:00
Anton Mitrokhin
725be222ac added LLVM_3_6 var 2014-07-30 11:50:15 +04:00
evghenii
b3c5a9c4d6 added #ifdef ISPC_NVPTX_ENALED ... #endif guards 2014-07-09 12:32:18 +02:00
evghenii
69f3898a61 Merge branch 'master' into nvptx_merge 2014-07-07 16:30:12 +02:00
Dmitry Babokin
eb8e94627d Bumping version to 1.7.1dev 2014-04-18 20:44:00 +04:00
Dmitry Babokin
d63a94300c v1.7.0 2014-04-18 00:11:44 +04:00
Evghenii
4641a15287 Merge branch 'master' into nvptx 2014-03-19 10:53:07 +01:00
Dmitry Babokin
31b95b665b Copyright update 2014-03-12 20:19:16 +04:00
Evghenii
ac05de6835 merged with master 2014-02-21 08:25:28 +01:00
Evghenii
4196c723eb merged with nvptx 2014-02-20 11:01:58 +01:00
Evghenii
668645fcda first commit 2014-02-07 11:05:36 +01:00
Evghenii
d3a6693eef adding __have_native_{rsqrtd,rcpd} to select between native support for double precision reciprocals and using slower but safe version in stdlib 2014-02-04 16:29:23 +01:00
Evghenii
546f9cb409 MAJOR CHANGE--- STOP WITH THIS BRANCH-- 2014-01-06 13:51:02 +01:00
Evghenii
2d8da306a1 merged with master 2013-12-25 21:32:34 +01:00
Dmitry Babokin
799e476b48 Bumping ISPC version to 1.6.1dev 2013-12-19 22:29:02 +04:00
Dmitry Babokin
040605a83c Bumping up ispc version to 1.6.0 2013-12-19 21:17:42 +04:00
Evghenii
ddfe782151 merged 2013-12-13 11:56:43 +01:00
Dmitry Babokin
2d2d14744b Fixing --opt=force-aligned-memory for LLVM 3.3+ 2013-12-04 19:00:02 +04:00
evghenii
bb46b561fd Merged with upstream/master 2013-11-22 08:13:16 +01:00
Ilia Filippov
3fd9d5a025 support of LLVM 3.5 2013-11-21 19:09:43 +04:00
Dmitry Babokin
e100040f28 Fix bug with fail when --target=avx1.1-i32x8,avx2-i32x8 - avx11 is not a valid target anymore, need more complete string 2013-11-14 15:37:11 +04:00
Dmitry Babokin
ffc9a33933 avx1-i32x4 implementation as sse4-i32x4 with avx target-feature flag 2013-11-14 15:34:30 +04:00
Evghenii
8db3d25844 moved PtxString to Globals 2013-10-30 21:05:22 +01:00
Evghenii
b31fc6f66d now can generate both targets for npvtx64. m_isPTX is set true, to distuish when to either skip or exlcusive euse export 2013-10-29 14:17:11 +01:00
Evghenii
ac700d4860 checkpoint 2013-10-29 13:36:31 +01:00
egaburov
5d56d29240 merged with master 2013-10-08 19:13:30 +02:00
Dmitry Babokin
3b4cc90800 Changing ISPC to 1.5.dev 2013-09-28 01:32:00 +04:00
Dmitry Babokin
8a39af8f72 Release 1.5.0 2013-09-27 23:27:05 +04:00
james.brodman
8db378b265 Revert "Remove support for using SVML for math lib routines."
This reverts commit d9c38b5c1f.
2013-09-04 16:01:58 -04:00
Matt Pharr
0c5742b6f8 Implement new naming scheme for --target.
Now targets are named like "<isa>-i<mask size>x<gang size>", e.g.
"sse4-i8x16", or "avx2-i32x16".

The old target names are still supported.
2013-08-08 19:23:44 -07:00
Matt Pharr
cd9afe946c Merge branch 'master' into arm
Conflicts:
	Makefile
	builtins.cpp
	ispc.cpp
	ispc.h
	ispc.vcxproj
	opt.cpp
2013-08-06 17:39:21 -07:00
Matt Pharr
1276ea9844 Revert "Remove support for building with LLVM 3.1"
This reverts commit d3c567503b.

Conflicts:
	opt.cpp
2013-08-06 17:00:35 -07:00
Dmitry Babokin
dff7735af9 Fix for Windows build and making NEON target optional 2013-08-02 19:24:34 -07:00
Ilia Filippov
a174a90f86 Supporting dumping, switching off and debug printing of optimization phases 2013-08-01 11:37:52 +04:00
Matt Pharr
d9c38b5c1f Remove support for using SVML for math lib routines.
This path was poorly maintained and wasn't actually available on most
targets.
2013-07-31 06:56:48 -07:00
Matt Pharr
d3c567503b Remove support for building with LLVM 3.1 2013-07-31 06:46:45 -07:00
Matt Pharr
ab3b633733 Add 8-bit and 16-bit specialized NEON targets.
Like SSE4-8 and SSE4-16, these use 8-bit and 16-bit values for mask
elements, respectively, and thus should generate the best code when used
for computation with datatypes of those sizes.
2013-07-30 08:44:16 -07:00
egaburov
67b549a937 Added nvptx64 target. Things to do:
1. builtins/target-nvptx64.ll to write, now it is just a copy of target-generic-1.ll
2. add __global__ & __device__ scope
2. make code work for a single cuda thread
3. use tasks to work as a block grid and programIndex as laneIdx, programCount as warpSize
4. ... and more...
2013-07-28 14:31:43 +02:00
Matt Pharr
d7b0c5794e Add support for ARM NEON targets.
Initial support for ARM NEON on Cortex-A9 and A15 CPUs.  All but ~10 tests
pass, and all examples compile and run correctly.  Most of the examples
show a ~2x speedup on a single A15 core versus scalar code.

Current open issues/TODOs
- Code quality looks decent, but hasn't been carefully examined.  Known
  issues/opportunities for improvement include:
  - fp32 vector divide is done as a series of scalar divides rather than
    a vector divide (which I believe exists, but I may be mistaken.)
    This is particularly harmful to examples/rt, which only runs ~1.5x
    faster with ispc, likely due to long chains of scalar divides.
  - The compiler isn't generating a vmin.f32 for e.g. the final scalar
    min in reduce_min(); instead it's generating a compare and then a
    select instruction (and similarly elsewhere).
  - There are some additional FIXMEs in builtins/target-neon.ll that
    include both a few pieces of missing functionality (e.g. rounding
    doubles) as well as places that deserve attention for possible
    code quality improvements.

- Currently only the "cortex-a9" and "cortex-15" CPU targets are
  supported; LLVM supports many other ARM CPUs and ispc should provide
  access to all of the ones that have NEON support (and aren't too
  obscure.)

- ~5 of the reduce-* tests hit an assertion inside LLVM (unfortunately
   only when the compiler runs on an ARM host, though).

- The Windows build hasn't been tested (though I've tried to update
  ispc.vcxproj appropriately).  It may just work, but will more likely
  have various small issues.)

- Anything related to 64-bit ARM has seen no attention.
2013-07-19 23:07:24 -07:00
Dmitry Babokin
922895de69 Changing ISPC version to 1.4.5dev 2013-07-19 18:47:43 -07:00
Dmitry Babokin
28f0bce9f2 Release 1.4.4 2013-07-19 16:22:10 -07:00