Evghenii
84134678dc
ISPC can emit LLVM PTX now
2014-01-10 07:53:09 +01:00
Evghenii
91d4ae46f6
sort --fails
2014-01-06 15:38:30 +01:00
Evghenii
18a50aa679
further cleaning...
2014-01-06 14:34:28 +01:00
Evghenii
546f9cb409
MAJOR CHANGE--- STOP WITH THIS BRANCH--
2014-01-06 13:51:02 +01:00
Evghenii
cdb50beaa2
added nativeVector Alignment for nvptx64
2013-12-13 12:07:29 +01:00
Evghenii
ddfe782151
merged
2013-12-13 11:56:43 +01:00
Ilia Filippov
b19937c4dc
deleting isPrimitiveType()
2013-12-12 19:25:02 +04:00
Dmitry Babokin
2d2d14744b
Fixing --opt=force-aligned-memory for LLVM 3.3+
2013-12-04 19:00:02 +04:00
evghenii
bb46b561fd
Merged with upstream/master
2013-11-22 08:13:16 +01:00
Ilia Filippov
3fd9d5a025
support of LLVM 3.5
2013-11-21 19:09:43 +04:00
Dmitry Babokin
42e181112a
Add avx1-i32x4 to the list of supported targets
2013-11-14 16:21:30 +04:00
Dmitry Babokin
e100040f28
Fix failure with --target=avx1.1-i32x8,avx2-i32x8: avx11 is no longer a valid target, a more complete string is needed
2013-11-14 15:37:11 +04:00
Dmitry Babokin
ffc9a33933
avx1-i32x4 implementation as sse4-i32x4 with avx target-feature flag
2013-11-14 15:34:30 +04:00
Evghenii
9c7a842163
ptx has support for half-float
2013-11-11 12:25:47 +01:00
Evghenii
b3c68af40a
added volume rendering to run on GPU
2013-11-08 13:57:16 +01:00
Evghenii
47cc470bf6
change nativeVectorWidth from 1 -> 32 for nvptx64
2013-10-29 16:07:12 +01:00
egaburov
60881499dc
Merge branch 'nvptx' of github.com:egaburov/ispc into nvptx
2013-10-29 15:25:14 +01:00
egaburov
f19cf9274e
Merge remote-tracking branch 'upstream/master' into nvptx
2013-10-29 15:24:40 +01:00
Evghenii
b31fc6f66d
can now generate both targets for nvptx64. m_isPTX is set to true to distinguish when to either skip or exclusively use export
2013-10-29 14:17:11 +01:00
Evghenii
ac700d4860
checkpoint
2013-10-29 13:36:31 +01:00
Evghenii
b2baa35c3d
added correct datalayout for nvptx64
2013-10-29 11:34:01 +01:00
Dmitry Babokin
362ee06b9f
Typo fix
2013-10-29 01:35:26 +04:00
Dmitry Babokin
a166eb7ea1
Check AVX OS support in host cpu check code
2013-10-28 22:41:23 +04:00
Evghenii
4f486333ed
nvptx now allows extern "C" task void, which emits a kernel that should (?) be callable by the driver API from external code
2013-10-28 16:47:40 +01:00
Evghenii
ac095dbf3e
working on nvptx
2013-10-26 16:12:33 +02:00
egaburov
7e9b4c0924
added avx2-i64x4 and avx1.1-i64x4 targets
2013-10-15 10:02:10 +02:00
egaburov
8808a8cc9c
Merge remote-tracking branch 'upstream/master' into nvptx
2013-10-13 13:03:00 +02:00
Matt Pharr
e751977b72
Fix small typo for NEON targets in Target::SupportedTargets
2013-10-12 06:15:57 -07:00
egaburov
5d56d29240
merged with master
2013-10-08 19:13:30 +02:00
Dmitry Babokin
758efebb3c
Add missing testing support for avx1-i64x4 target
2013-09-30 17:54:59 +04:00
Preston Gurd
4b26b8b430
Remove redundant "slm".
2013-09-20 16:44:01 -04:00
Preston Gurd
9e0e9dbecc
- Add Silvermont (--cpu=slm) option for llvm 3.4+.
- Change default Sandybridge isa name to avx1-i32x8 from avx-i32x8,
to conform with replacement of avx-i32x8 by avx1-i32x8 everywhere else.
- Add "target-cpu" attribute, when using AttrBuilder, to correct a problem
whereby llvm would switch from the command line cpu setting
to the native (auto-detected) cpu setting on second and subsequent
functions. e.g. if I wanted to build for Silvermont on a Sandy Bridge
machine, ispc/llvm would correctly use Silvermont and turn on the
Silvermont scheduler. For the second and subsequent functions,
it would auto-detect Sandy Bridge, but still run the Silvermont
scheduler.
2013-09-20 14:42:46 -04:00
Dmitry Babokin
ce99b17616
Fix for Windows builds to include new target: avx-i64x4
2013-09-14 02:00:23 +04:00
Evghenii
9861375f0c
renamed avx-i64x4 -> avx1-i64x4
2013-09-13 15:07:14 +02:00
Evghenii
40af8d6ed5
fixed segfault in tests/launch-*.ispc. nativeVectorWidth in avx-i64x4 was set to 4. Fixed
2013-09-12 20:25:44 +02:00
egaburov
7364e06387
added mask64
2013-09-12 12:02:42 +02:00
egaburov
9c79d4d182
added avxh with vectorWidth=4 support, use --target=avxh to enable it
2013-09-11 12:58:02 +02:00
james.brodman
28080b0c22
Fix build against 3.4
2013-08-27 16:56:00 -04:00
james.brodman
be3a40e70b
Fix for 3.4
2013-08-27 15:15:16 -04:00
Matt Pharr
0c5742b6f8
Implement new naming scheme for --target.
Now targets are named like "<isa>-i<mask size>x<gang size>", e.g.
"sse4-i8x16", or "avx2-i32x16".
The old target names are still supported.
2013-08-08 19:23:44 -07:00
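The "<isa>-i<mask size>x<gang size>" scheme from the commit above can be sketched as a small parser. This is only an illustration and not ISPC's own code: `TargetName`, `parse_target`, and the regular expression are hypothetical names introduced here, and the sketch assumes the ISA part consists of lowercase letters, digits, and dots. Old-style names are rejected rather than mapped to new ones.

```python
import re
from typing import NamedTuple, Optional


class TargetName(NamedTuple):
    """Hypothetical container for the three parts of a new-style target name."""
    isa: str        # e.g. "sse4", "avx2", "avx1.1"
    mask_bits: int  # width of one mask element in bits
    gang_size: int  # number of program instances in a gang


# Assumed shape: <isa>-i<mask size>x<gang size>, e.g. "sse4-i8x16".
_TARGET_RE = re.compile(r"(?P<isa>[a-z0-9.]+)-i(?P<mask>\d+)x(?P<gang>\d+)")


def parse_target(name: str) -> Optional[TargetName]:
    """Split a new-style target name; return None if it does not fit the scheme."""
    m = _TARGET_RE.fullmatch(name)
    if m is None:
        return None
    return TargetName(m.group("isa"), int(m.group("mask")), int(m.group("gang")))


if __name__ == "__main__":
    for example in ("sse4-i8x16", "avx2-i32x16", "avxh"):
        print(example, "->", parse_target(example))
```

A name such as "avxh" does not follow the scheme, so the sketch returns None for it; a real tool would need a separate table for the old names that the commit says remain supported.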
Matt Pharr
cd9afe946c
Merge branch 'master' into arm
Conflicts:
Makefile
builtins.cpp
ispc.cpp
ispc.h
ispc.vcxproj
opt.cpp
2013-08-06 17:39:21 -07:00
Matt Pharr
1276ea9844
Revert "Remove support for building with LLVM 3.1"
This reverts commit d3c567503b.
Conflicts:
opt.cpp
2013-08-06 17:00:35 -07:00
Dmitry Babokin
dff7735af9
Fix for Windows build and making NEON target optional
2013-08-02 19:24:34 -07:00
Ilia Filippov
a174a90f86
Supporting dumping, switching off and debug printing of optimization phases
2013-08-01 11:37:52 +04:00
Matt Pharr
4f48d3258a
Documentation updates for NEON
2013-07-31 20:06:04 -07:00
Matt Pharr
d3c567503b
Remove support for building with LLVM 3.1
2013-07-31 06:46:45 -07:00
Matt Pharr
ab3b633733
Add 8-bit and 16-bit specialized NEON targets.
Like SSE4-8 and SSE4-16, these use 8-bit and 16-bit values for mask
elements, respectively, and thus should generate the best code when used
for computation with datatypes of those sizes.
2013-07-30 08:44:16 -07:00
egaburov
67b549a937
Added nvptx64 target. Things to do:
1. builtins/target-nvptx64.ll to write, now it is just a copy of target-generic-1.ll
2. add __global__ & __device__ scope
3. make code work for a single cuda thread
4. use tasks to work as a block grid and programIndex as laneIdx, programCount as warpSize
5. ... and more...
2013-07-28 14:31:43 +02:00
Matt Pharr
780b0dfe47
Add SSE4-16 target.
Along the lines of sse4-8, this is an 8-wide target for SSE4, using
16-bit elements for the mask. It's thus (in principle) the best
target for SIMD computation with 16-bit datatypes.
2013-07-25 09:46:01 -07:00
Matt Pharr
53414f12e6
Add SSE4 target optimized for computation with 8-bit datatypes.
This change adds a new 'sse4-8' target, where programCount is 16 and
the mask element size is 8-bits. (i.e. the most appropriate sizing of
the mask for SIMD computation with 8-bit datatypes.)
2013-07-23 17:30:32 -07:00