Evghenii
7b2ceba128
added "internal" for helper functions to avoid them being exported to PTX
2014-02-05 17:02:05 +01:00
Evghenii
686c1d676d
improvements
2014-02-05 12:04:36 +01:00
Evghenii
d3a6693eef
adding __have_native_{rsqrtd,rcpd} to select between native support for double precision reciprocals and using slower but safe version in stdlib
2014-02-04 16:29:23 +01:00
evghenii
3a72e05c3e
+1
2014-02-02 18:16:48 +01:00
Evghenii
a3b00fdcd6
added support for global atomics
2014-01-26 14:23:26 +01:00
Evghenii
a7d4a3f922
fix for __any
2014-01-26 13:15:13 +01:00
Evghenii
ddb9b2fc47
added basic printing from ptx
2014-01-24 13:44:38 +01:00
Evghenii
be6ac0408a
added compile-time constant __is_nvptx_target that can be used with stdlib.ispc
2014-01-24 09:02:12 +01:00
Evghenii
1cf1dab649
fixed foreach_unique and local_atomics
2014-01-23 21:57:20 +01:00
Evghenii
98fc43d859
Merge branch 'master' into nvptx
2014-01-21 20:05:27 +01:00
Ilia Filippov
aa31957d84
supporting LLVM trunk
2014-01-21 14:21:26 +04:00
Vsevolod Livinskij
da02236b3a
Scalar implementation of no-vec functions was moved from builtins to stdlib.ispc.
2014-01-20 16:06:34 +04:00
Evghenii
84134678dc
ISPC can emit LLVM PTX now
2014-01-10 07:53:09 +01:00
evghenii
9053eed4b4
added basic optimization pass that promotes uniform into varying variables (not array) for nvptx target
2014-01-10 06:32:57 +01:00
Evghenii
9b74e60185
added conversion from addrspace(3)/__local/__shared__ to addrspace(0)/generic when PtrToInt is called
2014-01-07 14:29:55 +01:00
Evghenii
18a50aa679
further cleaning...
2014-01-06 14:34:28 +01:00
Evghenii
546f9cb409
MAJOR CHANGE--- STOP WITH THIS BRANCH--
2014-01-06 13:51:02 +01:00
Evghenii
d77789d8fe
+merged with master
2013-12-18 11:37:01 +01:00
Ilia Filippov
473f1cb4d2
packed_store_active2
2013-12-17 21:14:29 +04:00
Vsevolod Livinskij
35a4d1b3a2
Add some AVX2 intrinsics
2013-11-27 00:55:57 +04:00
Vsevolod Livinskij
19f73b2ede
uniform signed/unsigned int8/16
2013-11-25 19:16:02 +04:00
evghenii
bb46b561fd
Merged with upstream/master
2013-11-22 08:13:16 +01:00
Evghenii
918ca339b6
programIndex now returns laneIdx = %tid.x & (%warpsize-1), and programCount returns 32
2013-11-14 19:27:52 +01:00
Dmitry Babokin
65ea6fd48a
Reasoning to use sse4 bitcode file
2013-11-14 15:34:30 +04:00
Dmitry Babokin
ffc9a33933
avx1-i32x4 implementation as sse4-i32x4 with avx target-feature flag
2013-11-14 15:34:30 +04:00
Evghenii
4cd7e10ad3
reverted to original changes. The plan is to use CDP and generate only device code with a host wrapper.
2013-11-12 12:51:56 +01:00
egaburov
60881499dc
Merge branch 'nvptx' of github.com:egaburov/ispc into nvptx
2013-10-29 15:25:14 +01:00
egaburov
f19cf9274e
Merge remote-tracking branch 'upstream/master' into nvptx
2013-10-29 15:24:40 +01:00
Evghenii
ed9bca0e12
add __soa_to_aos*_float1 and __aos_to_soa*_float1 builtins
2013-10-29 15:06:08 +01:00
Dmitry Babokin
6585a925be
Merge pull request #641 from jbrodman/stdlibshift
...
Add a "shift" operator to the stdlib.
2013-10-28 14:18:31 -07:00
Evghenii
8391d05697
added blockIndex computations
2013-10-28 10:18:30 +01:00
Evghenii
ac095dbf3e
working on nvptx
2013-10-26 16:12:33 +02:00
james.brodman
899f85ce9c
Initial Support for new stdlib shift operator
2013-10-22 18:06:54 -04:00
egaburov
7e9b4c0924
added avx2-i64x4 and avx1.1-i64x4 targets
2013-10-15 10:02:10 +02:00
egaburov
5d56d29240
merged with master
2013-10-08 19:13:30 +02:00
Evghenii
9861375f0c
renamed avx-i64x4 -> avx1-i64x4
2013-09-13 15:07:14 +02:00
egaburov
7364e06387
added mask64
2013-09-12 12:02:42 +02:00
egaburov
9cf8e8cbf3
builtins fix for double precision svml and __stdlib_asin
2013-09-11 15:23:45 +02:00
egaburov
320c41ffcf
added svml support (experimental). For some reason all symbols are visible.
2013-09-11 15:16:50 +02:00
egaburov
9c79d4d182
added avxh with vectorWidth=4 support, use --target=avxh to enable it
2013-09-11 12:58:02 +02:00
james.brodman
8db378b265
Revert "Remove support for using SVML for math lib routines."
...
This reverts commit d9c38b5c1f .
2013-09-04 16:01:58 -04:00
Matt Pharr
cd9afe946c
Merge branch 'master' into arm
...
Conflicts:
Makefile
builtins.cpp
ispc.cpp
ispc.h
ispc.vcxproj
opt.cpp
2013-08-06 17:39:21 -07:00
Matt Pharr
1276ea9844
Revert "Remove support for building with LLVM 3.1"
...
This reverts commit d3c567503b .
Conflicts:
opt.cpp
2013-08-06 17:00:35 -07:00
Dmitry Babokin
dff7735af9
Fix for Windows build and making NEON target optional
2013-08-02 19:24:34 -07:00
Matt Pharr
d9c38b5c1f
Remove support for using SVML for math lib routines.
...
This path was poorly maintained and wasn't actually available on most
targets.
2013-07-31 06:56:48 -07:00
Matt Pharr
d3c567503b
Remove support for building with LLVM 3.1
2013-07-31 06:46:45 -07:00
Matt Pharr
48ff03112f
Remove __pause from stdlib_core() in utils.m4.
...
It wasn't ever being used, and was breaking compilation on ARM.
2013-07-30 08:44:22 -07:00
Matt Pharr
ab3b633733
Add 8-bit and 16-bit specialized NEON targets.
...
Like SSE4-8 and SSE4-16, these use 8-bit and 16-bit values for mask
elements, respectively, and thus should generate the best code when used
for computation with datatypes of those sizes.
2013-07-30 08:44:16 -07:00
egaburov
67b549a937
Added nvptx64 target. Things to do:
...
1. builtins/target-nvptx64.ll to write, now it is just a copy of target-generic-1.ll
2. add __global__ & __device__ scope
3. make code work for a single cuda thread
4. use tasks to work as a block grid and programIndex as laneIdx, programCount as warpSize
5. ... and more...
2013-07-28 14:31:43 +02:00
Matt Pharr
b6df447b55
Add reduce_add() for int8 and int16 types.
...
This maps to specialized instructions (e.g. PSADBW) when available.
2013-07-25 09:46:01 -07:00