Commit Graph

1938 Commits

Author SHA1 Message Date
Evghenii
294fb039fe some tuning, adding cuda kernels 2013-11-14 22:33:58 +01:00
Evghenii
f12826bac5 +added approx rcp/rsqrt/rtz with ftz=true 2013-11-14 22:17:57 +01:00
Evghenii
2c8afde6d9 changing MF 2013-11-14 21:38:25 +01:00
Evghenii
1445202e0e identified bug due to llvm-3.4 2013-11-14 21:18:25 +01:00
Evghenii
1b940fd41e +1 2013-11-14 20:19:59 +01:00
Evghenii
f1fc3bdfba added nvptx declaration to other target & fixed nvptx64 recognition 2013-11-14 20:12:58 +01:00
Evghenii
7aa37b19a9 added some more macros as quick hack... 2013-11-14 20:04:05 +01:00
Evghenii
967a49dd66 +1 2013-11-14 19:54:18 +01:00
Evghenii
25df23fed3 workaround for programIndex via preprocessor 2013-11-14 19:48:50 +01:00
Evghenii
e162d5a99d programIndex still not working, found where change is needed... 2013-11-14 19:46:08 +01:00
Evghenii
918ca339b6 now programIndex returns laneIdx = %tid.x & (%warpsize-1) & programCount returns 32 2013-11-14 19:27:52 +01:00
Evghenii
8bb8f0eda4 +1 2013-11-14 17:04:50 +01:00
Evghenii
be2cc8f946 restored foreach in sort 2013-11-14 16:51:59 +01:00
Evghenii
599ada8354 added deferred shading foreach_tile 2013-11-14 16:49:47 +01:00
Evghenii
83b9cc5c0a +1 2013-11-14 16:44:09 +01:00
Evghenii
af75afeb7a foreach[_tiled] seems to work now 2013-11-14 16:29:40 +01:00
Dmitry Babokin
42e181112a Add avx1-i32x4 to the list of supported targets 2013-11-14 16:21:30 +04:00
Dmitry Babokin
801f78f8a8 Rebuild *.ispc when necessary 2013-11-14 15:37:11 +04:00
Dmitry Babokin
e100040f28 Fix bug with fail when --target=avx1.1-i32x8,avx2-i32x8 - avx11 is not a valid target anymore, need more complete string 2013-11-14 15:37:11 +04:00
Dmitry Babokin
b8a39a1b26 minor improvements in examples/common.mk 2013-11-14 15:37:10 +04:00
Dmitry Babokin
8f768633ad Make perf.py changes work as part of alloy.py 2013-11-14 15:37:10 +04:00
Dmitry Babokin
65ea6fd48a Reasoning to use sse4 bitcode file 2013-11-14 15:34:30 +04:00
Dmitry Babokin
d2c7b356cc Ordering functions in target-[avx|sse2].ll to be in the same order. No real changes, except adding a few alwaysinline in SSE4 target 2013-11-14 15:34:30 +04:00
Dmitry Babokin
af58955140 target-[sse4|avx]_common.ll are twin brothers, which differ only cosmetically. This commit makes them diffable. No real changes, except adding alwaysinline to sse version of __max_uniform_int32/__max_uniform_uint32 2013-11-14 15:34:30 +04:00
Dmitry Babokin
ffc9a33933 avx1-i32x4 implementation as sse4-i32x4 with avx target-feature flag 2013-11-14 15:34:30 +04:00
Dmitry Babokin
fbab9874f6 perf.py - target switch was added 2013-11-14 15:34:30 +04:00
Dmitry Babokin
017e7890f7 Examples makefiles to support setting single target via ISPC_IA_TARGETS 2013-11-14 15:34:30 +04:00
Evghenii
48644813d4 stmt.cpp forking on foreach 2013-11-14 11:30:22 +01:00
evghenii
c81821ed28 +1 2013-11-13 21:17:21 +01:00
Evghenii
42cfe97427 using now cuda_ispc.h 2013-11-13 21:06:40 +01:00
Evghenii
09a2c12ea0 added cuda_ispc.h & cuda error_strings 2013-11-13 21:04:59 +01:00
Evghenii
a0f6f264f6 fixed problem with new/delete and added Mel/sec counter 2013-11-13 20:34:01 +01:00
Evghenii
6f9cea5b58 removed binary 2013-11-13 19:43:45 +01:00
Evghenii
dd4ac42491 added print m 2013-11-13 19:43:32 +01:00
Evghenii
01df6ed4a9 added ispc timers w/o task 2013-11-13 19:13:04 +01:00
Evghenii
e71259006c +1 2013-11-13 19:06:02 +01:00
Evghenii
0f161b500f +1 2013-11-13 19:02:45 +01:00
Evghenii
e442139c39 runs, next check correctness 2013-11-13 18:15:52 +01:00
Evghenii
8b0f871c06 +1 2013-11-13 17:23:23 +01:00
Evghenii
61fab0340c working on sort 2013-11-13 17:07:55 +01:00
Evghenii
525eacd035 +1 2013-11-13 16:32:56 +01:00
Evghenii
cddddfd255 +1 2013-11-13 16:23:24 +01:00
Evghenii
780e9f31fe some tuning 2013-11-13 16:23:05 +01:00
Evghenii
c0b54aa58c added Makefile_gpu 2013-11-13 16:20:51 +01:00
Evghenii
c0c1cc1ba7 +added Makefile and some fixes 2013-11-13 14:16:48 +01:00
Evghenii
dededd1929 cleaned 2013-11-13 13:56:45 +01:00
Evghenii
6cd8a8f895 cleaned-up 2013-11-13 13:47:53 +01:00
Evghenii
d3ade0654e added Makefile 2013-11-13 13:45:24 +01:00
Evghenii
2dd7128db5 added Makefile 2013-11-13 13:40:08 +01:00
Evghenii
1f13a236bf small tuning 2013-11-13 13:03:26 +01:00