Commit Graph

1703 Commits

Author SHA1 Message Date
evghenii
ef9e212eec Merge remote-tracking branch 'upstream/master' into nvptx 2013-11-26 13:24:43 +01:00
Dmitry Babokin
b4102a4510 Merge pull request #665 from ifilippov/master
fix of perf.py
2013-11-22 06:36:22 -08:00
Ilia Filippov
18f90e6339 fix of perf.py 2013-11-22 17:06:19 +04:00
Evghenii
828a5d45cd Merge remote-tracking branch 'upstream/master' into nvptx 2013-11-22 08:10:08 +01:00
Dmitry Babokin
0f7ac1cc90 Merge pull request #664 from ifilippov/3_5
Support for LLVM 3.5
2013-11-21 07:35:35 -08:00
Dmitry Babokin
019ff4709c Merge pull request #663 from ifilippov/perf
Adding multiple targets in perf.py
2013-11-21 07:11:35 -08:00
Ilia Filippov
3fd9d5a025 support of LLVM 3.5 2013-11-21 19:09:43 +04:00
Ilia Filippov
924858509d checking targets in perf.py 2013-11-21 19:05:35 +04:00
jbrodman
357f115f11 Merge pull request #661 from dbabokin/task_diagnostics
Fix task system diagnostic to report real reason of the semaphore allocat...
2013-11-20 09:16:51 -08:00
Dmitry Babokin
5531586c35 Fix for existing semaphore problem 2013-11-20 19:19:15 +04:00
Dmitry Babokin
40da411fa5 Fix task system diagnostic to report real reason of the semaphore allocation fail 2013-11-20 17:22:50 +04:00
Dmitry Babokin
676c367db1 Merge pull request #660 from dbabokin/fail_db
fail_db.txt update on Linux with new passes
2013-11-19 09:20:30 -08:00
Dmitry Babokin
5722d17924 fail_db.txt update on Linux with new passes 2013-11-19 21:17:54 +04:00
Ilia Filippov
97298eb112 multiple targets in perf.py 2013-11-19 17:37:52 +04:00
Dmitry Babokin
754a3208f2 Merge pull request #659 from ifilippov/master
patch for LLVM 3.3 and test correction at avx2
2013-11-18 02:46:15 -08:00
Ilia Filippov
4579d339ea patch for LLVM 3.3 and test correction at avx2 2013-11-18 13:53:21 +04:00
Dmitry Babokin
4977933d81 Merge pull request #658 from dbabokin/fail_db
fail_db.txt update on Linux
2013-11-17 15:42:51 -08:00
Dmitry Babokin
953e467a85 fail_db.txt update on Linux 2013-11-18 03:39:09 +04:00
jbrodman
131ab07c2b Merge pull request #657 from dbabokin/avx-i32x4
avx1-i32x4 target
2013-11-15 16:00:57 -08:00
Dmitry Babokin
131ff50333 Adding avx1-i32x4 to alloy.py testing 2013-11-15 22:09:13 +04:00
Dmitry Babokin
42e181112a Add avx1-i32x4 to the list of supported targets 2013-11-14 16:21:30 +04:00
Dmitry Babokin
801f78f8a8 Rebuild *.ispc when necessary 2013-11-14 15:37:11 +04:00
Dmitry Babokin
e100040f28 Fix bug with fail when --target=avx1.1-i32x8,avx2-i32x8 - avx11 is not a valid target anymore, need more complete string 2013-11-14 15:37:11 +04:00
Dmitry Babokin
b8a39a1b26 minor improvements in examples/common.mk 2013-11-14 15:37:10 +04:00
Dmitry Babokin
8f768633ad Make perf.py changes work as part of alloy.py 2013-11-14 15:37:10 +04:00
Dmitry Babokin
65ea6fd48a Reasoning to use sse4 bitcode file 2013-11-14 15:34:30 +04:00
Dmitry Babokin
d2c7b356cc Ordering functions in target-[avx|sse2].ll to be in the same order. No real changes, except adding a few alwaysinline in SSE4 target 2013-11-14 15:34:30 +04:00
Dmitry Babokin
af58955140 target-[sse4|avx]_common.ll are twin brothers, which differ only cosmetically. This commit makes them diffable. No real changes, except adding alwaysinline to sse version of __max_uniform_int32/__max_uniform_uint32 2013-11-14 15:34:30 +04:00
Dmitry Babokin
ffc9a33933 avx1-i32x4 implementation as sse4-i32x4 with avx target-feature flag 2013-11-14 15:34:30 +04:00
Dmitry Babokin
fbab9874f6 perf.py - target switch was added 2013-11-14 15:34:30 +04:00
Dmitry Babokin
017e7890f7 Examples makefiles to support setting single target via ISPC_IA_TARGETS 2013-11-14 15:34:30 +04:00
Evghenii
cddddfd255 +1 2013-11-13 16:23:24 +01:00
Evghenii
6cd8a8f895 cleaned-up 2013-11-13 13:47:53 +01:00
Evghenii
6a1fb8ea31 some kernel tuning 2013-11-11 14:24:13 +01:00
Evghenii
f2c66dc4c3 added any/none/all for bool 2013-11-11 12:59:40 +01:00
Evghenii
a91c8e15e2 added reduce_min/max_float, packed_store_active for CUDA, and now kerenls1.ispc just work :) 2013-11-11 12:33:39 +01:00
Evghenii
9c7a842163 ptx has support for half-float 2013-11-11 12:25:47 +01:00
Evghenii
3dd6173a65 added packed_store_active that can be called with active flag 2013-11-11 12:25:15 +01:00
Evghenii
e9bc2b7b54 added uniform_new/uniform_delete in util_ptx.m4 and __shfl intrinsics 2013-11-11 09:18:15 +01:00
Evghenii
38947ab71b made CU version working 2013-11-10 20:10:37 +01:00
Evghenii
8a7801264a added tuned code 2013-11-10 16:02:10 +01:00
Evghenii
66edc180be working on aobench 2013-11-10 14:29:53 +01:00
Evghenii
17809992d7 working on ao 2013-11-10 14:26:00 +01:00
evghenii
c10033211b removed 2013-11-10 14:17:59 +01:00
Evghenii
7d4ea1b6f0 added wc-timer 2013-11-10 14:15:16 +01:00
Evghenii
0dfe823c32 added kernels that use shared memory 2013-11-10 14:06:06 +01:00
Evghenii
bef275f62c added drv_api_error_String.h 2013-11-10 14:05:34 +01:00
evghenii
edb4c57e3d +added host code as well and restored original main.cpp 2013-11-10 14:07:15 +01:00
evghenii
c1b3face8f change time from sec to ms 2013-11-10 14:04:01 +01:00
Evghenii
9d23c10475 deffered_shading problem identified. need solution 2013-11-10 13:59:41 +01:00