evghenii
|
ef9e212eec
|
Merge remote-tracking branch 'upstream/master' into nvptx
|
2013-11-26 13:24:43 +01:00 |
|
Dmitry Babokin
|
b4102a4510
|
Merge pull request #665 from ifilippov/master
fix of perf.py
|
2013-11-22 06:36:22 -08:00 |
|
Ilia Filippov
|
18f90e6339
|
fix of perf.py
|
2013-11-22 17:06:19 +04:00 |
|
Evghenii
|
828a5d45cd
|
Merge remote-tracking branch 'upstream/master' into nvptx
|
2013-11-22 08:10:08 +01:00 |
|
Dmitry Babokin
|
0f7ac1cc90
|
Merge pull request #664 from ifilippov/3_5
Support for LLVM 3.5
|
2013-11-21 07:35:35 -08:00 |
|
Dmitry Babokin
|
019ff4709c
|
Merge pull request #663 from ifilippov/perf
Adding multiple targets in perf.py
|
2013-11-21 07:11:35 -08:00 |
|
Ilia Filippov
|
3fd9d5a025
|
support of LLVM 3.5
|
2013-11-21 19:09:43 +04:00 |
|
Ilia Filippov
|
924858509d
|
checking targets in perf.py
|
2013-11-21 19:05:35 +04:00 |
|
jbrodman
|
357f115f11
|
Merge pull request #661 from dbabokin/task_diagnostics
Fix task system diagnostic to report real reason of the semaphore allocat...
|
2013-11-20 09:16:51 -08:00 |
|
Dmitry Babokin
|
5531586c35
|
Fix for existing semaphore problem
|
2013-11-20 19:19:15 +04:00 |
|
Dmitry Babokin
|
40da411fa5
|
Fix task system diagnostic to report the real reason for the semaphore allocation failure
|
2013-11-20 17:22:50 +04:00 |
|
Dmitry Babokin
|
676c367db1
|
Merge pull request #660 from dbabokin/fail_db
fail_db.txt update on Linux with new passes
|
2013-11-19 09:20:30 -08:00 |
|
Dmitry Babokin
|
5722d17924
|
fail_db.txt update on Linux with new passes
|
2013-11-19 21:17:54 +04:00 |
|
Ilia Filippov
|
97298eb112
|
multiple targets in perf.py
|
2013-11-19 17:37:52 +04:00 |
|
Dmitry Babokin
|
754a3208f2
|
Merge pull request #659 from ifilippov/master
patch for LLVM 3.3 and test correction at avx2
|
2013-11-18 02:46:15 -08:00 |
|
Ilia Filippov
|
4579d339ea
|
patch for LLVM 3.3 and test correction at avx2
|
2013-11-18 13:53:21 +04:00 |
|
Dmitry Babokin
|
4977933d81
|
Merge pull request #658 from dbabokin/fail_db
fail_db.txt update on Linux
|
2013-11-17 15:42:51 -08:00 |
|
Dmitry Babokin
|
953e467a85
|
fail_db.txt update on Linux
|
2013-11-18 03:39:09 +04:00 |
|
jbrodman
|
131ab07c2b
|
Merge pull request #657 from dbabokin/avx-i32x4
avx1-i32x4 target
|
2013-11-15 16:00:57 -08:00 |
|
Dmitry Babokin
|
131ff50333
|
Adding avx1-i32x4 to alloy.py testing
|
2013-11-15 22:09:13 +04:00 |
|
Dmitry Babokin
|
42e181112a
|
Add avx1-i32x4 to the list of supported targets
|
2013-11-14 16:21:30 +04:00 |
|
Dmitry Babokin
|
801f78f8a8
|
Rebuild *.ispc when necessary
|
2013-11-14 15:37:11 +04:00 |
|
Dmitry Babokin
|
e100040f28
|
Fix failure with --target=avx1.1-i32x8,avx2-i32x8: avx11 is not a valid target anymore, a more complete string is needed
|
2013-11-14 15:37:11 +04:00 |
|
Dmitry Babokin
|
b8a39a1b26
|
minor improvements in examples/common.mk
|
2013-11-14 15:37:10 +04:00 |
|
Dmitry Babokin
|
8f768633ad
|
Make perf.py changes work as part of alloy.py
|
2013-11-14 15:37:10 +04:00 |
|
Dmitry Babokin
|
65ea6fd48a
|
Reasoning to use sse4 bitcode file
|
2013-11-14 15:34:30 +04:00 |
|
Dmitry Babokin
|
d2c7b356cc
|
Ordering functions in target-[avx|sse2].ll to be in the same order. No real changes, except adding a few alwaysinline in SSE4 target
|
2013-11-14 15:34:30 +04:00 |
|
Dmitry Babokin
|
af58955140
|
target-[sse4|avx]_common.ll are twin brothers, which differ only cosmetically. This commit makes them diffable. No real changes, except adding alwaysinline to the sse version of __max_uniform_int32/__max_uniform_uint32
|
2013-11-14 15:34:30 +04:00 |
|
Dmitry Babokin
|
ffc9a33933
|
avx1-i32x4 implementation as sse4-i32x4 with avx target-feature flag
|
2013-11-14 15:34:30 +04:00 |
|
Dmitry Babokin
|
fbab9874f6
|
perf.py - target switch was added
|
2013-11-14 15:34:30 +04:00 |
|
Dmitry Babokin
|
017e7890f7
|
Examples makefiles to support setting single target via ISPC_IA_TARGETS
|
2013-11-14 15:34:30 +04:00 |
|
Evghenii
|
cddddfd255
|
+1
|
2013-11-13 16:23:24 +01:00 |
|
Evghenii
|
6cd8a8f895
|
cleaned-up
|
2013-11-13 13:47:53 +01:00 |
|
Evghenii
|
6a1fb8ea31
|
some kernel tuning
|
2013-11-11 14:24:13 +01:00 |
|
Evghenii
|
f2c66dc4c3
|
added any/none/all for bool
|
2013-11-11 12:59:40 +01:00 |
|
Evghenii
|
a91c8e15e2
|
added reduce_min/max_float, packed_store_active for CUDA, and now kerenls1.ispc just work :)
|
2013-11-11 12:33:39 +01:00 |
|
Evghenii
|
9c7a842163
|
ptx has support for half-float
|
2013-11-11 12:25:47 +01:00 |
|
Evghenii
|
3dd6173a65
|
added packed_store_active that can be called with active flag
|
2013-11-11 12:25:15 +01:00 |
|
Evghenii
|
e9bc2b7b54
|
added uniform_new/uniform_delete in util_ptx.m4 and __shfl intrinsics
|
2013-11-11 09:18:15 +01:00 |
|
Evghenii
|
38947ab71b
|
made CU version working
|
2013-11-10 20:10:37 +01:00 |
|
Evghenii
|
8a7801264a
|
added tuned code
|
2013-11-10 16:02:10 +01:00 |
|
Evghenii
|
66edc180be
|
working on aobench
|
2013-11-10 14:29:53 +01:00 |
|
Evghenii
|
17809992d7
|
working on ao
|
2013-11-10 14:26:00 +01:00 |
|
evghenii
|
c10033211b
|
removed
|
2013-11-10 14:17:59 +01:00 |
|
Evghenii
|
7d4ea1b6f0
|
added wc-timer
|
2013-11-10 14:15:16 +01:00 |
|
Evghenii
|
0dfe823c32
|
added kernels that use shared memory
|
2013-11-10 14:06:06 +01:00 |
|
Evghenii
|
bef275f62c
|
added drv_api_error_String.h
|
2013-11-10 14:05:34 +01:00 |
|
evghenii
|
edb4c57e3d
|
added host code as well and restored original main.cpp
|
2013-11-10 14:07:15 +01:00 |
|
evghenii
|
c1b3face8f
|
change time from sec to ms
|
2013-11-10 14:04:01 +01:00 |
|
Evghenii
|
9d23c10475
|
deferred_shading problem identified. need solution
|
2013-11-10 13:59:41 +01:00 |
|