233 Commits

Author SHA1 Message Date
Andrey Guskov
2f2af816e7 3.7-related copyright update 2015-01-20 14:56:58 +03:00
Andrey Guskov
ae8b724d92 Added LLVM 3.7 support 2015-01-19 17:30:59 +03:00
evghenii
4e7ae5269b added pseudo_prefetch definitions 2014-10-14 14:48:02 +02:00
evghenii
9238c72e08 Merge branch 'master' into nvptx_clean_master 2014-10-14 14:27:00 +02:00
Vsevolod Livinskiy
eb61d5df72 Support for cache 2/3 and all targets 2014-10-02 16:25:23 +04:00
Vsevolod Livinskiy
0a6eb61ad0 Extend gather-scatter optimization with prefetch optimization 2014-10-02 15:21:43 +04:00
evghenii
8745888ce9 merged with master 2014-08-11 10:04:54 +02:00
Anton Mitrokhin
d0c9b7c9b5 wiped out all LLVM 3.1 support 2014-08-01 14:54:08 +04:00
Anton Mitrokhin
368d2f18f9 rewritten comment for util.m4 2014-07-30 16:43:41 +04:00
Anton Mitrokhin
7171701599 checked Makefile 'if' constructions, fixed ReleaseNotes.txt, added comments to util.m4 2014-07-30 16:25:39 +04:00
Anton Mitrokhin
725be222ac added LLVM_3_6 var 2014-07-30 11:50:15 +04:00
evghenii
2dbb4d9890 remove dependenace on llvm-dis from 3.2 2014-07-08 15:11:13 +02:00
evghenii
fe150c539f fix for exclusive_scan_and 2014-07-08 13:33:04 +02:00
evghenii
69f3898a61 Merge branch 'master' into nvptx_merge 2014-07-07 16:30:12 +02:00
Ilia Filippov
76ea59b40b support LLVM build 2014-06-18 17:53:42 +04:00
Evghenii
4641a15287 Merge branch 'master' into nvptx 2014-03-19 10:53:07 +01:00
Dmitry Babokin
31b95b665b Copyright update 2014-03-12 20:19:16 +04:00
Ilia Filippov
ead5cc741d support LLVM trunk after 203559 203213 and 203381 revisions 2014-03-12 12:58:50 +04:00
Ilia Filippov
6738af0a0c changing uniform_min and uniform_max implementations for avx targets 2014-03-06 12:05:24 +04:00
Evghenii
b60d77c154 use __nv_* libcalls for rcp/sqrt/rsqrt 2014-02-21 10:36:46 +01:00
Evghenii
ac05de6835 merged with master 2014-02-21 08:25:28 +01:00
Dmitry Babokin
f280b32fa4 Merge pull request #736 from egaburov/native_trigonometry: Native trigonometry 2014-02-20 19:18:35 +03:00
Evghenii
690a8acb30 merged with master 2014-02-20 15:22:09 +01:00
Evghenii
24e1a98275 compiles 2014-02-20 11:20:13 +01:00
Evghenii
4196c723eb merged with nvptx 2014-02-20 11:01:58 +01:00
Dmitry Babokin
ea0a514e03 Fix for generic-1 2014-02-11 15:33:23 +04:00
Vsevolod Livinskij
65d947e449 Else branch with error report was added 2014-02-10 15:18:48 +04:00
Vsevolod Livinskij
cef5b2eb04 Some changes in saturation arithmetic 2014-02-10 12:40:53 +04:00
Vsevolod Livinskij
1c1614d207 Some errors in comments and code were fixed 2014-02-09 21:39:42 +04:00
Evghenii
70a9b286e5 added support for native and double precision trigonometry/transendentals 2014-02-07 15:28:39 +01:00
Evghenii
14e76108cb optimization for _all 2014-02-06 14:24:50 +01:00
Evghenii
c23dd8a951 fixed __puts_nvptx 2014-02-05 17:48:04 +01:00
Evghenii
7b2ceba128 added "internal" for helper functions to avoid them being exported to PTX 2014-02-05 17:02:05 +01:00
evghenii
09e8381ec7 change {rsqrt,rcp}_double to {rsqrt,rcp}d_decl 2014-02-05 13:05:04 +01:00
Evghenii
686c1d676d improvements 2014-02-05 12:04:36 +01:00
Evghenii
048da693c5 fix sqrt 2014-02-05 10:52:08 +01:00
Evghenii
d3a6693eef adding __have_native_{rsqrtd,rcpd} to select between native support for double precision reciprocals and using slower but safe version in stdlib 2014-02-04 16:29:23 +01:00
Evghenii
fe98fe8cdc added fast approximate rcp(double) accurate to 15 digits 2014-02-04 15:23:34 +01:00
Evghenii
eb1a495a7a added support for fast approximate rsqrt(double). Provide 16 digit accurancy but is over 3x faster than 1/sqrt(double) 2014-02-04 14:44:54 +01:00
Evghenii
c2ed214a74 added declaretion for movmsk_ptx 2014-02-03 08:57:27 +01:00
Evghenii
98c82242c5 allowed static and disable memcpy/memmove/memset operations 2014-02-03 08:02:50 +01:00
evghenii
3a72e05c3e +1 2014-02-02 18:16:48 +01:00
Evghenii
673d814a45 first commit for __do_print in ptx. 2014-01-27 11:56:21 +01:00
Evghenii
a3b00fdcd6 added support for global atomics 2014-01-26 14:23:26 +01:00
Evghenii
a7d4a3f922 fix for __any 2014-01-26 13:15:13 +01:00
Evghenii
3e86dfe480 fix for __any 2014-01-25 17:09:11 +01:00
Evghenii
fcbdd93043 half/scan for 64 bit/clock/num_cores and other additions 2014-01-25 16:43:33 +01:00
Evghenii
805196a6a0 fixed doubles 2014-01-25 15:31:56 +01:00
Evghenii
bd34729217 added floor/ceil/round for float/double 2014-01-25 12:20:38 +01:00
Evghenii
6917c161c8 fixed reduce_equal 2014-01-25 11:39:37 +01:00