aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Andrey Guskov	ae8b724d92	Added LLVM 3.7 support	2015-01-19 17:30:59 +03:00
Dmitry Babokin	9c0b4cb8b3	Bumping version to 1.8.2dev	2015-01-12 14:49:48 +03:00
Dmitry Babokin	a095fd622f	Bumping version to 1.8.1	2014-12-31 14:20:38 +03:00
Dmitry Babokin	aeaef1bedf	Fixing problem with autodispatch when compiled with NVPTX target	2014-12-31 12:34:12 +03:00
Dmitry Babokin	ffa4b9e65c	Bumping version to 1.8.1dev	2014-10-17 03:25:14 +04:00
Dmitry Babokin	156f4e0fd8	Bumping version to 1.8.0	2014-10-16 23:34:38 +04:00
evghenii	9238c72e08	Merge branch 'master' into nvptx_clean_master	2014-10-14 14:27:00 +02:00
Vsevolod Livinskiy	0a6eb61ad0	Extend gather-scatter optimization with prefetch optimization	2014-10-02 15:21:43 +04:00
evghenii	8745888ce9	merged with master	2014-08-11 10:04:54 +02:00
Anton Mitrokhin	d0c9b7c9b5	wiped out all LLVM 3.1 support	2014-08-01 14:54:08 +04:00
Anton Mitrokhin	725be222ac	added LLVM_3_6 var	2014-07-30 11:50:15 +04:00
evghenii	b3c5a9c4d6	added #ifdef ISPC_NVPTX_ENABLED ... #endif guards	2014-07-09 12:32:18 +02:00
evghenii	69f3898a61	Merge branch 'master' into nvptx_merge	2014-07-07 16:30:12 +02:00
Dmitry Babokin	eb8e94627d	Bumping version to 1.7.1dev	2014-04-18 20:44:00 +04:00
Dmitry Babokin	d63a94300c	v1.7.0	2014-04-18 00:11:44 +04:00
Evghenii	4641a15287	Merge branch 'master' into nvptx	2014-03-19 10:53:07 +01:00
Dmitry Babokin	31b95b665b	Copyright update	2014-03-12 20:19:16 +04:00
Evghenii	ac05de6835	merged with master	2014-02-21 08:25:28 +01:00
Evghenii	4196c723eb	merged with nvptx	2014-02-20 11:01:58 +01:00
Evghenii	668645fcda	first commit	2014-02-07 11:05:36 +01:00
Evghenii	d3a6693eef	adding __have_native_{rsqrtd,rcpd} to select between native support for double precision reciprocals and using slower but safe version in stdlib	2014-02-04 16:29:23 +01:00
Evghenii	546f9cb409	MAJOR CHANGE--- STOP WITH THIS BRANCH--	2014-01-06 13:51:02 +01:00
Evghenii	2d8da306a1	merged with master	2013-12-25 21:32:34 +01:00
Dmitry Babokin	799e476b48	Bumping ISPC version to 1.6.1dev	2013-12-19 22:29:02 +04:00
Dmitry Babokin	040605a83c	Bumping up ispc version to 1.6.0	2013-12-19 21:17:42 +04:00
Evghenii	ddfe782151	merged	2013-12-13 11:56:43 +01:00
Dmitry Babokin	2d2d14744b	Fixing --opt=force-aligned-memory for LLVM 3.3+	2013-12-04 19:00:02 +04:00
evghenii	bb46b561fd	Merged with upstream/master	2013-11-22 08:13:16 +01:00
Ilia Filippov	3fd9d5a025	support of LLVM 3.5	2013-11-21 19:09:43 +04:00
Dmitry Babokin	e100040f28	Fix bug with fail when --target=avx1.1-i32x8,avx2-i32x8 - avx11 is not a valid target anymore, need more complete string	2013-11-14 15:37:11 +04:00
Dmitry Babokin	ffc9a33933	avx1-i32x4 implementation as sse4-i32x4 with avx target-feature flag	2013-11-14 15:34:30 +04:00
Evghenii	8db3d25844	moved PtxString to Globals	2013-10-30 21:05:22 +01:00
Evghenii	b31fc6f66d	now can generate both targets for nvptx64. m_isPTX is set to true to distinguish when to either skip or exclusively use export	2013-10-29 14:17:11 +01:00
Evghenii	ac700d4860	checkpoint	2013-10-29 13:36:31 +01:00
egaburov	5d56d29240	merged with master	2013-10-08 19:13:30 +02:00
Dmitry Babokin	3b4cc90800	Changing ISPC to 1.5.dev	2013-09-28 01:32:00 +04:00
Dmitry Babokin	8a39af8f72	Release 1.5.0	2013-09-27 23:27:05 +04:00
james.brodman	8db378b265	Revert "Remove support for using SVML for math lib routines." This reverts commit `d9c38b5c1f`.	2013-09-04 16:01:58 -04:00
Matt Pharr	0c5742b6f8	Implement new naming scheme for --target. Now targets are named like "<isa>-i<mask size>x<gang size>", e.g. "sse4-i8x16", or "avx2-i32x16". The old target names are still supported.	2013-08-08 19:23:44 -07:00
Matt Pharr	cd9afe946c	Merge branch 'master' into arm Conflicts: Makefile builtins.cpp ispc.cpp ispc.h ispc.vcxproj opt.cpp	2013-08-06 17:39:21 -07:00
Matt Pharr	1276ea9844	Revert "Remove support for building with LLVM 3.1" This reverts commit `d3c567503b`. Conflicts: opt.cpp	2013-08-06 17:00:35 -07:00
Dmitry Babokin	dff7735af9	Fix for Windows build and making NEON target optional	2013-08-02 19:24:34 -07:00
Ilia Filippov	a174a90f86	Supporting dumping, switching off and debug printing of optimization phases	2013-08-01 11:37:52 +04:00
Matt Pharr	d9c38b5c1f	Remove support for using SVML for math lib routines. This path was poorly maintained and wasn't actually available on most targets.	2013-07-31 06:56:48 -07:00
Matt Pharr	d3c567503b	Remove support for building with LLVM 3.1	2013-07-31 06:46:45 -07:00
Matt Pharr	ab3b633733	Add 8-bit and 16-bit specialized NEON targets. Like SSE4-8 and SSE4-16, these use 8-bit and 16-bit values for mask elements, respectively, and thus should generate the best code when used for computation with datatypes of those sizes.	2013-07-30 08:44:16 -07:00
egaburov	67b549a937	Added nvptx64 target. Things to do: 1. write builtins/target-nvptx64.ll; for now it is just a copy of target-generic-1.ll 2. add __global__ & __device__ scope 3. make code work for a single cuda thread 4. use tasks to work as a block grid and programIndex as laneIdx, programCount as warpSize 5. ... and more...	2013-07-28 14:31:43 +02:00
Matt Pharr	d7b0c5794e	Add support for ARM NEON targets. Initial support for ARM NEON on Cortex-A9 and A15 CPUs. All but ~10 tests pass, and all examples compile and run correctly. Most of the examples show a ~2x speedup on a single A15 core versus scalar code. Current open issues/TODOs: - Code quality looks decent, but hasn't been carefully examined. Known issues/opportunities for improvement include: - fp32 vector divide is done as a series of scalar divides rather than a vector divide (which I believe exists, but I may be mistaken). This is particularly harmful to examples/rt, which only runs ~1.5x faster with ispc, likely due to long chains of scalar divides. - The compiler isn't generating a vmin.f32 for e.g. the final scalar min in reduce_min(); instead it's generating a compare and then a select instruction (and similarly elsewhere). - There are some additional FIXMEs in builtins/target-neon.ll that include both a few pieces of missing functionality (e.g. rounding doubles) as well as places that deserve attention for possible code quality improvements. - Currently only the "cortex-a9" and "cortex-a15" CPU targets are supported; LLVM supports many other ARM CPUs and ispc should provide access to all of the ones that have NEON support (and aren't too obscure). - ~5 of the reduce-* tests hit an assertion inside LLVM (unfortunately only when the compiler runs on an ARM host, though). - The Windows build hasn't been tested (though I've tried to update ispc.vcxproj appropriately). It may just work, but will more likely have various small issues. - Anything related to 64-bit ARM has seen no attention.	2013-07-19 23:07:24 -07:00
Dmitry Babokin	922895de69	Changing ISPC version to 1.4.5dev	2013-07-19 18:47:43 -07:00
Dmitry Babokin	28f0bce9f2	Release 1.4.4	2013-07-19 16:22:10 -07:00

130 Commits