aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Evghenii	7b2ceba128	added "internal" for helper functions to avoid them being exported to PTX	2014-02-05 17:02:05 +01:00
Evghenii	686c1d676d	improvements	2014-02-05 12:04:36 +01:00
Evghenii	d3a6693eef	adding __have_native_{rsqrtd,rcpd} to select between native support for double precision reciprocals and using slower but safe version in stdlib	2014-02-04 16:29:23 +01:00
evghenii	3a72e05c3e	+1	2014-02-02 18:16:48 +01:00
Evghenii	a3b00fdcd6	added support for global atomics	2014-01-26 14:23:26 +01:00
Evghenii	a7d4a3f922	fix for __any	2014-01-26 13:15:13 +01:00
Evghenii	ddb9b2fc47	added basic printing from ptx	2014-01-24 13:44:38 +01:00
Evghenii	be6ac0408a	added compile-time constant __is_nvptx_target that can be used with stdlib.ispc	2014-01-24 09:02:12 +01:00
Evghenii	1cf1dab649	fixed foreach_unique and local_atomics	2014-01-23 21:57:20 +01:00
Evghenii	98fc43d859	Merge branch 'master' into nvptx	2014-01-21 20:05:27 +01:00
Ilia Filippov	aa31957d84	supporting LLVM trunk	2014-01-21 14:21:26 +04:00
Vsevolod Livinskij	da02236b3a	Scalar realization of no-vec functions was replaced from builtins to stdlib.ispc.	2014-01-20 16:06:34 +04:00
Evghenii	84134678dc	ISPC can emit LLVM PTX now	2014-01-10 07:53:09 +01:00
evghenii	9053eed4b4	added basic optimization pass that promotes uniform into varying variables (not array) for nvptx target	2014-01-10 06:32:57 +01:00
Evghenii	9b74e60185	added conversion from addrspace(3)/__local/__shared__ to addrspace(0)/generic when PtrToInt is called	2014-01-07 14:29:55 +01:00
Evghenii	18a50aa679	further cleaning...	2014-01-06 14:34:28 +01:00
Evghenii	546f9cb409	MAJOR CHANGE--- STOP WITH THIS BRANCH--	2014-01-06 13:51:02 +01:00
Evghenii	d77789d8fe	+merged with master	2013-12-18 11:37:01 +01:00
Ilia Filippov	473f1cb4d2	packed_store_active2	2013-12-17 21:14:29 +04:00
Vsevolod Livinskij	35a4d1b3a2	Add some AVX2 intrinsics	2013-11-27 00:55:57 +04:00
Vsevolod Livinskij	19f73b2ede	uniform signed/unsigned int8/16	2013-11-25 19:16:02 +04:00
evghenii	bb46b561fd	Merged with upstream/master	2013-11-22 08:13:16 +01:00
Evghenii	918ca339b6	now programIndex returns laneIdx = %tid.x & (%warpsize-1) & programCount returns 32	2013-11-14 19:27:52 +01:00
Dmitry Babokin	65ea6fd48a	Reasoning to use sse4 bitcode file	2013-11-14 15:34:30 +04:00
Dmitry Babokin	ffc9a33933	avx1-i32x4 implementation as sse4-i32x4 with avx target-feature flag	2013-11-14 15:34:30 +04:00
Evghenii	4cd7e10ad3	reverted to original changes. Here is the plan to use CDP and generate only device code with host wrapper..	2013-11-12 12:51:56 +01:00
egaburov	60881499dc	Merge branch 'nvptx' of github.com:egaburov/ispc into nvptx	2013-10-29 15:25:14 +01:00
egaburov	f19cf9274e	Merge remote-tracking branch 'upstream/master' into nvptx	2013-10-29 15:24:40 +01:00
Evghenii	ed9bca0e12	add __soa_to_aos_float1 and __aos_to_soa_float1 builtins	2013-10-29 15:06:08 +01:00
Dmitry Babokin	6585a925be	Merge pull request #641 from jbrodman/stdlibshift Add a "shift" operator to the stdlib.	2013-10-28 14:18:31 -07:00
Evghenii	8391d05697	added blockIndex computations	2013-10-28 10:18:30 +01:00
Evghenii	ac095dbf3e	working on nvptx	2013-10-26 16:12:33 +02:00
james.brodman	899f85ce9c	Initial Support for new stdlib shift operator	2013-10-22 18:06:54 -04:00
egaburov	7e9b4c0924	added avx2-i64x4 and avx1.1-i64x4 targets	2013-10-15 10:02:10 +02:00
egaburov	5d56d29240	merged with master	2013-10-08 19:13:30 +02:00
Evghenii	9861375f0c	renamed avx-i64x4 -> avx1-i64x4	2013-09-13 15:07:14 +02:00
egaburov	7364e06387	added mask64	2013-09-12 12:02:42 +02:00
egaburov	9cf8e8cbf3	builtins fix for double precision svml and __stdlib_asin	2013-09-11 15:23:45 +02:00
egaburov	320c41ffcf	added svml support. experimental. for some reason all symbols are visible..	2013-09-11 15:16:50 +02:00
egaburov	9c79d4d182	added avxh with vectorWidth=4 support, use --target=avxh to enable it	2013-09-11 12:58:02 +02:00
james.brodman	8db378b265	Revert "Remove support for using SVML for math lib routines." This reverts commit `d9c38b5c1f`.	2013-09-04 16:01:58 -04:00
Matt Pharr	cd9afe946c	Merge branch 'master' into arm Conflicts: Makefile builtins.cpp ispc.cpp ispc.h ispc.vcxproj opt.cpp	2013-08-06 17:39:21 -07:00
Matt Pharr	1276ea9844	Revert "Remove support for building with LLVM 3.1" This reverts commit `d3c567503b`. Conflicts: opt.cpp	2013-08-06 17:00:35 -07:00
Dmitry Babokin	dff7735af9	Fix for Windows build and making NEON target optional	2013-08-02 19:24:34 -07:00
Matt Pharr	d9c38b5c1f	Remove support for using SVML for math lib routines. This path was poorly maintained and wasn't actually available on most targets.	2013-07-31 06:56:48 -07:00
Matt Pharr	d3c567503b	Remove support for building with LLVM 3.1	2013-07-31 06:46:45 -07:00
Matt Pharr	48ff03112f	Remove __pause from stdlib_core() in utils.m4. It wasn't ever being used, and was breaking compilation on ARM.	2013-07-30 08:44:22 -07:00
Matt Pharr	ab3b633733	Add 8-bit and 16-bit specialized NEON targets. Like SSE4-8 and SSE4-16, these use 8-bit and 16-bit values for mask elements, respectively, and thus should generate the best code when used for computation with datatypes of those sizes.	2013-07-30 08:44:16 -07:00
egaburov	67b549a937	Added nvptx64 target. Things to do: 1. builtins/target-nvptx64.ll to write, now it is just a copy of target-generic-1.ll 2. add __global__ & __device__ scope 2. make code work for a single cuda thread 3. use tasks to work as a block grid and programIndex as laneIdx, programCount as warpSize 4. ... and more...	2013-07-28 14:31:43 +02:00
Matt Pharr	b6df447b55	Add reduce_add() for int8 and int16 types. This maps to specialized instructions (e.g. PSADBW) when available.	2013-07-25 09:46:01 -07:00

185 Commits