aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Evghenii	7d0aa7a336	added shift	2014-01-22 20:43:53 +01:00
Evghenii	bc99897fbb	+fixed some example, found some bugs, and bugs in ptxas/cuda	2014-01-21 14:51:27 +01:00
Evghenii	63d3ac6679	Merge branch 'master' into nvptx	2014-01-20 13:47:24 +01:00
Ilia Filippov	5fa8bd3c78	changes for support LLVM trunk	2014-01-15 14:17:35 +04:00
Evghenii	9389b6e3ef	basic optimization path fails	2014-01-10 06:34:44 +01:00
evghenii	9053eed4b4	added basic optimization pass that promotes uniform into varying variables (not array) for nvptx target	2014-01-10 06:32:57 +01:00
Evghenii	0a66f17897	experimental support for non-constant [non-static] uniform arrays mapped to addrspace(3)	2014-01-08 11:06:14 +01:00
Ilia Filippov	4ef38e1615	Adding some optimization passes between two Alias Analysis passes	2013-12-27 19:22:19 +04:00
Dmitry Babokin	5cfd773ec9	Adding Alias Analysis phases	2013-12-26 10:54:05 +04:00
Dmitry Babokin	949984db18	Don't do sext+and optimization for generic targets	2013-12-23 16:31:33 +04:00
Ilia Filippov	b5dc78b06e	adding support of shl instruction in lExtractConstantOffset optimization	2013-12-16 16:01:14 +04:00
Dmitry Babokin	2d2d14744b	Fixing --opt=force-aligned-memory for LLVM 3.3+	2013-12-04 19:00:02 +04:00
Dmitry Babokin	be813ea0a2	Select optimization for LLVM 3.3	2013-11-28 21:43:05 +04:00
Ilia Filippov	3fd9d5a025	support of LLVM 3.5	2013-11-21 19:09:43 +04:00
james.brodman	0f7050d3aa	More stds compliant. VS doesn't like non constant length local arrays.	2013-10-31 19:51:13 -04:00
james.brodman	85eb4cf0d6	Fix logic that looks for shift builtins.	2013-10-29 14:02:32 -04:00
james.brodman	8ee3178166	Add Performance Warning	2013-10-28 16:51:02 -04:00
james.brodman	09a6e37154	Source cleanup.	2013-10-28 16:37:33 -04:00
james.brodman	1b8e745ffe	remove condition. Don't use gcc 4.7 for tests.	2013-10-28 16:36:59 -04:00
james.brodman	9ba7b96825	Make the new optimization play nicely with the others.	2013-10-28 16:14:31 -04:00
james.brodman	d2b89e0e37	Tweak generic target.	2013-10-23 18:01:01 -04:00
james.brodman	4d289b16c2	Redesign after being hit with the KISS bat.	2013-10-23 14:25:43 -04:00
james.brodman	899f85ce9c	Initial Support for new stdlib shift operator	2013-10-22 18:06:54 -04:00
Matt Pharr	7ab4c5391c	Fix build with LLVM 3.2 and generic-4 / examples/sse4.h target.	2013-08-09 19:56:43 -07:00
Matt Pharr	1d76f74b16	Fix compiler warnings	2013-08-07 12:53:39 -07:00
Matt Pharr	5e5d42b918	Fix build with LLVM 3.1	2013-08-06 17:55:37 -07:00
Matt Pharr	cd9afe946c	Merge branch 'master' into arm Conflicts: Makefile builtins.cpp ispc.cpp ispc.h ispc.vcxproj opt.cpp	2013-08-06 17:39:21 -07:00
Matt Pharr	1276ea9844	Revert "Remove support for building with LLVM 3.1" This reverts commit `d3c567503b`. Conflicts: opt.cpp	2013-08-06 17:00:35 -07:00
Matt Pharr	ccdbddd388	Add peephole optimization to match int8/int16 averages. Match the following patterns in IR, turning them into target-specific intrinsics (e.g. PAVGB on x86) when possible. (unsigned int8)(((unsigned int16)a + (unsigned int16)b + 1)/2) (unsigned int8)(((unsigned int16)a + (unsigned int16)b)/2) (unsigned int16)(((unsigned int32)a + (unsigned int32)b + 1)/2) (unsigned int16)(((unsigned int32)a + (unsigned int32)b)/2) (int8)(((int16)a + (int16)b + 1)/2) (int8)(((int16)a + (int16)b)/2) (int16)(((int32)a + (int32)b + 1)/2) (int16)(((int32)a + (int32)b)/2)	2013-08-06 08:59:46 -07:00
Matt Pharr	5b20b06bd9	Add avg_{up,down}_int{8,16} routines to stdlib These compute the average of two given values, rounding up and down, respectively, if the result isn't exact. When possible, these are mapped to target-specific intrinsics (PADD[BW] on IA and VH[R]ADD[US] on NEON.) A subsequent commit will add pattern-matching to generate calls to these intrinsics when the corresponding patterns are detected in the IR.	2013-08-06 08:41:12 -07:00
Ilia Filippov	a174a90f86	Supporting dumping, switching off and debug printing of optimization phases	2013-08-01 11:37:52 +04:00
Matt Pharr	d3c567503b	Remove support for building with LLVM 3.1	2013-07-31 06:46:45 -07:00
Matt Pharr	bba84f247c	Improved optimization of vector select instructions. Various LLVM optimization passes are turning code like: %cmp = icmp lt <8 x i32> %foo, %bar %cmp32 = sext <8 x i1> %cmp to <8 x i32> . . . %cmp1 = trunc <8 x i32> %cmp32 to <8 x i1> %result = select <8 x i1> %cmp1, . . . Into: %cmp = icmp lt <8 x i32> %foo, %bar %cmp32 = zext <8 x i1> %cmp to <8 x i32> # note: zext . . . %cmp1 = icmp ne <8 x i32> %cmp32, zeroinitializer %result = select <8 x i1> %cmp1, … Which in turn isn't matched well by the LLVM code generators, which in turn leads to fairly inefficient code. (i.e. it doesn't just emit a vector compare and blend instruction.) Also, renamed VSelMovmskOptPass to InstructionSimplifyPass to better describe its functionality.	2013-07-25 09:46:01 -07:00
Matt Pharr	53414f12e6	Add SSE4 target optimized for computation with 8-bit datatypes. This change adds a new 'sse4-8' target, where programCount is 16 and the mask element size is 8-bits. (i.e. the most appropriate sizing of the mask for SIMD computation with 8-bit datatypes.)	2013-07-23 17:30:32 -07:00
Dmitry Babokin	fdcec5a219	Tracking LLVM trunk: removing llvm::createSimplifyLibCallsPass() call	2013-06-24 10:08:06 +04:00
Ilia Filippov	d92f9df17c	changes in function LLVMFlattenInsertChain	2013-06-14 15:21:45 +04:00
Dmitry Babokin	eb2e5f378c	Comment fixes	2013-04-18 15:36:35 +04:00
Dmitry Babokin	cb650d6100	One more opportunity to do better broadcast	2013-04-17 20:56:32 +04:00
Dmitry Babokin	5898532605	Broadcast implementation as InsertElement+Shuffle and related improvements	2013-04-10 02:18:24 +04:00
james.brodman	0a3822f2e5	Fix to make sure we're generating 32-bit gather/scatter when force32bitaddressing is set.	2013-04-05 16:21:05 -04:00
Dmitry Babokin	0af2a13349	DataLayout is changed to be managed from single place. v4-128-128 is added to generic DataLayout	2013-03-23 14:38:51 +04:00
Dmitry Babokin	0f86255279	Target class redesign: data moved to private. Also empty target-feature attribute is not added anymore (generic targets).	2013-03-23 14:28:05 +04:00
Dmitry Babokin	3f8a678c5a	Editorial change: fixing trailing white spaces and tabs	2013-03-18 16:17:55 +04:00
james.brodman	3aaf2ef2d4	ToT Fixes / M4 macro fix	2013-01-14 14:55:10 -05:00
Matt Pharr	0bf1320a32	Remove support for building with LLVM 3.0	2013-01-06 12:27:53 -08:00
Matt Pharr	81dbd504aa	Small fixes to eliminate compiler warnings when using clang	2013-01-06 12:10:54 -08:00
Matt Pharr	63dd7d9859	Fix build to work with LLVM top-of-tree again	2013-01-06 12:02:08 -08:00
ptu1	810784da1f	Set the ScalarReplAggregate maximum structure size based on target vector width.	2012-11-13 12:35:45 -08:00
Matt Pharr	172a189c6f	Fix build with LLVM top-of-tree	2012-10-17 11:11:50 -07:00
Matt Pharr	be2108260e	Add --opt=force-aligned-memory option. This forces all vector loads/stores to be done assuming that the given pointer is aligned to the vector size, thus allowing the use of sometimes more-efficient instructions. (If it isn't the case that the memory is aligned, the program will fail!).	2012-09-14 13:49:45 -07:00

179 Commits
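
The averaging patterns listed in commit ccdbddd388 (widen both operands, add, optionally add 1, divide by 2, narrow back) can be written out in plain C as a rough illustration. This is only a sketch of the arithmetic the patterns describe, not code from the ispc source tree; the function names below (`avg_up_u8`, `avg_down_u8`, `avg_up_i8`, `avg_down_i8`) are hypothetical and are not the ispc stdlib names.

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only. Each function widens
 * its operands so the sum cannot overflow, then narrows the result. */

/* Unsigned 8-bit average, rounding up when the sum is odd. */
static uint8_t avg_up_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((uint16_t)a + (uint16_t)b + 1) / 2);
}

/* Unsigned 8-bit average, rounding down when the sum is odd. */
static uint8_t avg_down_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((uint16_t)a + (uint16_t)b) / 2);
}

/* Signed 8-bit variant of the "+ 1" pattern. C integer division
 * truncates toward zero, which matters for negative sums. */
static int8_t avg_up_i8(int8_t a, int8_t b) {
    return (int8_t)(((int16_t)a + (int16_t)b + 1) / 2);
}

/* Signed 8-bit variant without the "+ 1". */
static int8_t avg_down_i8(int8_t a, int8_t b) {
    return (int8_t)(((int16_t)a + (int16_t)b) / 2);
}
```

The widening step is what makes the pattern recognizable: computing `a + b` directly in 8 bits would wrap for inputs such as 255 and 255, whereas the widened form stays exact. The 16-bit patterns in the commit message follow the same shape with 32-bit intermediates.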