Commit Graph

192 Commits

Author SHA1 Message Date
evghenii
69f3898a61 Merge branch 'master' into nvptx_merge 2014-07-07 16:30:12 +02:00
Dmitry Babokin
dcc37451e5 Removing alias phases causing segfaults 2014-04-17 23:52:32 +04:00
Evghenii
4c837d94a9 replace @llvm.trap with call void asm "trap;", ""() 2014-03-19 16:35:14 +01:00
Evghenii
4641a15287 Merge branch 'master' into nvptx 2014-03-19 10:53:07 +01:00
Dmitry Babokin
31b95b665b Copyright update 2014-03-12 20:19:16 +04:00
Ilia Filippov
47f7900cd3 support LLVM trunk 2014-03-07 16:28:56 +04:00
Ilia Filippov
9ab8f4e10e support LLVM trunk after 202814-202842 revisions 2014-03-05 10:12:30 +04:00
Ilia Filippov
06c06456c4 support LLVM trunk after r202168 r202190 revisions 2014-02-26 17:06:58 +04:00
Evghenii
42c4d3246c Merge branch 'master' into nvptx_clean 2014-02-21 12:45:01 +01:00
Ilia Filippov
42e00ebb24 adding cases of 'cast' instructions in optimizations 2014-02-21 13:00:16 +04:00
Evghenii
690a8acb30 merged with master 2014-02-20 15:22:09 +01:00
Ilia Filippov
e7b3a1c822 fix for flaky problem 'argument out of range' 2014-02-13 16:47:33 +04:00
Evghenii
c8e92feb14 added additional optimization passes for PTX target 2014-02-06 10:11:58 +01:00
Evghenii
7d0aa7a336 added shift 2014-01-22 20:43:53 +01:00
Evghenii
bc99897fbb fixed some examples, found some bugs, and bugs in ptxas/cuda 2014-01-21 14:51:27 +01:00
Evghenii
63d3ac6679 Merge branch 'master' into nvptx 2014-01-20 13:47:24 +01:00
Ilia Filippov
5fa8bd3c78 changes to support LLVM trunk 2014-01-15 14:17:35 +04:00
Evghenii
9389b6e3ef basic optimization path fails 2014-01-10 06:34:44 +01:00
evghenii
9053eed4b4 added basic optimization pass that promotes uniform into varying variables (not array) for nvptx target 2014-01-10 06:32:57 +01:00
Evghenii
0a66f17897 experimental support for non-constant [non-static] uniform arrays mapped to addrspace(3) 2014-01-08 11:06:14 +01:00
Ilia Filippov
4ef38e1615 Adding some optimization passes between two Alias Analysis passes 2013-12-27 19:22:19 +04:00
Dmitry Babokin
5cfd773ec9 Adding Alias Analysis phases 2013-12-26 10:54:05 +04:00
Dmitry Babokin
949984db18 Don't do sext+and optimization for generic targets 2013-12-23 16:31:33 +04:00
Ilia Filippov
b5dc78b06e adding support of shl instruction in lExtractConstantOffset optimization 2013-12-16 16:01:14 +04:00
Dmitry Babokin
2d2d14744b Fixing --opt=force-aligned-memory for LLVM 3.3+ 2013-12-04 19:00:02 +04:00
Dmitry Babokin
be813ea0a2 Select optimization for LLVM 3.3 2013-11-28 21:43:05 +04:00
Ilia Filippov
3fd9d5a025 support of LLVM 3.5 2013-11-21 19:09:43 +04:00
james.brodman
0f7050d3aa More standards compliant. VS doesn't like non-constant-length local arrays. 2013-10-31 19:51:13 -04:00
james.brodman
85eb4cf0d6 Fix logic that looks for shift builtins. 2013-10-29 14:02:32 -04:00
james.brodman
8ee3178166 Add Performance Warning 2013-10-28 16:51:02 -04:00
james.brodman
09a6e37154 Source cleanup. 2013-10-28 16:37:33 -04:00
james.brodman
1b8e745ffe remove condition. Don't use gcc 4.7 for tests. 2013-10-28 16:36:59 -04:00
james.brodman
9ba7b96825 Make the new optimization play nicely with the others. 2013-10-28 16:14:31 -04:00
james.brodman
d2b89e0e37 Tweak generic target. 2013-10-23 18:01:01 -04:00
james.brodman
4d289b16c2 Redesign after being hit with the KISS bat. 2013-10-23 14:25:43 -04:00
james.brodman
899f85ce9c Initial Support for new stdlib shift operator 2013-10-22 18:06:54 -04:00
Matt Pharr
7ab4c5391c Fix build with LLVM 3.2 and generic-4 / examples/sse4.h target. 2013-08-09 19:56:43 -07:00
Matt Pharr
1d76f74b16 Fix compiler warnings 2013-08-07 12:53:39 -07:00
Matt Pharr
5e5d42b918 Fix build with LLVM 3.1 2013-08-06 17:55:37 -07:00
Matt Pharr
cd9afe946c Merge branch 'master' into arm
Conflicts:
	Makefile
	builtins.cpp
	ispc.cpp
	ispc.h
	ispc.vcxproj
	opt.cpp
2013-08-06 17:39:21 -07:00
Matt Pharr
1276ea9844 Revert "Remove support for building with LLVM 3.1"
This reverts commit d3c567503b.

Conflicts:
	opt.cpp
2013-08-06 17:00:35 -07:00
Matt Pharr
ccdbddd388 Add peephole optimization to match int8/int16 averages.
Match the following patterns in IR, turning them into target-specific
intrinsics (e.g. PAVGB on x86) when possible.

(unsigned int8)(((unsigned int16)a + (unsigned int16)b + 1)/2)
(unsigned int8)(((unsigned int16)a + (unsigned int16)b)/2)
(unsigned int16)(((unsigned int32)a + (unsigned int32)b + 1)/2)
(unsigned int16)(((unsigned int32)a + (unsigned int32)b)/2)
(int8)(((int16)a + (int16)b + 1)/2)
(int8)(((int16)a + (int16)b)/2)
(int16)(((int32)a + (int32)b + 1)/2)
(int16)(((int32)a + (int32)b)/2)
2013-08-06 08:59:46 -07:00
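
The unsigned patterns listed in this commit message can be written out as plain scalar C++ to show why the intermediate widening matters. This is only an illustrative sketch: the function names `avg_up_u8`, `avg_down_u8`, `avg_up_u16`, and `avg_down_u16` are hypothetical and are not taken from the repository, and the code models a single lane rather than the vector IR that the peephole pass matches.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar models of the unsigned patterns from the commit message.
// Widening before the add keeps a + b (+ 1) from overflowing the narrow type.

// (unsigned int8)(((unsigned int16)a + (unsigned int16)b + 1) / 2)
inline uint8_t avg_up_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(
        (static_cast<uint16_t>(a) + static_cast<uint16_t>(b) + 1) / 2);
}

// (unsigned int8)(((unsigned int16)a + (unsigned int16)b) / 2)
inline uint8_t avg_down_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(
        (static_cast<uint16_t>(a) + static_cast<uint16_t>(b)) / 2);
}

// (unsigned int16)(((unsigned int32)a + (unsigned int32)b + 1) / 2)
inline uint16_t avg_up_u16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>(
        (static_cast<uint32_t>(a) + static_cast<uint32_t>(b) + 1u) / 2u);
}

// (unsigned int16)(((unsigned int32)a + (unsigned int32)b) / 2)
inline uint16_t avg_down_u16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>(
        (static_cast<uint32_t>(a) + static_cast<uint32_t>(b)) / 2u);
}

// Small self-check: the two variants differ only when a + b is odd.
inline void check_unsigned_averages() {
    assert(avg_up_u8(1, 2) == 2);
    assert(avg_down_u8(1, 2) == 1);
    assert(avg_up_u8(255, 255) == 255);  // would wrap without widening
}
```

Without the widening casts, `255 + 255 + 1` would not fit in 8 bits, which is why the matched patterns all go through the next-wider integer type.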
Matt Pharr
5b20b06bd9 Add avg_{up,down}_int{8,16} routines to stdlib
These compute the average of two given values, rounding up and down,
respectively, if the result isn't exact.  When possible, these are
mapped to target-specific intrinsics (PAVG[BW] on IA and V[R]HADD[US]
on NEON).

A subsequent commit will add pattern-matching to generate calls to
these intrinsics when the corresponding patterns are detected in the
IR.
2013-08-06 08:41:12 -07:00
Ilia Filippov
a174a90f86 Supporting dumping, switching off and debug printing of optimization phases 2013-08-01 11:37:52 +04:00
Matt Pharr
d3c567503b Remove support for building with LLVM 3.1 2013-07-31 06:46:45 -07:00
Matt Pharr
bba84f247c Improved optimization of vector select instructions.
Various LLVM optimization passes are turning code like:

%cmp = icmp lt <8 x i32> %foo, %bar
%cmp32 = sext <8 x i1> %cmp to <8 x i32>
. . .
%cmp1 = trunc <8 x i32> %cmp32 to <8 x i1>
%result = select <8 x i1> %cmp1, . . .

Into:

%cmp = icmp lt <8 x i32> %foo, %bar
%cmp32 = zext <8 x i1> %cmp to <8 x i32>   # note: zext
. . .
%cmp1 = icmp ne <8 x i32> %cmp32, zeroinitializer
%result = select <8 x i1> %cmp1, …

Which in turn isn't matched well by the LLVM code generators, which
in turn leads to fairly inefficient code.  (i.e. it doesn't just emit
a vector compare and blend instruction.)

Also, renamed VSelMovmskOptPass to InstructionSimplifyPass to better
describe its functionality.
2013-07-25 09:46:01 -07:00
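
The two IR shapes quoted in this commit message compute the same select result; they differ only in how the comparison mask is widened and then narrowed again. The sketch below models both forms lane by lane in scalar C++, assuming a signed less-than comparison (the message's `icmp lt` does not name a signedness). The names `Vec`, `select_via_sext_trunc`, and `select_via_zext_icmp` are hypothetical and exist only for this illustration; they are not part of the compiler.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kLanes = 8;  // matches the <8 x i32> vectors in the message
using Vec = std::array<int32_t, kLanes>;

// First form: sext the i1 compare to i32 (true -> -1), later trunc back to i1.
inline Vec select_via_sext_trunc(const Vec& foo, const Vec& bar,
                                 const Vec& on_true, const Vec& on_false) {
    Vec result{};
    for (int i = 0; i < kLanes; ++i) {
        bool cmp = foo[i] < bar[i];        // assumed signed compare
        int32_t cmp32 = cmp ? -1 : 0;      // sext i1 -> i32
        bool cmp1 = (cmp32 & 1) != 0;      // trunc i32 -> i1 keeps the low bit
        result[i] = cmp1 ? on_true[i] : on_false[i];
    }
    return result;
}

// Second form: zext the i1 compare to i32 (true -> 1), later icmp ne with zero.
inline Vec select_via_zext_icmp(const Vec& foo, const Vec& bar,
                                const Vec& on_true, const Vec& on_false) {
    Vec result{};
    for (int i = 0; i < kLanes; ++i) {
        bool cmp = foo[i] < bar[i];        // assumed signed compare
        int32_t cmp32 = cmp ? 1 : 0;       // zext i1 -> i32
        bool cmp1 = cmp32 != 0;            // icmp ne %cmp32, zeroinitializer
        result[i] = cmp1 ? on_true[i] : on_false[i];
    }
    return result;
}
```

Both functions return identical vectors for any input, which is consistent with the message's point that the rewritten form is semantically fine but harder for the code generator to recognize as a plain compare-and-blend.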
Matt Pharr
53414f12e6 Add SSE4 target optimized for computation with 8-bit datatypes.
This change adds a new 'sse4-8' target, where programCount is 16 and
the mask element size is 8 bits (i.e., the most appropriate sizing of
the mask for SIMD computation with 8-bit datatypes).
2013-07-23 17:30:32 -07:00
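
The figure of 16 for programCount is consistent with simple register arithmetic, sketched below. The assumption here, not stated in the commit message, is that the program instance count equals the number of mask elements that fit in one 128-bit SSE register; `kSseRegisterBits` and `lanes_for_element_bits` are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// SSE (XMM) registers are 128 bits wide.
constexpr int kSseRegisterBits = 128;

// Assumed relationship: one program instance per mask element in a register.
constexpr int lanes_for_element_bits(int element_bits) {
    return kSseRegisterBits / element_bits;
}

// An 8-bit mask element gives the 16 instances described for 'sse4-8'.
static_assert(lanes_for_element_bits(8) == 16, "8-bit elements -> 16 lanes");
// For comparison, 32-bit elements would give 4 lanes per register.
static_assert(lanes_for_element_bits(32) == 4, "32-bit elements -> 4 lanes");
```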
Dmitry Babokin
fdcec5a219 Tracking LLVM trunk: removing llvm::createSimplifyLibCallsPass() call 2013-06-24 10:08:06 +04:00
Ilia Filippov
d92f9df17c changes in function LLVMFlattenInsertChain 2013-06-14 15:21:45 +04:00
Dmitry Babokin
eb2e5f378c Comment fixes 2013-04-18 15:36:35 +04:00