Evghenii
690a8acb30
merged with master
2014-02-20 15:22:09 +01:00
Ilia Filippov
e7b3a1c822
fix for flaky problem 'argument out of range'
2014-02-13 16:47:33 +04:00
Evghenii
c8e92feb14
added additional optimization passes for PTX target
2014-02-06 10:11:58 +01:00
Evghenii
7d0aa7a336
added shift
2014-01-22 20:43:53 +01:00
Evghenii
bc99897fbb
fixed some examples, found some bugs, and bugs in ptxas/cuda
2014-01-21 14:51:27 +01:00
Evghenii
63d3ac6679
Merge branch 'master' into nvptx
2014-01-20 13:47:24 +01:00
Ilia Filippov
5fa8bd3c78
changes to support LLVM trunk
2014-01-15 14:17:35 +04:00
Evghenii
9389b6e3ef
basic optimization path fails
2014-01-10 06:34:44 +01:00
evghenii
9053eed4b4
added basic optimization pass that promotes uniform into varying variables (not array) for nvptx target
2014-01-10 06:32:57 +01:00
Evghenii
0a66f17897
experimental support for non-constant [non-static] uniform arrays mapped to addrspace(3)
2014-01-08 11:06:14 +01:00
Ilia Filippov
4ef38e1615
Adding some optimization passes between two Alias Analysis passes
2013-12-27 19:22:19 +04:00
Dmitry Babokin
5cfd773ec9
Adding Alias Analysis phases
2013-12-26 10:54:05 +04:00
Dmitry Babokin
949984db18
Don't do sext+and optimization for generic targets
2013-12-23 16:31:33 +04:00
Ilia Filippov
b5dc78b06e
adding support of shl instruction in lExtractConstantOffset optimization
2013-12-16 16:01:14 +04:00
Dmitry Babokin
2d2d14744b
Fixing --opt=force-aligned-memory for LLVM 3.3+
2013-12-04 19:00:02 +04:00
Dmitry Babokin
be813ea0a2
Select optimization for LLVM 3.3
2013-11-28 21:43:05 +04:00
Ilia Filippov
3fd9d5a025
support of LLVM 3.5
2013-11-21 19:09:43 +04:00
james.brodman
0f7050d3aa
More standards-compliant. VS doesn't like non-constant-length local arrays.
2013-10-31 19:51:13 -04:00
james.brodman
85eb4cf0d6
Fix logic that looks for shift builtins.
2013-10-29 14:02:32 -04:00
james.brodman
8ee3178166
Add Performance Warning
2013-10-28 16:51:02 -04:00
james.brodman
09a6e37154
Source cleanup.
2013-10-28 16:37:33 -04:00
james.brodman
1b8e745ffe
remove condition. Don't use gcc 4.7 for tests.
2013-10-28 16:36:59 -04:00
james.brodman
9ba7b96825
Make the new optimization play nicely with the others.
2013-10-28 16:14:31 -04:00
james.brodman
d2b89e0e37
Tweak generic target.
2013-10-23 18:01:01 -04:00
james.brodman
4d289b16c2
Redesign after being hit with the KISS bat.
2013-10-23 14:25:43 -04:00
james.brodman
899f85ce9c
Initial Support for new stdlib shift operator
2013-10-22 18:06:54 -04:00
Matt Pharr
7ab4c5391c
Fix build with LLVM 3.2 and generic-4 / examples/sse4.h target.
2013-08-09 19:56:43 -07:00
Matt Pharr
1d76f74b16
Fix compiler warnings
2013-08-07 12:53:39 -07:00
Matt Pharr
5e5d42b918
Fix build with LLVM 3.1
2013-08-06 17:55:37 -07:00
Matt Pharr
cd9afe946c
Merge branch 'master' into arm
Conflicts:
Makefile
builtins.cpp
ispc.cpp
ispc.h
ispc.vcxproj
opt.cpp
2013-08-06 17:39:21 -07:00
Matt Pharr
1276ea9844
Revert "Remove support for building with LLVM 3.1"
This reverts commit d3c567503b.
Conflicts:
opt.cpp
2013-08-06 17:00:35 -07:00
Matt Pharr
ccdbddd388
Add peephole optimization to match int8/int16 averages.
Match the following patterns in IR, turning them into target-specific
intrinsics (e.g. PAVGB on x86) when possible.
(unsigned int8)(((unsigned int16)a + (unsigned int16)b + 1)/2)
(unsigned int8)(((unsigned int16)a + (unsigned int16)b)/2)
(unsigned int16)(((unsigned int32)a + (unsigned int32)b + 1)/2)
(unsigned int16)(((unsigned int32)a + (unsigned int32)b)/2)
(int8)(((int16)a + (int16)b + 1)/2)
(int8)(((int16)a + (int16)b)/2)
(int16)(((int32)a + (int32)b + 1)/2)
(int16)(((int32)a + (int32)b)/2)
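As a rough scalar illustration of the unsigned patterns listed above, the sketch below widens both operands, adds them (plus 1 for the rounding-up variant), halves the sum, and narrows the result. The function names are hypothetical and chosen for this example only; this is not the pass itself, just the arithmetic that the matched IR expresses. The signed patterns are left out because their rounding depends on how the division is lowered, which the commit message does not state.

```cpp
#include <cstdint>

// Hypothetical helpers mirroring the unsigned patterns in the commit message.
// Widening before the add keeps the intermediate sum from overflowing.

// (unsigned int8)(((unsigned int16)a + (unsigned int16)b + 1)/2)
inline uint8_t avg_up_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(
        (static_cast<uint16_t>(a) + static_cast<uint16_t>(b) + 1) / 2);
}

// (unsigned int8)(((unsigned int16)a + (unsigned int16)b)/2)
inline uint8_t avg_down_u8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(
        (static_cast<uint16_t>(a) + static_cast<uint16_t>(b)) / 2);
}

// (unsigned int16)(((unsigned int32)a + (unsigned int32)b + 1)/2)
inline uint16_t avg_up_u16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>(
        (static_cast<uint32_t>(a) + static_cast<uint32_t>(b) + 1) / 2);
}

// (unsigned int16)(((unsigned int32)a + (unsigned int32)b)/2)
inline uint16_t avg_down_u16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>(
        (static_cast<uint32_t>(a) + static_cast<uint32_t>(b)) / 2);
}
```

The two variants differ only when the sum is odd: for 1 and 2, the rounding-up form gives 2 and the rounding-down form gives 1.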
2013-08-06 08:59:46 -07:00
Matt Pharr
5b20b06bd9
Add avg_{up,down}_int{8,16} routines to stdlib
These compute the average of two given values, rounding up and down,
respectively, if the result isn't exact. When possible, these are
mapped to target-specific intrinsics (PAVG[BW] on IA and VH[R]ADD[US]
on NEON).
A subsequent commit will add pattern-matching to generate calls to
these intrinsics when the corresponding patterns are detected in the
IR.
2013-08-06 08:41:12 -07:00
Ilia Filippov
a174a90f86
Supporting dumping, switching off and debug printing of optimization phases
2013-08-01 11:37:52 +04:00
Matt Pharr
d3c567503b
Remove support for building with LLVM 3.1
2013-07-31 06:46:45 -07:00
Matt Pharr
bba84f247c
Improved optimization of vector select instructions.
Various LLVM optimization passes are turning code like:
%cmp = icmp lt <8 x i32> %foo, %bar
%cmp32 = sext <8 x i1> %cmp to <8 x i32>
. . .
%cmp1 = trunc <8 x i32> %cmp32 to <8 x i1>
%result = select <8 x i1> %cmp1, . . .
Into:
%cmp = icmp lt <8 x i32> %foo, %bar
%cmp32 = zext <8 x i1> %cmp to <8 x i32> # note: zext
. . .
%cmp1 = icmp ne <8 x i32> %cmp32, zeroinitializer
%result = select <8 x i1> %cmp1, . . .
The LLVM code generators don't match this second form well, which
leads to fairly inefficient code (i.e. they don't just emit a vector
compare and blend instruction).
Also, renamed VSelMovmskOptPass to InstructionSimplifyPass to better
describe its functionality.
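A minimal per-lane model of the two IR sequences above, written as plain C++ with hypothetical function names. It is not the pass's implementation; it only illustrates why both sequences recover the original compare result, so the select can be driven by %cmp directly.

```cpp
#include <cstdint>

// First form, one lane: sext i1 -> i32, then trunc i32 -> i1.
inline bool lane_sext_trunc(bool cmp) {
    int32_t widened = cmp ? -1 : 0;   // sext: true becomes all ones
    return (widened & 1) != 0;        // trunc to i1 keeps only the low bit
}

// Second form, one lane: zext i1 -> i32, then icmp ne against zero.
inline bool lane_zext_icmp_ne(bool cmp) {
    int32_t widened = cmp ? 1 : 0;    // zext: true becomes 1
    return widened != 0;              // icmp ne %cmp32, zeroinitializer
}

// One lane of the select itself.
inline int32_t lane_select(bool mask, int32_t if_true, int32_t if_false) {
    return mask ? if_true : if_false;
}
```

For either input value, both helpers return the input unchanged, so `lane_select` picks the same operand regardless of which widening sequence produced the mask.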
2013-07-25 09:46:01 -07:00
Matt Pharr
53414f12e6
Add SSE4 target optimized for computation with 8-bit datatypes.
This change adds a new 'sse4-8' target, where programCount is 16 and
the mask element size is 8 bits (i.e. the most appropriate sizing of
the mask for SIMD computation with 8-bit datatypes).
2013-07-23 17:30:32 -07:00
Dmitry Babokin
fdcec5a219
Tracking LLVM trunk: removing llvm::createSimplifyLibCallsPass() call
2013-06-24 10:08:06 +04:00
Ilia Filippov
d92f9df17c
changes in function LLVMFlattenInsertChain
2013-06-14 15:21:45 +04:00
Dmitry Babokin
eb2e5f378c
Comment fixes
2013-04-18 15:36:35 +04:00
Dmitry Babokin
cb650d6100
One more opportunity to do better broadcast
2013-04-17 20:56:32 +04:00
Dmitry Babokin
5898532605
Broadcast implementation as InsertElement+Shuffle and related improvements
2013-04-10 02:18:24 +04:00
james.brodman
0a3822f2e5
Fix to make sure we're generating 32-bit gather/scatter when force32bitaddressing is set.
2013-04-05 16:21:05 -04:00
Dmitry Babokin
0af2a13349
DataLayout is changed to be managed from a single place. v4-128-128 is added to generic DataLayout
2013-03-23 14:38:51 +04:00
Dmitry Babokin
0f86255279
Target class redesign: data moved to private. Also empty target-feature attribute is not added anymore (generic targets).
2013-03-23 14:28:05 +04:00
Dmitry Babokin
3f8a678c5a
Editorial change: fixing trailing white spaces and tabs
2013-03-18 16:17:55 +04:00
james.brodman
3aaf2ef2d4
ToT Fixes / M4 macro fix
2013-01-14 14:55:10 -05:00
Matt Pharr
0bf1320a32
Remove support for building with LLVM 3.0
2013-01-06 12:27:53 -08:00
Matt Pharr
81dbd504aa
Small fixes to eliminate compiler warnings when using clang
2013-01-06 12:10:54 -08:00
Matt Pharr
63dd7d9859
Fix build to work with LLVM top-of-tree again
2013-01-06 12:02:08 -08:00