Matt Pharr
40a295e951
Fix bug where "avx-x2" target would cause AVX1.1 to be used.
2012-06-12 13:37:38 -07:00
Matt Pharr
d6c6f95373
Do all replacements of __pseudo* memory ops in a single optimization pass.
...
Collected the old PseudoGSToGSPass and PseudoMaskedStorePass into a single
pass, ReplacePseudoMemoryOpsPass, which handles both of their tasks.
2012-06-12 13:10:03 -07:00
Matt Pharr
19b46be20d
Remove load_and_broadcast from built-ins.
...
Now that we never ever run with the mask all off, we no longer need
that logic in a built-in function so that we can check the mask. In
the one place where it was used (turning gathers to the same location
into a load and broadcast), we now just emit the code for that
directly.
2012-06-12 12:30:57 -07:00
Ingo Wald
789e04ce90
Add support for host/device stub functions for offload.
2012-06-12 10:23:49 -07:00
Matt Pharr
dd4f0a600b
Update AVX1.1 targets to not include declarations of half/float routines in bit code.
2012-06-08 15:57:36 -07:00
Matt Pharr
6c7df4cb6b
Add initial support for "avx1.1" targets for Ivy Bridge.
...
So far, only the use of the float/half conversion instructions distinguishes
this from the "avx1" target.
Partial work on issue #263.
2012-06-08 15:55:00 -07:00
Matt Pharr
79e0a9f32a
Fix codegen bug with foreach_tiled.
...
When the outermost dimension(s) were partially active, but the innermost
dimension was all on, we'd inadvertently use an incorrect "all on"
execution mask.
Fixes issues #177 and #200.
2012-06-08 14:56:18 -07:00
Matt Pharr
6c9bc63a1c
Improve SourcePos reporting of the origin of the gather for gather warnings.
2012-06-08 13:33:11 -07:00
Matt Pharr
28a821df7d
Improve wording of gather/scatter performance warnings.
2012-06-08 13:32:57 -07:00
Matt Pharr
27e39954d6
Fix a number of issues in examples/intrinsics/sse4.h.
...
This had gotten fairly out of date, after recent changes to C++ output.
Roughly 15 tests still fail with this target.
Issue #278.
2012-06-08 12:52:36 -07:00
Matt Pharr
e730a5364b
Issue error if any complex assignment operator is used with a struct type.
...
Issue #275.
2012-06-08 11:29:02 -07:00
Matt Pharr
92b3ae41dd
Don't print request to file bug on fatal error twice.
2012-06-08 11:23:45 -07:00
Matt Pharr
89a2566e01
Add separate variants of memory built-ins for floats and doubles.
...
Previously, we'd bitcast e.g. a vector of floats to a vector of i32s and then
use the i32 variant of masked_load/masked_store/gather/scatter. Now, we have
separate float/double variants of each of those.
2012-06-07 14:47:16 -07:00
Matt Pharr
1ac3e03171
Gather/scatter function improvements in builtins.
...
More naming consistency: _i32 rather than i32, now.
Also improved the m4 macros that generate these sequences so that they
require fewer parameters.
2012-06-07 14:19:23 -07:00
Matt Pharr
b86d40091a
Improve naming of masked load/store instructions in builtins.
...
Now, use _i32 suffixes, rather than _32, etc. Also cleaned up the m4
macro to generate these functions, using WIDTH to get the target width,
etc.
2012-06-07 13:58:31 -07:00
Matt Pharr
91d22d150f
Update load_and_broadcast built-in
...
Change function suffix to "_i32", etc, from "_32"
Improve load_and_broadcast macro in util.m4 to grab vector width from
WIDTH variable rather than taking it as a parameter.
2012-06-07 13:33:17 -07:00
Matt Pharr
1d29991268
Indentation fixes in builtins/
2012-06-07 13:23:07 -07:00
Matt Pharr
6f0a2686dc
Use %a format for printf() for float constants on non-Windows platforms.
2012-06-07 13:20:03 -07:00
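As an illustration of why %a is attractive for emitting float constants, the sketch below prints a float in hexadecimal floating-point form and parses it back. It assumes the motivation is an exact round trip, which the commit message does not state; the helper names FloatToHexLiteral and ParseHexFloat are hypothetical and are not taken from the ispc sources.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: format a float with printf's %a conversion, which
// writes the value as a hexadecimal floating-point number. With no
// precision given, %a emits enough digits to represent the value exactly.
inline std::string FloatToHexLiteral(float v) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%a", v);  // v is promoted to double
    return std::string(buf);
}

// Hypothetical helper: parse the text back; strtof accepts the hex form.
inline float ParseHexFloat(const std::string &s) {
    return std::strtof(s.c_str(), nullptr);
}
```

The exact spelling of the %a output (for example, the leading digit) is implementation-defined, so only the round-trip property is relied on here.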
Matt Pharr
f06caabb07
Generate better code for break statements in varying loops (sometimes).
...
If we have a simple varying 'if' statement where the only code in the body is
a single 'break', then emit special case code that just updates the execution
mask directly.
Surprisingly, this leads to better generated code (e.g. Mandelbrot 7.1x on AVX
vs 5.8x before). It's not clear why the general code generation path for
break doesn't generate the equivalent code; this topic should be investigated
further. (Issue #277).
2012-06-06 11:08:42 -07:00
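The idea of handling a lone 'break' by updating the execution mask can be sketched with a scalar C++ emulation of a gang. This is only a conceptual model, not the code ispc emits: the gang size kWidth, the function CountIterations, and the loop being emulated are all assumptions made for illustration.

```cpp
#include <array>
#include <cstdint>

constexpr int kWidth = 8;  // hypothetical gang size for this sketch

// Scalar emulation of a varying loop of the form
//     while (true) { if (x >= limit) break; ++x; ++count; }
// run across kWidth lanes. Returns the per-lane iteration counts.
inline std::array<int, kWidth> CountIterations(std::array<int, kWidth> x,
                                               int limit) {
    std::array<int, kWidth> count{};
    uint64_t mask = (1ull << kWidth) - 1;  // all lanes start active
    while (mask != 0) {
        // Evaluate the 'if' condition for the currently active lanes.
        uint64_t breakMask = 0;
        for (int lane = 0; lane < kWidth; ++lane)
            if (((mask >> lane) & 1) && x[lane] >= limit)
                breakMask |= 1ull << lane;
        // The 'break' itself: turn those lanes off in the execution mask.
        mask &= ~breakMask;
        // The rest of the loop body runs only for lanes still active.
        for (int lane = 0; lane < kWidth; ++lane)
            if ((mask >> lane) & 1) {
                ++x[lane];
                ++count[lane];
            }
    }
    return count;
}
```

The loop exits once every lane has been masked off, so no per-lane control flow is needed for the break.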
Matt Pharr
3c869802fb
Always store multiply-used vector compares in temporary variables (C++ output).
2012-06-06 11:08:42 -07:00
Matt Pharr
7b6bd90903
Remove various equality checks between GetInternalMask() and LLVMMaskAllOn
...
These were never kicking in, since GetInternalMask() always loads from the
mask storage memory.
2012-06-06 11:08:42 -07:00
Matt Pharr
967bfa9c92
Silence compiler warning.
2012-06-06 08:08:55 -07:00
Matt Pharr
592affb984
Add experimental (and undocumented for now) export syntax.
...
This allows adding types to the list of those included in the
automatically generated header files.
    struct Foo { . . . };
    struct Bar { . . . };
    export { Foo, Bar };
2012-06-05 12:51:21 -07:00
Matt Pharr
96aaf6d53b
Fix build with LLVM top of tree.
2012-06-05 12:28:05 -07:00
Matt Pharr
1397dbdabc
Don't generate colorized output escapes when stderr isn't a TTY.
...
When piping to a file, more/less, etc, this is generally undesirable.
This behavior can be overridden with the --colorized-output command-line
flag.
2012-06-04 09:20:57 -07:00
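A minimal sketch of the TTY check described above, assuming a POSIX system where isatty() is available from <unistd.h>. The functions ShouldColorize and Colorize and the forceColor parameter are hypothetical stand-ins (forceColor plays the role of the --colorized-output flag); they are not the names used in ispc.

```cpp
#include <string>
#include <unistd.h>  // POSIX: isatty(), STDERR_FILENO

// Hypothetical helper: emit color escapes only when stderr is a terminal,
// unless the caller forces them on (standing in for --colorized-output).
inline bool ShouldColorize(bool forceColor) {
    return forceColor || isatty(STDERR_FILENO) != 0;
}

// Hypothetical helper: wrap text in ANSI bold-red escapes if requested;
// otherwise return it unchanged so that piped output stays clean.
inline std::string Colorize(const std::string &text, bool useColor) {
    if (!useColor)
        return text;
    return "\033[1;31m" + text + "\033[0m";
}
```

A caller would compute ShouldColorize() once at startup and pass the result to every place that formats an error or warning.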
Matt Pharr
6118643232
Handle more error cases if the user tries to declare a method.
2012-06-04 09:07:13 -07:00
Matt Pharr
71198a0b54
Don't indent too much in errors/warnings if the filename is long.
2012-06-04 08:53:43 -07:00
Matt Pharr
22cb80399f
Issue error if user tries to declare a method.
2012-06-04 08:50:13 -07:00
Matt Pharr
6df7d31a5b
Fix incorrect assertion.
...
Issue #272.
2012-05-30 16:34:59 -07:00
Matt Pharr
ef049e92ef
Handle undefined struct types when generating headers.
2012-05-30 16:28:21 -07:00
Matt Pharr
fe8b109ca5
Fix more tests for 32 and 64-wide execution.
2012-05-30 13:06:07 -07:00
Matt Pharr
8fd9b84a80
Update seed_rng() in stdlib to take a varying seed.
...
Previously, we were trying to take a uniform seed and then shuffle that
around to initialize the state for each of the program instances. This
was becoming increasingly untenable and brittle.
Now a varying seed is expected and used.
2012-05-30 10:35:41 -07:00
Matt Pharr
5cb53f52c3
Fix various tests/[frs]* files to be correct with 32 and 64-wide targets.
...
Still todo: tests/c*, tests/test-*
2012-05-30 10:31:12 -07:00
Matt Pharr
d86653668e
Fix a number of tests to work correctly with 32/64-wide targets.
...
Still to be reviewed/fixed: tests/test-*, tests/[cfrs]*
2012-05-29 10:16:43 -07:00
Matt Pharr
5084712a15
Fix bugs in examples/intrinsics/generic-64.h
...
There were a number of places where we left-shifted 1 by a lane index;
these failed because the shift went beyond 32 bits. Fixed by shifting
the 64-bit constant value 1ull instead.
2012-05-29 08:31:10 -07:00
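The class of bug fixed in generic-64.h can be illustrated as follows. The helpers LaneBit and SetLane are hypothetical and are not copied from generic-64.h; they only show why the constant being shifted must be 64 bits wide.

```cpp
#include <cstdint>

// Hypothetical helper: a 64-bit mask with only the bit for `lane` set.
// `lane` must be in [0, 63].
inline uint64_t LaneBit(int lane) {
    // 1ull is a 64-bit constant, so the shift is well defined for lanes
    // 32..63. Writing (1 << lane) would shift a 32-bit int instead, which
    // is undefined behavior once lane >= 32.
    return 1ull << lane;
}

// Hypothetical helper: turn one lane of a 64-wide mask on or off.
inline uint64_t SetLane(uint64_t mask, int lane, bool on) {
    return on ? (mask | LaneBit(lane)) : (mask & ~LaneBit(lane));
}
```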
Jean-Luc Duprat
ece65cab18
Fix some tests for up to 64-wide gangs
2012-05-29 07:52:50 -07:00
Matt Pharr
1f6075506c
Fix linux build (Jean-Luc Duprat)
2012-05-28 19:45:16 -07:00
Matt Pharr
51ade48e3d
Fix some of the reduce-* tests for 32 and 64-wide targets
2012-05-25 14:47:06 -07:00
Matt Pharr
21c43737fe
Fix bug in examples/intrinsics/generic-32.h
2012-05-25 14:27:30 -07:00
Matt Pharr
6c7bcf00e7
Add examples/intrinsics/generic-64.h.
2012-05-25 14:27:19 -07:00
Matt Pharr
7a2142075c
Add examples/intrinsics/generic-32.h implementation.
...
Roughly 100 tests fail with this; all the tests need to be audited
for assumptions that 16 is the widest width possible…
2012-05-25 12:37:59 -07:00
Matt Pharr
e8e9baa417
Update test_static.cpp to handle up to 64-wide
2012-05-25 12:14:58 -07:00
Matt Pharr
449d956966
Add support for generic-64 target.
2012-05-25 11:57:28 -07:00
Matt Pharr
90db01d038
Represent MOVMSK'ed masks with int64s rather than int32s.
...
This allows us to scale up to 64-wide execution.
2012-05-25 11:57:23 -07:00
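To show what a 64-bit mask representation makes possible, here is a scalar stand-in for a MOVMSK-style operation that packs one bit per lane into a uint64_t. MoveMask and AllOnMask are hypothetical names, and treating the sign bit of a 32-bit lane as its mask bit is an assumption made for this sketch rather than a description of ispc's internals.

```cpp
#include <cstdint>

// Hypothetical scalar stand-in for MOVMSK: collect the sign bit of each of
// `width` 32-bit lanes into one 64-bit integer. Requires width <= 64.
inline uint64_t MoveMask(const int32_t *lanes, int width) {
    uint64_t bits = 0;
    for (int i = 0; i < width; ++i)
        if (lanes[i] < 0)  // sign bit set means the lane is on
            bits |= 1ull << i;
    return bits;
}

// Hypothetical helper: the "all on" mask for a given width. width == 64 is
// special-cased because shifting a 64-bit value by 64 is undefined.
inline uint64_t AllOnMask(int width) {
    return width >= 64 ? ~0ull : (1ull << width) - 1;
}
```

With a 32-bit result type, lanes 32 through 63 would have nowhere to go, which is why the wider integer is needed for 64-wide execution.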
Matt Pharr
38cea6dc71
Issue error if "typedef" is inadvertently included in function definition.
...
Issue #267.
2012-05-25 11:09:26 -07:00
Matt Pharr
64807dfb3b
Add AssertPos() macro that provides rough source location in error
...
It can sometimes be useful to know the general place we were in the program
when an assertion hit; when the position is available / applicable, this
macro is now used.
Issue #268.
2012-05-25 10:59:45 -07:00
Matt Pharr
d943455e10
Issue error on overloaded "export"ed functions.
...
Issue #270.
2012-05-25 10:35:34 -07:00
Matt Pharr
fd03ba7586
Export reference parameters as C++ references, not pointers.
2012-05-24 07:12:48 -07:00
Matt Pharr
2c5a57e386
Fix bugs related to varying pointers to functions that return void.
2012-05-23 14:29:17 -07:00
Matt Pharr
e8858150cb
Allow redundant semicolons at global scope. (Ingo Wald)
2012-05-23 14:20:20 -07:00