Commit Graph

937 Commits

Author SHA1 Message Date
Matt Pharr
e5fe0eabdc Update __load() builtins to take const pointers. 2012-07-06 08:47:47 -07:00
Matt Pharr
0d3993fa25 More varied support for constant vectors from C++ backend.
If we have a vector of all zeros, a __setzero_* function call is emitted,
which permits calling specialized intrinsics for that case.  Undefined values
are reflected with an __undef_* call, which similarly allows passing that
information along.

This change also includes a cleanup to the signature of the __smear_*
functions; since they already have different names depending on the
scalar value type, we don't need to use the trick of passing an
undefined value of the return vector type as the first parameter as
an indirect way to overload by return value.

Issue #317.
2012-07-05 20:19:11 -07:00
Jean-Luc Duprat
ac421f68e2 Ongoing support for int64 for KNC:
Fixes to __load and __store.
Added __add, __mul, __equal, __not_equal, __extract_elements, __smear_i64, __cast_sext, __cast_zext,
and __scatter_base_offsets32_float.

__rcp_varying_float now has both a fast-math and a full-precision implementation.
2012-07-05 17:05:42 -07:00
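
A common way to obtain a full-precision reciprocal from a fast approximation is one Newton-Raphson refinement step. Below is a minimal ISPC-level sketch of the two flavors mentioned in the commit above (illustrative only and assuming the stdlib rcp(); the actual __rcp_varying_float built-ins operate on the KNC vector types):

    float rcpFast(float v) {
        return rcp(v);                 // approximate reciprocal
    }

    float rcpPrecise(float v) {
        float r = rcp(v);
        return r * (2.0f - v * r);     // one Newton-Raphson refinement step
    }
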
Matt Pharr
6aad4c7a39 Bump version number to 1.3.1dev 2012-07-05 13:35:34 -07:00
Matt Pharr
4186ef204d Fix build with LLVM top of tree. 2012-07-05 13:35:01 -07:00
Matt Pharr
ae7a094ee0 Merge pull request #315 from NicolasT/master
Fix build on Fedora 17
2012-07-04 08:21:03 -07:00
Nicolas Trangez
3a007f939a Build: Include unistd.h where required
Some modules require an include of unistd.h (e.g. for getcwd and isatty
definitions).

These changes were required to build successfully on a Fedora 17 system,
using GCC 4.7.0 & glibc-headers 2.15.
2012-07-04 14:49:00 +02:00
Matt Pharr
b8503b9255 News and doxygen version number bump for 1.3.0 v1.3.0 2012-06-29 08:38:38 -07:00
Matt Pharr
b7bc76d3cc Documentation updates for 1.3.0. 2012-06-29 08:35:29 -07:00
Matt Pharr
27d6c12972 Bump ISPC_MINOR_VERSION to 3 2012-06-28 16:15:46 -07:00
Matt Pharr
b69d783e09 Bump version to 1.3.0 2012-06-28 15:35:52 -07:00
Matt Pharr
3b2ff6301c Use fputs() rather than puts() for printing final result from print().
puts() adds an unwanted newline.
2012-06-28 12:29:40 -07:00
Matt Pharr
6c7043916e Silence bogus compiler warning 2012-06-28 12:11:56 -07:00
Matt Pharr
96a6e75b71 Fix issues with LLVM 3.0 and 3.1 build in cbackend.cpp
Should fix issue #312.
2012-06-28 12:11:27 -07:00
Matt Pharr
a91e4e7981 Fix missing semicolons from 66d4c2ddd9 2012-06-28 12:04:58 -07:00
Jean-Luc Duprat
95d8f76ec3 Added preliminary support for Intel's Xeon Phi KNC processor.
float, int32, and double support is included; int8, int16, and int64 are
not supported yet.

This is work in progress and not considered stable yet.
2012-06-28 12:00:55 -07:00
Jean-Luc Duprat
66d4c2ddd9 When the --emit-c++ option is used, the state of the --opt=fast-math option is passed into the generated C++ code.
If --opt=fast-math is used then the generated code contains:
   #define ISPC_FAST_MATH 1
Otherwise it contains:
   #undef ISPC_FAST_MATH

This allows the generic headers to support the user's request.
2012-06-28 11:17:11 -07:00
Jean-Luc Duprat
e431b07e04 Changed the C++ API to use templates to indicate memory alignment to the C++ compiler
This should help with performance of the generated code.
Updated the relevant header files (sse4.h, generic-16.h, generic-32.h, generic-64.h)

Updated generic-32.h and generic-64.h to the new memory API
2012-06-28 09:29:15 -07:00
Matt Pharr
d34a87404d Provide (undocumented for now) __pause() call to emit PAUSE inst. 2012-06-28 09:28:25 -07:00
Matt Pharr
f38770bf2a Fix build with LLVM ToT 2012-06-28 07:36:10 -07:00
Jean-Luc Duprat
2a4dff38d0 cbackend.cpp now makes explicit use of the llvm namespace
(Rather than implicitly with a using declaration.)  This will
allow for some further changes to ISPC's C backend, without collision
with ISPC's namespace. This change aims to have no effect on the code
generated by the compiler; it should be a no-op, aside from its
side effects on maintainability.
2012-06-27 08:30:30 -07:00
Matt Pharr
f558ee788e Fix bug with generating implicit zero initializer values.
Issue #300.
2012-06-26 11:58:16 -07:00
Matt Pharr
ceb8ca680c Fix crash in codegen for assert() with malformed program.
Issue #302.
2012-06-26 11:54:55 -07:00
Matt Pharr
79ebcbec4b Fix crash in SwitchStmt::TypeCheck() with malformed programs. 2012-06-26 11:21:33 -07:00
Matt Pharr
2c7b650240 Add FAQ to explain how to launch per-instance tasks with foreach_active and unmasked.
Issue #227.
2012-06-22 14:32:05 -07:00
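
A minimal hypothetical sketch of the kind of pattern the FAQ describes (names are illustrative): iterate over the active program instances with foreach_active and, with the mask re-enabled via unmasked, launch one task per instance.

    task void perInstanceWork(uniform int idx, uniform float * uniform data) {
        data[idx] *= 2.0f;
    }

    void launchPerInstance(uniform float * uniform data) {
        foreach_active (i) {
            unmasked {
                // one task per running program instance; 'i' is that
                // instance's index, available as a uniform value here
                launch perInstanceWork((uniform int)i, data);
            }
        }
    }
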
Matt Pharr
54459255d4 Add unmasked { } statement.
This reestablishes an "all on" execution mask for the gang, which can
be useful for nested parallelism.
2012-06-22 14:30:58 -07:00
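
A minimal sketch of the new statement (hypothetical example): one typical use is giving every gang member a defined value, including instances that are currently masked off.

    float withDefault(float x) {
        float result;
        unmasked {
            // executes with an "all on" mask, so even instances that were
            // inactive when the function was called get a defined value
            result = 0.0f;
        }
        if (x > 0.0f)
            result = x;
        return result;
    }
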
Matt Pharr
b4a078e2f6 Add foreach_active iteration statement.
Issue #298.
2012-06-22 10:35:43 -07:00
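
A minimal hypothetical sketch of foreach_active: the body runs once per currently-active program instance, with the loop variable holding that instance's index as a uniform value, which makes serialized updates to shared data straightforward.

    void appendActive(uniform int out[], uniform int * uniform count, int value) {
        foreach_active (index) {
            // exactly one program instance executes each iteration
            out[*count] = extract(value, (uniform int)index);
            *count += 1;
        }
    }
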
Matt Pharr
ed13dd066b Distinguish between 'regular' foreach and foreach_unique in FunctionEmitContext
We need to do this since it's illegal to have nested foreach statements, but
nested foreach_unique, or foreach_unique inside foreach, etc., are all fine.
2012-06-22 06:04:00 -07:00
Matt Pharr
2b4a3b22bf Issue an error if the user has nested foreach statements.
Partially addresses issue #280.  (We should support them properly,
but at least now we don't silently generate incorrect code.)
2012-06-21 16:53:27 -07:00
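
For illustration, nesting like the following hypothetical sketch is what now produces a compile-time error instead of silently generating incorrect code:

    void zeroGrid(uniform float a[], uniform int rows, uniform int cols) {
        foreach (i = 0 ... rows) {
            foreach (j = 0 ... cols) {   // error: nested foreach statements
                a[i * cols + j] = 0.0f;
            }
        }
    }
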
Matt Pharr
8b891da628 Allow referring to the struct type being defined in its members.
It's now legal to write:

struct Foo { Foo *next; };

Previously, a predeclaration "struct Foo;" was required.  This fixes
issue #287.

This change also fixes a bug where multiple forward declarations 
"struct Foo; struct Foo;" would incorrectly issue an error on the
second one.
2012-06-21 16:44:04 -07:00
Matt Pharr
5a2c8342eb Allow structs with no members.
Issue #289.
2012-06-21 16:07:31 -07:00
Matt Pharr
50eb4bf53a Change print() implementation to accumulate string locally before printing.
The string to be printed is accumulated into a local buffer before being sent to
puts().  This ensures that if multiple threads are running and printing at the
same time, their output won't be interleaved within an individual print statement
(output may still be interleaved across different print statements, just like in C).

Issue #293.
2012-06-21 14:41:53 -07:00
Matt Pharr
3c10ddd46a Fix declaration of size_t.
It should be an unsigned integer type.
2012-06-21 14:40:24 -07:00
Matt Pharr
0b7f9acc70 Align <16 x i1> vectors to just 16 bits for generic targets.
Partially addresses issue #259.
2012-06-21 10:25:33 -07:00
Matt Pharr
10fbaec247 Fix C++ output for unordered fp compares.
Fixes a bug introduced in 46716aada3.
2012-06-21 09:57:19 -07:00
Matt Pharr
007a734595 Add support for 'unmasked' function qualifier. 2012-06-20 15:36:00 -07:00
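
A minimal hypothetical sketch of the qualifier (assuming it is written as a prefix on the declaration, like other ISPC qualifiers): a function declared unmasked executes with a full "all on" mask rather than inheriting the caller's execution mask.

    unmasked void clearAll(uniform float a[], uniform int count) {
        foreach (i = 0 ... count)
            a[i] = 0.0f;
    }
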
Matt Pharr
46716aada3 Switch to unordered floating point compares.
In particular, this gives us desired behavior for NaNs (all compares
involving a NaN evaluate to true).  This in turn allows writing the
canonical isnan() function as "v != v".

Added isnan() to the standard library as well.
2012-06-20 13:25:53 -07:00
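
A small ISPC sketch of the isnan() idea enabled by unordered compares: since any comparison involving a NaN evaluates to true, "v != v" is true exactly when v is a NaN.

    bool isNaNSketch(float v) {
        return v != v;   // false for ordinary values, true for NaN
    }
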
Matt Pharr
3bc66136b2 Add foreach_unique iteration construct.
Idea via Ingo Wald / IVL compiler.
2012-06-20 10:04:24 -07:00
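
A minimal hypothetical sketch of foreach_unique: the body runs once for each distinct value of the given varying expression among the active program instances, with that value available as a uniform inside the body.

    float firstOfRow(uniform float * uniform rows[], int rowIndex) {
        float result = 0.0f;
        foreach_unique (r in rowIndex) {
            // 'r' is uniform here; only the instances whose rowIndex equals
            // 'r' are active for this iteration
            uniform float * uniform row = rows[r];
            result = row[0];
        }
        return result;
    }
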
Matt Pharr
fae47e0dfc Update stdlib to not use "in" as a variable name.
Preparation for foreach_unique, which uses that as a keyword.
2012-06-20 10:04:24 -07:00
Matt Pharr
bd52e86486 Issue error on attempt to dereference void pointer types.
Issue #288.
2012-06-18 19:51:19 -07:00
Matt Pharr
b2f6ed7209 Fix usage of CastType 2012-06-18 16:26:31 -07:00
Matt Pharr
4b334fd2e2 Fix linkage for programIndex et al. when not debugging.
We now use InternalLinkage for the 'programIndex' symbol (and similar)
if we're not compiling with debugging symbols.  This prevents those
symbol names/definitions from polluting the global namespace for
the common case.

Basically addresses Issue #274.
2012-06-15 11:50:16 -07:00
Matt Pharr
a23a7006e3 Don't issue error incorrectly with forward decl. of exported function.
Issue #281.
2012-06-15 10:54:50 -07:00
Matt Pharr
f47171a17c Don't check for "all off" mask at function entry.
We should never be running with an all off mask and thus should never
enter a function with an all off mask.  No performance change from
removing this, however.

Issue #282.
2012-06-15 10:14:53 -07:00
Matt Pharr
4945dc3682 Add contributors link to docs HTML templates 2012-06-13 06:11:08 -07:00
Matt Pharr
ada66b5313 Make more attempts to pull out constant offsets for gather/scatter.
The "base+offsets" variants of gather decompose the integer offsets into
compile-time constant and compile-time unknown elements.  (The coalescing
optimization, then, depends on this decomposition being done well--having
as much as possible in the constant component.)  We now make multiple
efforts to improve this decomposition as we run optimization passes; in
some cases we're able to move more over to the constant side than was
first possible.

This in particular fixes issue #276, a case where coalescing was expected
but didn't actually happen.
2012-06-12 16:21:14 -07:00
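
A hypothetical illustration of the decomposition: in the access pattern below, the three gathers share the varying part of their offsets (the index scaled by the struct size), while the x/y/z field offsets are compile-time constants; keeping as much as possible on the constant side is what lets the coalescing pass combine nearby accesses.

    struct Point { float x, y, z; };

    float sumFields(uniform Point pts[], int index) {
        return pts[index].x + pts[index].y + pts[index].z;
    }
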
Matt Pharr
96450e17a3 Do all memory op improvements in a single optimization pass.
Rather than having separate passes to do conversion, when possible, of:

- General gather/scatter of a vector of pointers to g/s of
  a base pointer and integer offsets
- Gather/scatter to masked load/store, load+broadcast
- Masked load/store to regular load/store

Now all are done in a single ImproveMemoryOps pass.  This change in
particular addresses some phase-ordering issues that showed up with
multidimensional array accesses: after determining that an outer
dimension had the same index value for all program instances, we previously
weren't able to take advantage of the uniformity of the resulting pointer.
2012-06-12 13:56:17 -07:00
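
A hypothetical sketch of the access pattern described above: when the outer index is the same for every program instance, the selected row is a single uniform pointer, and the inner access can be handled more cheaply than a general gather.

    uniform float grid[64][64];

    float fetch(uniform int row, int col) {
        return grid[row][col];
    }
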
Matt Pharr
40a295e951 Fix bug where "avx-x2" target would cause AVX1.1 to be used. 2012-06-12 13:37:38 -07:00
Matt Pharr
d6c6f95373 Do all replacements of __pseudo* memory ops in a single optimization pass.
Collected the old PseudoGSToGSPass and PseudoMaskedStorePass into a single
pass, ReplacePseudoMemoryOpsPass, which handles both of their tasks.
2012-06-12 13:10:03 -07:00
Matt Pharr
19b46be20d Remove load_and_broadcast from built-ins.
Now that we never run with the mask all off, we no longer need to put
that logic in a built-in function just so that the mask can be checked.  In
the one place where it was used (turning gathers from the same location
into a load and broadcast), we now just emit that code directly.
2012-06-12 12:30:57 -07:00