Commit Graph

160 Commits

Author SHA1 Message Date
Evghenii
b7b5c9ad1d it is illegal to pass a varying parameter to a task function with nvptx target 2014-01-27 10:30:09 +01:00
Evghenii
1c2dbd6a27 a fix for .b0 ptx and some other code improvements 2014-01-27 08:51:05 +01:00
Evghenii
4ecf30530a fixed for operator2 with nvptx target 2014-01-26 15:08:25 +01:00
Evghenii
fcbdd93043 half/scan for 64 bit/clock/num_cores and other additions 2014-01-25 16:43:33 +01:00
Evghenii
9090d8b128 added support for assert 2014-01-24 12:18:20 +01:00
Evghenii
5a8351d7ea added varying new/delete 2014-01-24 09:22:55 +01:00
Evghenii
da7a2c0c7f added emulation of "soa" data types via shared-memory 2014-01-23 16:17:06 +01:00
Evghenii
2e7609156a fixes for exclusive_scan_and/or_i32 and shuffle2 and __movmsk 2014-01-23 10:24:44 +01:00
Evghenii
06313e0ec3 exclusive_scan_and is supported, but must be called outside if-statements. in principle others must do the same 2014-01-22 22:12:51 +01:00
Evghenii
5376743281 added "const" before "static uniform" in constant folding tests 2014-01-21 14:59:25 +01:00
Evghenii
215abab544 bugfix 2014-01-21 14:55:41 +01:00
Evghenii
bc99897fbb +fixed some examples, found some bugs, and bugs in ptxas/cuda 2014-01-21 14:51:27 +01:00
Evghenii
5a773ed62a some cfor tests fixes for > 16 lanes 2014-01-20 16:42:33 +01:00
Evghenii
4581f10207 some changes 2014-01-20 13:46:49 +01:00
Evghenii
de4d66c56f added addrspace(4)/constant memory for const uniform declarations 2014-01-08 13:27:24 +01:00
Evghenii
8347c766f0 added uniform memory test. 2014-01-08 11:16:51 +01:00
Dmitry Babokin
d666fc3f8f Merge pull request #686 from ifilippov/ttt
packed_store_active2() - tuned version of packed_store_active()
2013-12-17 09:23:39 -08:00
Ilia Filippov
473f1cb4d2 packed_store_active2 2013-12-17 21:14:29 +04:00
Dmitry Babokin
6d51987e67 Merge pull request #642 from egaburov/launch3d
concept of 3d tasking
2013-12-17 08:40:07 -08:00
Evghenii
59b989d243 fix for --target=sse4-i18x16 2013-12-17 16:06:20 +01:00
Ilia Filippov
4579d339ea patch for LLVM 3.3 and test correction at avx2 2013-11-18 13:53:21 +04:00
james.brodman
ec17082864 Add unittest. 2013-10-30 17:21:10 -04:00
Dmitry Babokin
6585a925be Merge pull request #641 from jbrodman/stdlibshift
Add a "shift" operator to the stdlib.
2013-10-28 14:18:31 -07:00
Evghenii
84a7a5d1cb added tests for 3d launch 2013-10-26 16:16:28 +02:00
Ilia Filippov
814ee67519 patch and regression test for problem with vzeroupper 2013-10-24 16:03:55 +04:00
james.brodman
f97a2d68c8 Bugfix for non-const shift amt and unit tests. 2013-10-22 18:29:20 -04:00
Ilia Filippov
2e724b095e support of operators 2013-10-18 13:45:15 +04:00
Dmitry Babokin
b2678b4338 Typo fix in tests/double-consts.ispc 2013-09-19 17:27:58 +04:00
Dmitry Babokin
1c527ae34c Adding tests and vim support for double constant of the form .1d41 2013-09-19 12:49:45 +04:00
Dmitry Babokin
f45f6cb32a Test, documentation and vim support for double precision constants 2013-09-19 12:49:45 +04:00
Matt Pharr
502f8fd76b Reduce debug spew on failing idiv.ispc tests 2013-08-20 09:22:09 -07:00
Matt Pharr
d976da7559 Speed up idiv test (don't test int32 as thoroughly) 2013-08-20 08:49:51 -07:00
Matt Pharr
5b20b06bd9 Add avg_{up,down}_int{8,16} routines to stdlib
These compute the average of two given values, rounding up and down,
respectively, if the result isn't exact.  When possible, these are
mapped to target-specific intrinsics (PAVG[BW] on IA and V[R]HADD[US]
on NEON).

A subsequent commit will add pattern-matching to generate calls to
these intrinsics when the corresponding patterns are detected in the
IR.
2013-08-06 08:41:12 -07:00
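The rounding behavior described in commit 5b20b06bd9 can be illustrated with a scalar sketch. The Python below is only a reference model for unsigned 8-bit inputs. The function names are hypothetical and are not the ispc stdlib signatures; the real routines operate on varying values and may lower to target intrinsics.

```python
def avg_up_uint8(a: int, b: int) -> int:
    """Average of two unsigned 8-bit values, rounding up when the sum is odd."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    # The sum needs 9 bits, so it must be formed in a wider type than the
    # inputs. Python integers are arbitrary precision, so this is automatic.
    return (a + b + 1) >> 1


def avg_down_uint8(a: int, b: int) -> int:
    """Average of two unsigned 8-bit values, rounding down when the sum is odd."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    return (a + b) >> 1


if __name__ == "__main__":
    # 1 and 2 have the inexact average 1.5, so the two variants differ.
    print(avg_up_uint8(1, 2), avg_down_uint8(1, 2))
```

When the sum is even the two functions agree, which is why the commit message only distinguishes them "if the result isn't exact".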
Matt Pharr
b6df447b55 Add reduce_add() for int8 and int16 types.
This maps to specialized instructions (e.g. PSADBW) when available.
2013-07-25 09:46:01 -07:00
Matt Pharr
15a3ef370a Use @llvm.readcyclecounter to implement stdlib clock() function.
Also added a test for the clock builtin.
2013-07-23 17:24:57 -07:00
Matt Pharr
f7f281a256 Choose type for integer literals to match the target mask size (if possible).
On a target with a 16-bit mask (for example), we would choose the type
of an integer literal "1024" to be an int16.  Previously, we used an int32,
which is a worse fit and leads to less efficient code than an int16
on a 16-bit mask target.  (However, we'd still give an integer literal
1000000 the type int32, even in a 16-bit target.)

Updated the tests to still pass with 8 and 16-bit targets, given this
change.
2013-07-23 17:24:50 -07:00
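One way to read the literal-typing rule in commit f7f281a256 is sketched below. This is an assumption-laden model, not the compiler's actual logic: the function name `literal_type`, the list of candidate widths, and the "start at the mask width and widen until the value fits" rule are all hypothetical. Only the two 16-bit-mask cases (1024 and 1000000) come from the commit message.

```python
def literal_type(value: int, mask_bits: int) -> str:
    """Pick a signed integer type for a literal (hypothetical rule).

    Assumption: try the type whose width equals the target mask width
    first, then widen until the value fits.
    """
    for bits in (8, 16, 32, 64):
        if bits < mask_bits:
            continue  # never go narrower than the mask width
        if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
            return f"int{bits}"
    raise ValueError("literal does not fit in any supported integer type")


if __name__ == "__main__":
    print(literal_type(1024, 16))     # fits in 16 bits on a 16-bit mask target
    print(literal_type(1000000, 16))  # too large for int16, so it widens
```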
Matt Pharr
83e1630fbc Add support for fast division of varying int values by small constants.
For varying int8/16/32 types, divides by small constants can be
implemented efficiently through multiplies and shifts with integer
types of twice the bit-width; this commit adds this optimization.
    
(Implementation is based on Halide.)
2013-07-23 16:49:56 -07:00
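The multiply-and-shift idea in commit 83e1630fbc can be shown with a small scalar sketch for unsigned 8-bit values, using a 16-bit intermediate (twice the bit width). The constants below are standard reciprocal approximations chosen for illustration. They are not taken from the ispc or Halide sources, and the helper name is hypothetical. Signed types, and divisors that need an extra fix-up step (7 is one such case at this width), are left out.

```python
def div_u8_by_const(x: int, multiplier: int, shift: int) -> int:
    """Divide an unsigned 8-bit value by a constant via multiply and shift."""
    assert 0 <= x <= 0xFF
    # Emulate a 16-bit intermediate: the product must not overflow it.
    product = (x * multiplier) & 0xFFFF
    return product >> shift


# (multiplier, shift) pairs with multiplier = ceil(2**shift / divisor).
DIV3 = (171, 9)    # 171 / 512  is slightly above 1/3
DIV5 = (205, 10)   # 205 / 1024 is slightly above 1/5

if __name__ == "__main__":
    # Exhaustive check over the whole 8-bit input range.
    ok3 = all(div_u8_by_const(x, *DIV3) == x // 3 for x in range(256))
    ok5 = all(div_u8_by_const(x, *DIV5) == x // 5 for x in range(256))
    print(ok3, ok5)
```

Because the input range is small, a candidate pair can be validated exhaustively, which is a cheap way to gain confidence in a constant before relying on it.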
james.brodman
42d77e9191 Modified to mirror asin.ispc and not fail. 2013-01-08 14:33:32 -05:00
Matt Pharr
411d5b44ef Add ISPC_HAS_RAND definition on targets that have a HW RNG.
This lets us check for a functioning rdrand() call in the stdlib
more reliably.  Fixes issue #333.
2012-10-03 09:18:12 -07:00
Matt Pharr
ddcd0a49ec Fix bugs with handling of 'continue' statements in foreach_* loops. 2012-09-05 10:16:58 -07:00
Matt Pharr
43364b2d69 Loosen tolerances so test passes with FMA on AVX2 2012-08-10 06:52:14 -07:00
Matt Pharr
8f5189f606 Type convert arrays in select expressions to pointers to the first element.
Fixes issue #345.
2012-08-03 11:53:59 -07:00
Matt Pharr
d180031ef0 Add more tests of basic gather functionality. 2012-07-12 14:05:38 -07:00
Matt Pharr
2c640f7e52 Add support for RDRAND in IvyBridge.
The standard library now provides a variety of rdrand() functions
that call out to RDRAND, when available.

Issue #263.
2012-07-12 06:07:07 -07:00
Matt Pharr
926b3b9ee3 Fix bugs with mask-handling for switch/do/for/while statements.
All of these pass the current mask to FunctionEmitContext::SetBlockEntryMask()
so that when a break/continue/return is encountered, it can test to see if all
lanes have followed that path and then return; this in turn ensures that we never
run statements with an all-off execution mask.

These functions were passing the function internal mask, not the full mask, and
thus could end up executing code with the mask all off if some lanes were
disabled by an outer function.  (The new tests test this case.)
2012-07-09 15:13:30 -07:00
Matt Pharr
950a989744 Add test that was supposed to go with 080241b7d1 2012-07-09 08:21:15 -07:00
Matt Pharr
54459255d4 Add unmasked { } statement.
This reestablishes an "all on" execution mask for the gang, which can
be useful for nested parallelism.
2012-06-22 14:30:58 -07:00
Matt Pharr
b4a078e2f6 Add foreach_active iteration statement.
Issue #298.
2012-06-22 10:35:43 -07:00
Matt Pharr
5a2c8342eb Allow structs with no members.
Issue #289.
2012-06-21 16:07:31 -07:00
Matt Pharr
46716aada3 Switch to unordered floating point compares.
In particular, this gives us desired behavior for NaNs (all compares
involving a NaN evaluate to true).  This in turn allows writing the
canonical isnan() function as "v != v".

Added isnan() to the standard library as well.
2012-06-20 13:25:53 -07:00
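The "v != v" idiom mentioned in commit 46716aada3 relies on a NaN comparing unequal to itself. The sketch below shows the same idiom in Python, whose floats follow IEEE 754. It is an illustration only, not the ispc stdlib implementation, and the function name is hypothetical.

```python
def isnan_by_self_compare(v: float) -> bool:
    """Return True only for NaN: a NaN is the one value not equal to itself."""
    return v != v


if __name__ == "__main__":
    print(isnan_by_self_compare(float("nan")))  # NaN
    print(isnan_by_self_compare(1.5))           # ordinary finite value
    print(isnan_by_self_compare(float("inf")))  # infinity is not NaN
```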