Evghenii
b7b5c9ad1d
it is illegal to pass a varying parameter to a task function with the nvptx target
2014-01-27 10:30:09 +01:00
Evghenii
1c2dbd6a27
a fix for .b0 ptx and some other code improvements
2014-01-27 08:51:05 +01:00
Evghenii
4ecf30530a
fixed for operator2 with nvptx target
2014-01-26 15:08:25 +01:00
Evghenii
fcbdd93043
half/scan for 64 bit/clock/num_cores and other additions
2014-01-25 16:43:33 +01:00
Evghenii
9090d8b128
added support for assert
2014-01-24 12:18:20 +01:00
Evghenii
5a8351d7ea
added varying new/delete
2014-01-24 09:22:55 +01:00
Evghenii
da7a2c0c7f
added emulation of "soa" data types via shared-memory
2014-01-23 16:17:06 +01:00
Evghenii
2e7609156a
fixes for exclusive_scan_and/or_i32 and shuffle2 and __movmsk
2014-01-23 10:24:44 +01:00
Evghenii
06313e0ec3
exclusive_scan_and is supported, but must be called outside if-statements. In principle, others must do the same
2014-01-22 22:12:51 +01:00
Evghenii
5376743281
added "const" before "static uniform" in constant folding tests
2014-01-21 14:59:25 +01:00
Evghenii
215abab544
bugfix
2014-01-21 14:55:41 +01:00
Evghenii
bc99897fbb
+fixed some examples, found some bugs, and bugs in ptxas/cuda
2014-01-21 14:51:27 +01:00
Evghenii
5a773ed62a
some cfor tests fixes for > 16 lanes
2014-01-20 16:42:33 +01:00
Evghenii
4581f10207
some changes
2014-01-20 13:46:49 +01:00
Evghenii
de4d66c56f
added addrspace(4)/constant memory for const uniform declarations
2014-01-08 13:27:24 +01:00
Evghenii
8347c766f0
added uniform memory test.
2014-01-08 11:16:51 +01:00
Dmitry Babokin
d666fc3f8f
Merge pull request #686 from ifilippov/ttt
...
packed_store_active2() - tuned version of packed_store_active()
2013-12-17 09:23:39 -08:00
Ilia Filippov
473f1cb4d2
packed_store_active2
2013-12-17 21:14:29 +04:00
Dmitry Babokin
6d51987e67
Merge pull request #642 from egaburov/launch3d
...
concept of 3d tasking
2013-12-17 08:40:07 -08:00
Evghenii
59b989d243
fix for --target=sse4-i18x16
2013-12-17 16:06:20 +01:00
Ilia Filippov
4579d339ea
patch for LLVM 3.3 and test correction at avx2
2013-11-18 13:53:21 +04:00
james.brodman
ec17082864
Add unittest.
2013-10-30 17:21:10 -04:00
Dmitry Babokin
6585a925be
Merge pull request #641 from jbrodman/stdlibshift
...
Add a "shift" operator to the stdlib.
2013-10-28 14:18:31 -07:00
Evghenii
84a7a5d1cb
added tests for 3d launch
2013-10-26 16:16:28 +02:00
Ilia Filippov
814ee67519
patch and regression test for problem with vzeroupper
2013-10-24 16:03:55 +04:00
james.brodman
f97a2d68c8
Bugfix for non-const shift amt and unit tests.
2013-10-22 18:29:20 -04:00
Ilia Filippov
2e724b095e
support of operators
2013-10-18 13:45:15 +04:00
Dmitry Babokin
b2678b4338
Typo fix in tests/double-consts.ispc
2013-09-19 17:27:58 +04:00
Dmitry Babokin
1c527ae34c
Adding tests and vim support for double constant of the form .1d41
2013-09-19 12:49:45 +04:00
Dmitry Babokin
f45f6cb32a
Test, documentation and vim support for double precision constants
2013-09-19 12:49:45 +04:00
Matt Pharr
502f8fd76b
Reduce debug spew on failing idiv.ispc tests
2013-08-20 09:22:09 -07:00
Matt Pharr
d976da7559
Speed up idiv test (don't test int32 as thoroughly)
2013-08-20 08:49:51 -07:00
Matt Pharr
5b20b06bd9
Add avg_{up,down}_int{8,16} routines to stdlib
...
These compute the average of two given values, rounding up and down,
respectively, if the result isn't exact. When possible, these are
mapped to target-specific intrinsics (PADD[BW] on IA and VH[R]ADD[US]
on NEON.)
A subsequent commit will add pattern-matching to generate calls to
these intrinsics when the corresponding patterns are detected in the
IR.
2013-08-06 08:41:12 -07:00
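The rounding behavior described in the avg_{up,down} entry above (5b20b06bd9) can be illustrated with a minimal sketch. It assumes unsigned 8-bit inputs and computes the sum in a wider integer so it cannot overflow. The function names `avg_up_uint8` and `avg_down_uint8` are hypothetical and are not the stdlib's actual signatures.

```python
def _check_uint8(v: int) -> None:
    # Hypothetical helper: the sketch only models unsigned 8-bit lanes.
    if not 0 <= v <= 255:
        raise ValueError("value out of uint8 range")


def avg_up_uint8(a: int, b: int) -> int:
    """Average of a and b, rounded up when the exact result is x.5."""
    _check_uint8(a)
    _check_uint8(b)
    # Adding 1 before the shift rounds an inexact average upward.
    return (a + b + 1) >> 1


def avg_down_uint8(a: int, b: int) -> int:
    """Average of a and b, rounded down when the exact result is x.5."""
    _check_uint8(a)
    _check_uint8(b)
    return (a + b) >> 1
```

When the exact average is an integer, both functions return the same value. They differ by one only when `a + b` is odd.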
Matt Pharr
b6df447b55
Add reduce_add() for int8 and int16 types.
...
This maps to specialized instructions (e.g. PSADBW) when available.
2013-07-25 09:46:01 -07:00
Matt Pharr
15a3ef370a
Use @llvm.readcyclecounter to implement stdlib clock() function.
...
Also added a test for the clock builtin.
2013-07-23 17:24:57 -07:00
Matt Pharr
f7f281a256
Choose type for integer literals to match the target mask size (if possible).
...
On a target with a 16-bit mask (for example), we would choose the type
of an integer literal "1024" to be an int16. Previously, we used an int32,
which is a worse fit and leads to less efficient code than an int16
on a 16-bit mask target. (However, we'd still give an integer literal
1000000 the type int32, even in a 16-bit target.)
Updated the tests to still pass with 8 and 16-bit targets, given this
change.
2013-07-23 17:24:50 -07:00
Matt Pharr
83e1630fbc
Add support for fast division of varying int values by small constants.
...
For varying int8/16/32 types, divides by small constants can be
implemented efficiently through multiplies and shifts with integer
types of twice the bit-width; this commit adds this optimization.
(Implementation is based on Halide.)
2013-07-23 16:49:56 -07:00
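The multiply-and-shift idea in the fast-division entry above (83e1630fbc) can be sketched as follows. This is a generic round-up magic-number construction for unsigned 8-bit values, not the Halide-derived code the commit refers to. The names `magic_for_divisor` and `fast_div` are hypothetical. Python integers are arbitrary precision, so the wider intermediate product is implicit here. For some divisors the multiplier needs 9 bits, which real implementations work around to stay within twice the bit width.

```python
def magic_for_divisor(d: int, bits: int = 8) -> tuple[int, int]:
    """Return (multiplier, shift) with (x * multiplier) >> shift == x // d
    for every unsigned x representable in `bits` bits."""
    if d <= 0:
        raise ValueError("divisor must be positive")
    # shift = bits + ceil(log2(d)); (d - 1).bit_length() equals ceil(log2(d)).
    shift = bits + (d - 1).bit_length()
    # multiplier = ceil(2**shift / d), computed with integer arithmetic only.
    multiplier = -(-(1 << shift) // d)
    return multiplier, shift


def fast_div(x: int, multiplier: int, shift: int) -> int:
    """Divide by the constant encoded in (multiplier, shift) without a divide."""
    return (x * multiplier) >> shift


# Usage: divide every uint8 value by 7 using one multiply and one shift.
m7, s7 = magic_for_divisor(7)
quotients = [fast_div(x, m7, s7) for x in range(256)]
```

The multiplier and shift depend only on the constant divisor, so they can be computed once at compile time. Each lane then needs just a multiply and a shift at run time.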
james.brodman
42d77e9191
Modified to mirror asin.ispc and not fail.
2013-01-08 14:33:32 -05:00
Matt Pharr
411d5b44ef
Add ISPC_HAS_RAND definition on targets that have a HW RNG.
...
This lets us check for a functioning rdrand() call in the stdlib
more reliably. Fixes issue #333.
2012-10-03 09:18:12 -07:00
Matt Pharr
ddcd0a49ec
Fix bugs with handling of 'continue' statements in foreach_* loops.
2012-09-05 10:16:58 -07:00
Matt Pharr
43364b2d69
Loosen tolerances so test passes with FMA on AVX2
2012-08-10 06:52:14 -07:00
Matt Pharr
8f5189f606
Type convert arrays in select expressions to pointers to the first element.
...
Fixes issue #345.
2012-08-03 11:53:59 -07:00
Matt Pharr
d180031ef0
Add more tests of basic gather functionality.
2012-07-12 14:05:38 -07:00
Matt Pharr
2c640f7e52
Add support for RDRAND in IvyBridge.
...
The standard library now provides a variety of rdrand() functions
that call out to RDRAND, when available.
Issue #263.
2012-07-12 06:07:07 -07:00
Matt Pharr
926b3b9ee3
Fix bugs with mask-handling for switch/do/for/while statements.
...
All of these pass the current mask to FunctionEmitContext::SetBlockEntryMask()
so that when a break/continue/return is encountered, it can test to see if all
lanes have followed that path and then return; this in turn ensures that we never
run statements with an all-off execution mask.
These functions were passing the function internal mask, not the full mask, and
thus could end up executing code with the mask all off if some lanes were
disabled by an outer function. (The new tests test this case.)
2012-07-09 15:13:30 -07:00
Matt Pharr
950a989744
Add test that was supposed to go with 080241b7d1
2012-07-09 08:21:15 -07:00
Matt Pharr
54459255d4
Add unmasked { } statement.
...
This reestablishes an "all on" execution mask for the gang, which can
be useful for nested parallelism.
2012-06-22 14:30:58 -07:00
Matt Pharr
b4a078e2f6
Add foreach_active iteration statement.
...
Issue #298.
2012-06-22 10:35:43 -07:00
Matt Pharr
5a2c8342eb
Allow structs with no members.
...
Issue #289.
2012-06-21 16:07:31 -07:00
Matt Pharr
46716aada3
Switch to unordered floating point compares.
...
In particular, this gives us desired behavior for NaNs (all compares
involving a NaN evaluate to true). This in turn allows writing the
canonical isnan() function as "v != v".
Added isnan() to the standard library as well.
2012-06-20 13:25:53 -07:00
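The NaN behavior described in the unordered-compare entry above (46716aada3) can be illustrated with a small sketch. Python has no unordered comparison operators, so `unordered_less_than` below is a hypothetical emulation of the semantics (true whenever either operand is NaN). `isnan_by_self_compare` shows the "v != v" idiom the commit mentions. Neither name comes from the ispc standard library.

```python
import math


def isnan_by_self_compare(v: float) -> bool:
    # NaN is the only floating-point value that compares unequal to itself.
    return v != v


def unordered_less_than(a: float, b: float) -> bool:
    # Emulated unordered "<": true if either operand is NaN, otherwise a < b.
    return math.isnan(a) or math.isnan(b) or a < b
```

For ordinary values, `unordered_less_than` agrees with the usual `<`. The two differ only when a NaN is involved.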