Ilia Filippov
e1524891fc
Merge pull request #751 from Vsevolod-Livinskij/master
...
Saturating multiplication for int64 was added.
2014-03-12 00:12:34 -07:00
Ilia Filippov
6738af0a0c
changing uniform_min and uniform_max implementations for avx targets
2014-03-06 12:05:24 +04:00
Vsevolod Livinskij
af836cda27
Saturating multiplication for int64 was added.
2014-02-23 19:48:03 +04:00
Dmitry Babokin
e8680760bf
Merge pull request #741 from Vsevolod-Livinskij/master
...
Saturation arithmetic.
2014-02-21 12:30:58 +03:00
Dmitry Babokin
17d8047a93
Merge pull request #746 from ifilippov/master
...
Adding cases of 'cast' instructions in optimizations
2014-02-21 12:25:58 +03:00
Ilia Filippov
42e00ebb24
adding cases of 'cast' instructions in optimizations
2014-02-21 13:00:16 +04:00
Vsevolod Livinskij
7dd7020c5f
Decimal constants were replaced with hex constants.
2014-02-20 22:57:24 +04:00
Dmitry Babokin
f280b32fa4
Merge pull request #736 from egaburov/native_trigonometry
...
Native trigonometry
2014-02-20 19:18:35 +03:00
Vsevolod Livinskij
735e6a8ab3
Saturation arithmetic mul and div for int8/int16/int32 and div for int64 were added
2014-02-18 02:07:13 +04:00
Vsevolod Livinskij
f5508db24f
Saturation arithmetic (sub and add) was added for int32/int64.
2014-02-17 18:55:40 +04:00
evghenii
193bba77b0
accuracy fix
2014-02-11 11:49:03 +01:00
Evghenii
f0779f95a3
added double precision tests
2014-02-11 11:40:40 +01:00
Vsevolod Livinskij
cef5b2eb04
Some changes in saturation arithmetic
2014-02-10 12:40:53 +04:00
Evghenii
fe98fe8cdc
added fast approximate rcp(double) accurate to 15 digits
2014-02-04 15:23:34 +01:00
Evghenii
eb1a495a7a
added support for fast approximate rsqrt(double). Provides 16-digit accuracy and is over 3x faster than 1/sqrt(double)
2014-02-04 14:44:54 +01:00
Evghenii
4515dd5c89
added tests for rcp/rsqrt double
2014-02-02 18:19:56 +01:00
Vsevolod Livinskij
97cc5b7f48
Added varying CFG and non-overflow part of the tests.
2014-01-06 15:24:09 +04:00
Vsevolod Livinskij
07c6f1714a
Some fixes in function names; more tests were added.
2013-12-22 19:28:26 +04:00
Dmitry Babokin
d666fc3f8f
Merge pull request #686 from ifilippov/ttt
...
packed_store_active2() - tuned version of packed_store_active()
2013-12-17 09:23:39 -08:00
Ilia Filippov
473f1cb4d2
packed_store_active2
2013-12-17 21:14:29 +04:00
Dmitry Babokin
6d51987e67
Merge pull request #642 from egaburov/launch3d
...
concept of 3d tasking
2013-12-17 08:40:07 -08:00
Evghenii
59b989d243
fix for --target=sse4-i18x16
2013-12-17 16:06:20 +01:00
Vsevolod Livinskij
9a135c48d9
Functions name change
2013-12-09 00:20:52 +04:00
Vsevolod Livinskij
ea94658411
Some saturation tests fixes
2013-12-06 17:20:37 +04:00
Vsevolod Livinskij
65768c20ae
Added tests for saturation and some fixes for generic and avx target
2013-12-05 00:34:14 +04:00
Ilia Filippov
4579d339ea
patch for LLVM 3.3 and test correction at avx2
2013-11-18 13:53:21 +04:00
james.brodman
ec17082864
Add unittest.
2013-10-30 17:21:10 -04:00
Dmitry Babokin
6585a925be
Merge pull request #641 from jbrodman/stdlibshift
...
Add a "shift" operator to the stdlib.
2013-10-28 14:18:31 -07:00
Evghenii
84a7a5d1cb
added tests for 3d launch
2013-10-26 16:16:28 +02:00
Ilia Filippov
814ee67519
patch and regression test for problem with vzeroupper
2013-10-24 16:03:55 +04:00
james.brodman
f97a2d68c8
Bugfix for non-const shift amt and unit tests.
2013-10-22 18:29:20 -04:00
Ilia Filippov
2e724b095e
support of operators
2013-10-18 13:45:15 +04:00
Dmitry Babokin
b2678b4338
Typo fix in tests/double-consts.ispc
2013-09-19 17:27:58 +04:00
Dmitry Babokin
1c527ae34c
Adding tests and vim support for double constant of the form .1d41
2013-09-19 12:49:45 +04:00
Dmitry Babokin
f45f6cb32a
Test, documentation and vim support for double precision constants
2013-09-19 12:49:45 +04:00
Matt Pharr
502f8fd76b
Reduce debug spew on failing idiv.ispc tests
2013-08-20 09:22:09 -07:00
Matt Pharr
d976da7559
Speed up idiv test (don't test int32 as thoroughly)
2013-08-20 08:49:51 -07:00
Matt Pharr
5b20b06bd9
Add avg_{up,down}_int{8,16} routines to stdlib
...
These compute the average of two given values, rounding up and down,
respectively, if the result isn't exact. When possible, these are
mapped to target-specific intrinsics (PADD[BW] on IA and VH[R]ADD[US]
on NEON.)
A subsequent commit will add pattern-matching to generate calls to
these intrinsics when the corresponding patterns are detected in the
IR.
2013-08-06 08:41:12 -07:00
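For illustration, the rounding behaviour described in the commit above can be sketched in plain C. This is a minimal sketch, not the stdlib code: the names avg_up_u8 and avg_down_u8 are hypothetical, and the unsigned 8-bit case is assumed so that the sum can be formed in a wider type without overflow.

```c
#include <stdint.h>

/* Hypothetical helper: average of two unsigned 8-bit values,
   rounding up when the sum is odd. The sum is formed in a wider
   type so that 255 + 255 + 1 does not wrap. */
static inline uint8_t avg_up_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((uint16_t)a + (uint16_t)b + 1u) >> 1);
}

/* Hypothetical helper: same average, rounding down when the sum is odd. */
static inline uint8_t avg_down_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((uint16_t)a + (uint16_t)b) >> 1);
}
```

With these definitions, averaging 1 and 2 gives 2 when rounding up and 1 when rounding down; the two results agree whenever the sum is even.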
Matt Pharr
b6df447b55
Add reduce_add() for int8 and int16 types.
...
This maps to specialized instructions (e.g. PSADBW) when available.
2013-07-25 09:46:01 -07:00
Matt Pharr
15a3ef370a
Use @llvm.readcyclecounter to implement stdlib clock() function.
...
Also added a test for the clock builtin.
2013-07-23 17:24:57 -07:00
Matt Pharr
f7f281a256
Choose type for integer literals to match the target mask size (if possible).
...
On a target with a 16-bit mask (for example), we would choose the type
of an integer literal "1024" to be an int16. Previously, we used an int32,
which is a worse fit and leads to less efficient code than an int16
on a 16-bit mask target. (However, we'd still give an integer literal
1000000 the type int32, even in a 16-bit target.)
Updated the tests to still pass with 8 and 16-bit targets, given this
change.
2013-07-23 17:24:50 -07:00
Matt Pharr
83e1630fbc
Add support for fast division of varying int values by small constants.
...
For varying int8/16/32 types, divides by small constants can be
implemented efficiently through multiplies and shifts with integer
types of twice the bit-width; this commit adds this optimization.
(Implementation is based on Halide.)
2013-07-23 16:49:56 -07:00
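For illustration, the multiply-and-shift idea in the commit above can be sketched in scalar C for unsigned 8-bit values, using a 16-bit intermediate (twice the bit-width). This is a minimal sketch under stated assumptions, not the Halide-derived implementation: the function names and the constants 171 (ceil(2^9 / 3)) and 205 (ceil(2^10 / 5)) are chosen here for the example, and check_all_u8 is a hypothetical helper that compares every input against the ordinary '/' operator.

```c
#include <stdint.h>

/* n / 3 for an unsigned 8-bit n, via a 16-bit multiply and a shift.
   171 = ceil(2^9 / 3); the largest product, 255 * 171 = 43605,
   still fits in 16 bits. */
static inline uint8_t div3_u8(uint8_t n) {
    uint16_t wide = (uint16_t)((uint16_t)n * 171u);
    return (uint8_t)(wide >> 9);
}

/* n / 5 for an unsigned 8-bit n.
   205 = ceil(2^10 / 5); the largest product, 255 * 205 = 52275,
   still fits in 16 bits. */
static inline uint8_t div5_u8(uint8_t n) {
    uint16_t wide = (uint16_t)((uint16_t)n * 205u);
    return (uint8_t)(wide >> 10);
}

/* Compare both routines against '/' for all 256 inputs.
   Returns 1 if every result agrees, 0 otherwise. */
static int check_all_u8(void) {
    for (int n = 0; n <= 255; ++n) {
        if (div3_u8((uint8_t)n) != n / 3) return 0;
        if (div5_u8((uint8_t)n) != n / 5) return 0;
    }
    return 1;
}
```

Not every divisor works with a single multiplier that fits in the doubled width (7 is a common counterexample for 8-bit inputs), which is why real implementations select a per-divisor strategy.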
james.brodman
42d77e9191
Modified to mirror asin.ispc and not fail.
2013-01-08 14:33:32 -05:00
Matt Pharr
411d5b44ef
Add ISPC_HAS_RAND definition on targets that have a HW RNG.
...
This lets us check for a functioning rdrand() call in the stdlib
more reliably. Fixes issue #333.
2012-10-03 09:18:12 -07:00
Matt Pharr
ddcd0a49ec
Fix bugs with handling of 'continue' statements in foreach_* loops.
2012-09-05 10:16:58 -07:00
Matt Pharr
43364b2d69
Loosen tolerances so the test passes with FMA on AVX2
2012-08-10 06:52:14 -07:00
Matt Pharr
8f5189f606
Type convert arrays in select expressions to pointers to the first element.
...
Fixes issue #345.
2012-08-03 11:53:59 -07:00
Matt Pharr
d180031ef0
Add more tests of basic gather functionality.
2012-07-12 14:05:38 -07:00
Matt Pharr
2c640f7e52
Add support for RDRAND in IvyBridge.
...
The standard library now provides a variety of rdrand() functions
that call out to RDRAND, when available.
Issue #263.
2012-07-12 06:07:07 -07:00
Matt Pharr
926b3b9ee3
Fix bugs with mask-handling for switch/do/for/while statements.
...
All of these pass the current mask to FunctionEmitContext::SetBlockEntryMask()
so that when a break/continue/return is encountered, it can test to see if all
lanes have followed that path and then return; this in turn ensures that we never
run statements with an all-off execution mask.
These functions were passing the function internal mask, not the full mask, and
thus could end up executing code with the mask all off if some lanes were
disabled by an outer function. (The new tests test this case.)
2012-07-09 15:13:30 -07:00