Evghenii
4196c723eb
merged with nvptx
2014-02-20 11:01:58 +01:00
evghenii
732a315a4b
removed __declspec(safe) duplicate
2014-02-05 13:04:45 +01:00
Evghenii
686c1d676d
improvements
2014-02-05 12:04:36 +01:00
Evghenii
d3a6693eef
adding __have_native_{rsqrtd,rcpd} to select between native support for double precision reciprocals and using slower but safe version in stdlib
2014-02-04 16:29:23 +01:00
Evghenii
fe98fe8cdc
added fast approximate rcp(double) accurate to 15 digits
2014-02-04 15:23:34 +01:00
Evghenii
eb1a495a7a
added support for fast approximate rsqrt(double). Provides 16-digit accuracy and is over 3x faster than 1/sqrt(double)
2014-02-04 14:44:54 +01:00
Evghenii
b0753dc93d
added double-version for rcp
2014-02-02 18:20:05 +01:00
evghenii
3a72e05c3e
+1
2014-02-02 18:16:48 +01:00
Evghenii
5a6b650d8b
restored nonptx atomic_*_local
2014-01-28 15:56:30 +01:00
Evghenii
a3b00fdcd6
added support for global atomics
2014-01-26 14:23:26 +01:00
Evghenii
a7d4a3f922
fix for __any
2014-01-26 13:15:13 +01:00
Evghenii
fcbdd93043
half/scan for 64 bit/clock/num_cores and other additions
2014-01-25 16:43:33 +01:00
Evghenii
be6ac0408a
added compile-time constant __is_nvptx_target that can be used with stdlib.ispc
2014-01-24 09:02:12 +01:00
Evghenii
1cf1dab649
fixed foreach_unique and local_atomics
2014-01-23 21:57:20 +01:00
Evghenii
f86de2be78
fix: laneIndex() must be varying
2014-01-09 09:41:57 +01:00
Evghenii
d77789d8fe
+merged with master
2013-12-18 11:37:01 +01:00
Ilia Filippov
473f1cb4d2
packed_store_active2
2013-12-17 21:14:29 +04:00
Evghenii
589538bf39
added stencil code
2013-11-18 12:04:00 +01:00
Evghenii
3dd6173a65
added packed_store_active that can be called with active flag
2013-11-11 12:25:15 +01:00
Evghenii
426afc7377
added workable .cu files for stencil & mandelbrot
2013-11-08 10:00:49 +01:00
egaburov
f19cf9274e
Merge remote-tracking branch 'upstream/master' into nvptx
2013-10-29 15:24:40 +01:00
Evghenii
8391d05697
added blockIndex computations
2013-10-28 10:18:30 +01:00
james.brodman
4d289b16c2
Redesign after being hit with the KISS bat.
2013-10-23 14:25:43 -04:00
james.brodman
899f85ce9c
Initial Support for new stdlib shift operator
2013-10-22 18:06:54 -04:00
Evghenii
6fd21d988d
fixed lexer to properly read fortran-notation double constants
2013-09-16 17:15:02 +02:00
egaburov
e2a91e6de5
added support for "d"-suffix
2013-09-16 15:54:32 +02:00
Evghenii
36886971e3
revert lex.ll parse.yy stdlib.ispc to state when all constants are floats
2013-09-13 16:02:53 +02:00
Evghenii
a97eb7b7cb
added clamp in double precision
2013-09-13 09:32:59 +02:00
egaburov
7364e06387
added mask64
2013-09-12 12:02:42 +02:00
egaburov
320c41ffcf
added svml support. experimental. for some reason all symbols are visible.
2013-09-11 15:16:50 +02:00
james.brodman
8db378b265
Revert "Remove support for using SVML for math lib routines."
...
This reverts commit d9c38b5c1f.
2013-09-04 16:01:58 -04:00
Dmitry Babokin
e06267ef1b
Fix for incorrect implementation of reduce_[min|max]_[float|double]; it showed up at -O0
2013-08-29 16:16:02 +04:00
Matt Pharr
5b20b06bd9
Add avg_{up,down}_int{8,16} routines to stdlib
...
These compute the average of two given values, rounding up and down,
respectively, if the result isn't exact. When possible, these are
mapped to target-specific intrinsics (PAVG[BW] on IA and VH[R]ADD[US]
on NEON.)
A subsequent commit will add pattern-matching to generate calls to
these intrinsics when the corresponding patterns are detected in the
IR.
2013-08-06 08:41:12 -07:00
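The rounding behavior described in the commit above can be sketched in scalar C. This is an illustration only: the function names are hypothetical, it covers just the unsigned 8-bit case, and it is not the stdlib or intrinsic implementation. The sum is formed in a wider type so that it cannot overflow before the shift.

```c
#include <stdint.h>

/* Hypothetical scalar stand-ins for the averaging routines; the names
 * are illustrative and are not the stdlib signatures. */

/* Average of a and b, rounding up when the exact result is not an integer. */
static uint8_t avg_up_u8(uint8_t a, uint8_t b) {
    /* Widen first so a + b + 1 cannot wrap around in 8 bits. */
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* Average of a and b, rounding down when the exact result is not an integer. */
static uint8_t avg_down_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((unsigned)a + (unsigned)b) >> 1);
}
```

For example, avg_up_u8(1, 2) gives 2 and avg_down_u8(1, 2) gives 1, while both return the same value when the sum is even.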
Matt Pharr
d9c38b5c1f
Remove support for using SVML for math lib routines.
...
This path was poorly maintained and wasn't actually available on most
targets.
2013-07-31 06:56:48 -07:00
Matt Pharr
b6df447b55
Add reduce_add() for int8 and int16 types.
...
This maps to specialized instructions (e.g. PSADBW) when available.
2013-07-25 09:46:01 -07:00
Matt Pharr
f7f281a256
Choose type for integer literals to match the target mask size (if possible).
...
On a target with a 16-bit mask (for example), we would choose the type
of an integer literal "1024" to be an int16. Previously, we used an int32,
which is a worse fit and leads to less efficient code than an int16
on a 16-bit mask target. (However, we'd still give an integer literal
1000000 the type int32, even in a 16-bit target.)
Updated the tests to still pass with 8 and 16-bit targets, given this
change.
2013-07-23 17:24:50 -07:00
Matt Pharr
9ba49eabb2
Reduce estimated costs for 8 and 16-bit min() and max() in stdlib.
...
These actually compile to a single instruction.
2013-07-23 16:52:43 -07:00
Matt Pharr
e7abf3f2ea
Add support for mask vectors of 8 and 16-bit element types.
...
There were a number of places throughout the system that assumed that the
execution mask would only have either 32-bit or 1-bit elements. This
commit makes it possible to have a target with an 8- or 16-bit mask.
2013-07-23 16:50:11 -07:00
Matt Pharr
83e1630fbc
Add support for fast division of varying int values by small constants.
...
For varying int8/16/32 types, divides by small constants can be
implemented efficiently through multiplies and shifts with integer
types of twice the bit-width; this commit adds this optimization.
(Implementation is based on Halide.)
2013-07-23 16:49:56 -07:00
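The multiply-and-shift idea from the commit above can be illustrated for one case: dividing an unsigned 8-bit value by 3 using a 16-bit intermediate. This is a sketch under stated assumptions, not the code from the commit or from Halide: the function name is hypothetical, and the constants 171 and 9 are chosen here because 171 = ceil(2^9 / 3).

```c
#include <stdint.h>

/* Hypothetical helper: computes x / 3 for any uint8_t x without a divide.
 * 171 * 3 = 513 = 2^9 + 1, so (x * 171) >> 9 equals floor(x / 3) as long
 * as x < 512, which holds for every 8-bit value. */
static uint8_t div3_u8(uint8_t x) {
    /* The product is at most 255 * 171 = 43605, which fits in 16 bits. */
    uint16_t wide = (uint16_t)((uint16_t)x * 171u);
    return (uint8_t)(wide >> 9);
}
```

The intermediate type is twice the width of the operand, matching the description in the commit message. Other divisors and signed types need their own multiplier and shift, and possibly a correction step.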
Jean-Luc Duprat
6326924de7
Fixes to the implementations of any() and none() in the stdlib.
...
These make sure that inactive vector lanes do not interfere with the results.
2013-01-18 11:19:54 -08:00
Jean-Luc Duprat
24087ff3cc
Expose none() in the ISPC standard library.
...
On KNC: all(), any() and none() do not generate a redundant movmsk instruction.
2012-11-27 13:38:28 -08:00
Matt Pharr
6412876f64
Remove unused __reduce_add_uint{32,64} target functions.
...
The stdlib code just calls the signed int{32,64} functions,
which gives the right result for the unsigned case anyway.
The various targets didn't consistently define the unsigned
variants in any case.
2012-09-28 05:55:41 -07:00
Matt Pharr
2c640f7e52
Add support for RDRAND in IvyBridge.
...
The standard library now provides a variety of rdrand() functions
that call out to RDRAND, when available.
Issue #263.
2012-07-12 06:07:07 -07:00
Matt Pharr
b4a078e2f6
Add foreach_active iteration statement.
...
Issue #298.
2012-06-22 10:35:43 -07:00
Matt Pharr
46716aada3
Switch to unordered floating point compares.
...
In particular, this gives us desired behavior for NaNs (all compares
involving a NaN evaluate to true). This in turn allows writing the
canonical isnan() function as "v != v".
Added isnan() to the standard library as well.
2012-06-20 13:25:53 -07:00
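The "v != v" idiom mentioned in the commit above can be tried in plain C, where != on floating point values already yields true when either operand is NaN. A minimal sketch; the function name is hypothetical and this is not the stdlib implementation.

```c
#include <math.h>

/* Hypothetical helper: returns 1 if v is NaN, 0 otherwise.
 * NaN is the only value that compares unequal to itself. */
static int isnan_by_compare(float v) {
    return v != v;
}
```

Note that this relies on IEEE comparison semantics; compiler options that assume no NaNs (such as fast-math modes) may fold the comparison to false.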
Matt Pharr
fae47e0dfc
Update stdlib to not use "in" as a variable name.
...
Preparation for foreach_unique, which uses that as a keyword.
2012-06-20 10:04:24 -07:00
Matt Pharr
8fd9b84a80
Update seed_rng() in stdlib to take a varying seed.
...
Previously, we were trying to take a uniform seed and then shuffle that
around to initialize the state for each of the program instances. This
was becoming increasingly untenable and brittle.
Now a varying seed is expected and used.
2012-05-30 10:35:41 -07:00
Matt Pharr
90db01d038
Represent MOVMSK'ed masks with int64s rather than int32s.
...
This allows us to scale up to 64-wide execution.
2012-05-25 11:57:23 -07:00
Matt Pharr
0c1b206185
Pass log/exp/pow transcendentals through to targets that support them.
...
Currently, this is the generic targets.
2012-05-03 13:49:56 -07:00
Matt Pharr
0c5d7ff8f2
Add rygorous's float->srgb8 conversion routine to the stdlib.
...
Issue #230
2012-04-27 10:03:19 -10:00