Commit Graph

82 Commits

Author SHA1 Message Date
Ilia Filippov
473f1cb4d2 packed_store_active2 2013-12-17 21:14:29 +04:00
james.brodman
4d289b16c2 Redesign after being hit with the KISS bat. 2013-10-23 14:25:43 -04:00
james.brodman
899f85ce9c Initial Support for new stdlib shift operator 2013-10-22 18:06:54 -04:00
Evghenii
6fd21d988d Fixed lexer to properly read Fortran-notation double constants 2013-09-16 17:15:02 +02:00
egaburov
e2a91e6de5 added support for "d"-suffix 2013-09-16 15:54:32 +02:00
Evghenii
36886971e3 Revert lex.ll, parse.yy, stdlib.ispc to the state where all constants are floats 2013-09-13 16:02:53 +02:00
Evghenii
a97eb7b7cb added clamp in double precision 2013-09-13 09:32:59 +02:00
egaburov
7364e06387 added mask64 2013-09-12 12:02:42 +02:00
egaburov
320c41ffcf Added SVML support (experimental); for some reason all symbols are visible. 2013-09-11 15:16:50 +02:00
james.brodman
8db378b265 Revert "Remove support for using SVML for math lib routines."
This reverts commit d9c38b5c1f.
2013-09-04 16:01:58 -04:00
Dmitry Babokin
e06267ef1b Fix for incorrect implementation of reduce_[min|max]_[float|double], which showed up at -O0 2013-08-29 16:16:02 +04:00
Matt Pharr
5b20b06bd9 Add avg_{up,down}_int{8,16} routines to stdlib
These compute the average of two given values, rounding up and down,
respectively, if the result isn't exact.  When possible, these are
mapped to target-specific intrinsics (PAVG[BW] on IA and VH[R]ADD[US]
on NEON).

A subsequent commit will add pattern-matching to generate calls to
these intrinsics when the corresponding patterns are detected in the
IR.
2013-08-06 08:41:12 -07:00
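For reference, both averages can be computed without intermediate overflow via the classic bitwise identities; a minimal ISPC sketch of an equivalent formulation (not the committed code, which prefers the intrinsics above):

   // floor((a + b) / 2), computed without overflowing int8
   int8 avg_down(int8 a, int8 b) {
       return (a & b) + ((a ^ b) >> 1);
   }
   // ceil((a + b) / 2)
   int8 avg_up(int8 a, int8 b) {
       return (a | b) - ((a ^ b) >> 1);
   }
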
Matt Pharr
d9c38b5c1f Remove support for using SVML for math lib routines.
This path was poorly maintained and wasn't actually available on most
targets.
2013-07-31 06:56:48 -07:00
Matt Pharr
b6df447b55 Add reduce_add() for int8 and int16 types.
This maps to specialized instructions (e.g. PSADBW) when available.
2013-07-25 09:46:01 -07:00
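Usage sketch for the new reduction (the widened uniform result type here is an assumption):

   varying int8 v = (int8)programIndex;
   // sums v across all active program instances; on SSE this can
   // lower to PSADBW against a zero vector
   uniform int16 total = reduce_add(v);
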
Matt Pharr
f7f281a256 Choose type for integer literals to match the target mask size (if possible).
On a target with a 16-bit mask (for example), we would choose the type
of an integer literal "1024" to be an int16.  Previously, we used an int32,
which is a worse fit and leads to less efficient code than an int16
on a 16-bit mask target.  (However, we'd still give an integer literal
1000000 the type int32, even on a 16-bit mask target.)

Updated the tests to still pass with 8- and 16-bit targets, given this
change.
2013-07-23 17:24:50 -07:00
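A sketch of the effect, assuming a hypothetical 16-bit-mask target:

   varying int16 a = (int16)programIndex;
   // 1024 fits in 16 bits, so it is typed int16 here and the
   // addition stays a pure int16 operation
   varying int16 b = a + 1024;
   // 1000000 does not fit in 16 bits, so it is still typed int32
   varying int32 c = a + 1000000;
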
Matt Pharr
9ba49eabb2 Reduce estimated costs for 8- and 16-bit min() and max() in stdlib.
These actually compile to a single instruction.
2013-07-23 16:52:43 -07:00
Matt Pharr
e7abf3f2ea Add support for mask vectors of 8- and 16-bit element types.
There were a number of places throughout the system that assumed that the
execution mask would only have either 32-bit or 1-bit elements.  This
commit makes it possible to have a target with an 8- or 16-bit mask.
2013-07-23 16:50:11 -07:00
Matt Pharr
83e1630fbc Add support for fast division of varying int values by small constants.
For varying int8/16/32 types, divides by small constants can be
implemented efficiently through multiplies and shifts with integer
types of twice the bit-width; this commit adds this optimization.
    
(Implementation is based on Halide.)
2013-07-23 16:49:56 -07:00
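The trick, sketched for an unsigned 8-bit divide by 3 (magic constant 171 = ceil(2^9 / 3); the committed optimization derives such constants automatically):

   unsigned int8 div3(unsigned int8 x) {
       // widen to twice the bit width, multiply by the magic
       // constant, and shift: (x * 171) >> 9 == x / 3 for all 8-bit x
       unsigned int16 wide = (unsigned int16)x * 171;
       return (unsigned int8)(wide >> 9);
   }
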
Jean-Luc Duprat
6326924de7 Fixes to the implementations of any() and none() in the stdlib.
These make sure that inactive vector lanes do not interfere with the results.
2013-01-18 11:19:54 -08:00
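An illustration of the requirement (hypothetical code, not the fix itself): the reductions must consider only the lanes active at the call site.

   uniform int evenLaneHits = 0;
   varying bool hit = (programIndex == 0);
   if ((programIndex & 1) == 0) {
       // any()/none() must ignore the odd lanes masked off by the
       // enclosing if(); an inactive lane must not flip the result
       if (any(hit))
           ++evenLaneHits;
   }
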
Jean-Luc Duprat
24087ff3cc Expose none() in the ISPC standard library.
On KNC: all(), any() and none() do not generate a redundant movmsk instruction.
2012-11-27 13:38:28 -08:00
Matt Pharr
6412876f64 Remove unused __reduce_add_uint{32,64} target functions.
The stdlib code just calls the signed int{32,64} functions,
which gives the right result for the unsigned case anyway.
The various targets didn't consistently define the unsigned
variants in any case.
2012-09-28 05:55:41 -07:00
Matt Pharr
2c640f7e52 Add support for RDRAND in IvyBridge.
The standard library now provides a variety of rdrand() functions
that call out to RDRAND, when available.

Issue #263.
2012-07-12 06:07:07 -07:00
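A hedged usage sketch, assuming a fill-through-pointer, bool-on-success signature; RDRAND can transiently fail, so callers retry:

   varying int32 r;
   while (!rdrand(&r)) {
       // retry until the hardware returns a random value
   }
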
Matt Pharr
b4a078e2f6 Add foreach_active iteration statement.
Issue #298.
2012-06-22 10:35:43 -07:00
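Usage sketch: the body runs once per active program instance, serially, with the loop variable bound to that instance's index.

   uniform float table[programCount];
   foreach_active (i) {
       // one active instance at a time; i is its program index
       table[i] = 1.0;
   }
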
Matt Pharr
46716aada3 Switch to unordered floating point compares.
In particular, this gives us desired behavior for NaNs (all compares
involving a NaN evaluate to true).  This in turn allows writing the
canonical isnan() function as "v != v".

Added isnan() to the standard library as well.
2012-06-20 13:25:53 -07:00
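A minimal sketch of the addition, which the unordered-compare semantics make work:

   bool isnan(float v) {
       // NaN is the only value that compares unequal to itself, and
       // unordered compares make v != v evaluate to true for NaN
       return v != v;
   }
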
Matt Pharr
fae47e0dfc Update stdlib to not use "in" as a variable name.
Preparation for foreach_unique, which uses that as a keyword.
2012-06-20 10:04:24 -07:00
Matt Pharr
8fd9b84a80 Update seed_rng() in stdlib to take a varying seed.
Previously, we were trying to take a uniform seed and then shuffle that
around to initialize the state for each of the program instances.  This
was becoming increasingly untenable and brittle.

Now a varying seed is expected and used.
2012-05-30 10:35:41 -07:00
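A hedged usage sketch (RNGState, seed_rng(), and frandom() as named in the stdlib; exact signatures are assumptions):

   RNGState state;
   // a varying seed: give each program instance a distinct value
   seed_rng(&state, 1234 + programIndex);
   varying float r = frandom(&state);
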
Matt Pharr
90db01d038 Represent MOVMSK'ed masks with int64s rather than int32s.
This allows us to scale up to 64-wide execution.
2012-05-25 11:57:23 -07:00
Matt Pharr
0c1b206185 Pass log/exp/pow transcendentals through to targets that support them.
Currently, this is the generic targets.
2012-05-03 13:49:56 -07:00
Matt Pharr
0c5d7ff8f2 Add rygorous's float->srgb8 conversion routine to the stdlib.
Issue #230
2012-04-27 10:03:19 -10:00
Matt Pharr
491fa239bd Add atomic swap and cmpxchg for void * as well.
Issue #232.
2012-04-11 06:12:31 -07:00
Matt Pharr
2aa61007c6 Remove memory_barrier() calls from atomics.
This was unnecessary overhead to impose on all callers; the user
should handle these as needed on their own.

Also added some explanatory text to the documentation that highlights
that memory_barrier() is only needed across HW threads/cores, not
across program instances in a gang.
2012-04-10 19:37:03 -07:00
Matt Pharr
8878826661 Add non-short-circuiting and(), or(), select() to stdlib. 2012-03-26 09:37:59 -07:00
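Unlike && and ||, these evaluate both operands unconditionally; a usage sketch:

   varying float x = programIndex - 3.5;
   // both alternatives are evaluated, then chosen per lane; safe
   // here because neither -x nor x can fault
   varying float mag = select(x < 0.0, -x, x);
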
Matt Pharr
7e18f0e247 Small improvement to float->half function in stdlib.
Rewrite things to be able to do a float MINPS, for slightly
better code on SSE2 (which has that but not a signed int
min).  SSE2 code now 23 instructions (vs 21 intrinsics).
2012-03-23 16:09:32 -07:00
Matt Pharr
3bb2dee275 Update float_to_half() with more efficient version from @rygorous 2012-03-22 13:36:26 -07:00
Matt Pharr
10c5ba140c Much more efficient half_to_float() code, via @rygorous.
Also, switch deferred shading example to use it. (Rather than
the "fast" half to float that doesn't handle deforms, etc.)
2012-03-21 16:13:04 -07:00
Matt Pharr
989966f81b Annotate std lib functions with __declspec safe, cost, as appropriate. 2012-03-21 16:12:32 -07:00
Matt Pharr
7dffd65609 Add __foreach_active statement to loop over active prog. instances.
For now this has the __ prefix, as an experimental feature currently only
used in the standard library implementation.  It's probably worth making
something along these lines an official feature, but I'm not sure if this
in its current form is quite the right thing.
2012-03-20 08:46:00 -07:00
Matt Pharr
3b95452481 Add memcpy(), memmove() and memset() to the standard library.
Issue #183.
2012-03-05 16:09:00 -08:00
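Usage sketch (uniform variants; the size argument is in bytes):

   uniform float src[16], dst[16];
   for (uniform int i = 0; i < 16; ++i)
       src[i] = i;
   memcpy(dst, src, 16 * sizeof(uniform float));
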
Matt Pharr
c152ae3c32 Add single-precision asin() and acos() to stdlib.
Issue #184.
2012-03-05 13:32:13 -08:00
Matt Pharr
7bf9c11822 Add uniform variants of RNG functions to stdlib 2012-03-05 09:56:30 -08:00
Matt Pharr
55b81e35a7 Modify rules for default variability of pointed-to types.
Now, the pointed-to type is always uniform by default (if an explicit
rate qualifier isn't provided).  This rule is easier to remember and
seems to work well in more cases than the previous rule from 6d7ff7eba2.
2012-02-29 14:27:53 -08:00
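Under this rule, the example from 6d7ff7eba2 (below in this log) instead reads:

   float *foo;           // varying pointer to uniform float
   float * uniform bar;  // uniform pointer to uniform float
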
Matt Pharr
6d7ff7eba2 Update defaults for variability of pointed-to types.
Now, if rate qualifiers aren't used to specify otherwise, varying
pointers point to uniform types by default.  As before, uniform
pointers point to varying types by default.

   float *foo;  // varying pointer to uniform float
   float * uniform bar;  // uniform pointer to varying float

These defaults seem to require the least amount of explicit
uniform/varying qualifiers for most common cases, though TBD if it
would be easier to have a single rule that e.g. the pointed-to type
is always uniform by default.
2012-02-21 06:27:34 -08:00
Matt Pharr
c2ecc15b93 Add missing "varying/varying" atomic_compare_exchange_global() functions. 2012-02-03 13:19:15 -08:00
Matt Pharr
83c8650b36 Add support for "local" atomics.
Also updated aobench example to use them, which in turn allows using
foreach() and thence a much cleaner implementation.

Issue #58.
2012-02-03 13:15:21 -08:00
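A hedged usage sketch ("local" meaning atomic only with respect to the program instances in the gang; the exact signature is an assumption):

   uniform int32 counter = 0;
   // each active instance receives a distinct pre-increment value,
   // as if the instances had incremented in some serial order
   varying int32 ticket = atomic_add_local(&counter, 1);
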
Matt Pharr
b50f6f1730 Fix RNG seed code in stdlib for scalar target. 2012-01-29 13:46:57 -08:00
Matt Pharr
1867b5b317 Use native float/half conversion instructions with the AVX2 target. 2012-01-24 15:33:38 -08:00
Matt Pharr
d805e8b183 Add clock() function to standard library.
Also corrected the declaration of num_cores() to return a
uniform value.
2012-01-22 13:05:27 -08:00
Matt Pharr
1bba9d4307 Improve atomic_swap_global() to take advantage of associativity.
We now do a single atomic hardware swap and then effectively perform
swaps between the running program instances, such that the result is
the same as if each instance had run its own hardware swap in some
particular order.

Also cleaned up __atomic_swap_uniform_* built-in implementations
to not take the mask, which they weren't using anyway.

Finishes Issue #56.
2012-01-20 10:37:33 -08:00
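The idea, sketched in comments (illustrative, not the committed code):

   // For active instances i0 < i1 < ... < ik swapping values
   // v[i0..ik] with *ptr:
   //   1. issue ONE hardware swap that stores v[ik] and returns
   //      old = *ptr;
   //   2. give old to i0, and give each later instance ij the value
   //      v[i(j-1)] from its predecessor.
   // The memory location and every instance then hold exactly what a
   // serial chain of k+1 hardware swaps would have produced.
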
Matt Pharr
3f89295d10 Update RNG code in stdlib to use -> operator where appropriate. 2012-01-19 10:02:47 -07:00
Matt Pharr
f75c94a8f1 Have aos/soa and broadcast/shuffle/rotate functions provided by the target.
The SSE/AVX targets use the old versions from util.m4, but these functions are
now passed through to the generic targets.
2012-01-04 12:59:03 -08:00