aaron/ispc - ispc - git.frat.tech

aaron/ispc

Author	SHA1	Message	Date
james.brodman	4d289b16c2	Redesign after being hit with the KISS bat.	2013-10-23 14:25:43 -04:00
james.brodman	899f85ce9c	Initial Support for new stdlib shift operator	2013-10-22 18:06:54 -04:00
Evghenii	6fd21d988d	fixed lexer to properly read fortran-notation double constants	2013-09-16 17:15:02 +02:00
egaburov	e2a91e6de5	added support for "d"-suffix	2013-09-16 15:54:32 +02:00
Evghenii	36886971e3	revert lex.ll parse.yy stdlib.ispc to state when all constants are floats	2013-09-13 16:02:53 +02:00
Evghenii	a97eb7b7cb	added clamp in double precision	2013-09-13 09:32:59 +02:00
egaburov	7364e06387	added mask64	2013-09-12 12:02:42 +02:00
egaburov	320c41ffcf	added svml support. experimental. for some reason all sybmols are visible..	2013-09-11 15:16:50 +02:00
james.brodman	8db378b265	Revert "Remove support for using SVML for math lib routines." This reverts commit `d9c38b5c1f`.	2013-09-04 16:01:58 -04:00
Dmitry Babokin	e06267ef1b	Fix for incorrect implementation of reduce_[min\|max]_[float\|double], it showed up as -O0	2013-08-29 16:16:02 +04:00
Matt Pharr	5b20b06bd9	Add avg_{up,down}_int{8,16} routines to stdlib These compute the average of two given values, rounding up and down, respectively, if the result isn't exact. When possible, these are mapped to target-specific intrinsics (PADD[BW] on IA and VH[R]ADD[US] on NEON.) A subsequent commit will add pattern-matching to generate calls to these intrinsincs when the corresponding patterns are detected in the IR.)	2013-08-06 08:41:12 -07:00
Matt Pharr	d9c38b5c1f	Remove support for using SVML for math lib routines. This path was poorly maintained and wasn't actually available on most targets.	2013-07-31 06:56:48 -07:00
Matt Pharr	b6df447b55	Add reduce_add() for int8 and int16 types. This maps to specialized instructions (e.g. PSADBW) when available.	2013-07-25 09:46:01 -07:00
Matt Pharr	f7f281a256	Choose type for integer literals to match the target mask size (if possible). On a target with a 16-bit mask (for example), we would choose the type of an integer literal "1024" to be an int16. Previously, we used an int32, which is a worse fit and leads to less efficient code than an int16 on a 16-bit mask target. (However, we'd still give an integer literal 1000000 the type int32, even in a 16-bit target.) Updated the tests to still pass with 8 and 16-bit targets, given this change.	2013-07-23 17:24:50 -07:00
Matt Pharr	9ba49eabb2	Reduce estimated costs for 8 and 16-bit min() and max() in stdlib. These actually compile to a single instruction.	2013-07-23 16:52:43 -07:00
Matt Pharr	e7abf3f2ea	Add support for mask vectors of 8 and 16-bit element types. There were a number of places throughout the system that assumed that the execution mask would only have either 32-bit or 1-bit elements. This commit makes it possible to have a target with an 8- or 16-bit mask.	2013-07-23 16:50:11 -07:00
Matt Pharr	83e1630fbc	Add support for fast division of varying int values by small constants. For varying int8/16/32 types, divides by small constants can be implemented efficiently through multiplies and shifts with integer types of twice the bit-width; this commit adds this optimization. (Implementation is based on Halide.)	2013-07-23 16:49:56 -07:00
Jean-Luc Duprat	6326924de7	Fixes to the implementations of any() and none() in the stdlib. These make sure that inactive vector lanes do not interfere with the results	2013-01-18 11:19:54 -08:00
Jean-Luc Duprat	24087ff3cc	Expose none() in the ISPC standard library. On KNC: all(), any() and none() do not generate a redundant movmsk instruction.	2012-11-27 13:38:28 -08:00
Matt Pharr	6412876f64	Remove unused __reduce_add_uint{32,64} target functions. The stdilb code just calls the signed int{32,64} functions, which gives the right result for the unsigned case anyway. The various targets didn't consistently define the unsigned variants in any case.	2012-09-28 05:55:41 -07:00
Matt Pharr	2c640f7e52	Add support for RDRAND in IvyBridge. The standard library now provides a variety of rdrand() functions that call out to RDRAND, when available. Issue #263.	2012-07-12 06:07:07 -07:00
Matt Pharr	b4a078e2f6	Add foreach_active iteration statement. Issue #298.	2012-06-22 10:35:43 -07:00
Matt Pharr	46716aada3	Switch to unordered floating point compares. In particular, this gives us desired behavior for NaNs (all compares involving a NaN evaluate to true). This in turn allows writing the canonical isnan() function as "v != v". Added isnan() to the standard library as well.	2012-06-20 13:25:53 -07:00
Matt Pharr	fae47e0dfc	Update stdlib to not use "in" as a variable name. Preparation for foreach_unique, which uses that as a keyword.	2012-06-20 10:04:24 -07:00
Matt Pharr	8fd9b84a80	Update seed_rng() in stdlib to take a varying seed. Previously, we were trying to take a uniform seed and then shuffle that around to initialize the state for each of the program instances. This was becoming increasingly untenable and brittle. Now a varying seed is expected and used.	2012-05-30 10:35:41 -07:00
Matt Pharr	90db01d038	Represent MOVMSK'ed masks with int64s rather than int32s. This allows us to scale up to 64-wide execution.	2012-05-25 11:57:23 -07:00
Matt Pharr	0c1b206185	Pass log/exp/pow transcendentals through to targets that support them. Currently, this is the generic targets.	2012-05-03 13:49:56 -07:00
Matt Pharr	0c5d7ff8f2	Add rygorous's float->srgb8 conversion routine to the stdlib. Issue #230	2012-04-27 10:03:19 -10:00
Matt Pharr	491fa239bd	Add atomic swap and cmpxchg for void * as well. Issue #232.	2012-04-11 06:12:31 -07:00
Matt Pharr	2aa61007c6	Remove memory_barrier() calls from atomics. This was unnecessary overhead to impose on all callers; the user should handle these as needed on their own. Also added some explanatory text to the documentation that highlights that memory_barrier() is only needed across HW threads/cores, not across program instances in a gang.	2012-04-10 19:37:03 -07:00
Matt Pharr	8878826661	Add non-short-circuiting and(), or(), select() to stdlib.	2012-03-26 09:37:59 -07:00
Matt Pharr	7e18f0e247	Small improvement to float->half function in stdlib. Rewrite things to be able to do a float MINPS, for slightly better code on SSE2 (which has that but not an signed int min). SSE2 code now 23 instructions (vs 21 intrinsics).	2012-03-23 16:09:32 -07:00
Matt Pharr	3bb2dee275	Update float_to_half() with more efficient version from @rygorous	2012-03-22 13:36:26 -07:00
Matt Pharr	10c5ba140c	Much more efficient half_to_float() code, via @rygorous. Also, switch deferred shading example to use it. (Rather than the "fast" half to float that doesn't handle deforms, etc.)	2012-03-21 16:13:04 -07:00
Matt Pharr	989966f81b	Annotate std lib functions with __declspec safe, cost, as appropriate.	2012-03-21 16:12:32 -07:00
Matt Pharr	7dffd65609	Add __foreach_active statement to loop over active prog. instances. For now this has the __ prefix, as an experimental feature currently only used in the standard library implementation. It's probably worth making something along these lines an official feature, but I'm not sure if this in its current form is quite the right thing.	2012-03-20 08:46:00 -07:00
Matt Pharr	3b95452481	Add memcpy(), memmove() and memset() to the standard library. Issue #183.	2012-03-05 16:09:00 -08:00
Matt Pharr	c152ae3c32	Add single-precision asin() and acos() to stdlib. Issue #184.	2012-03-05 13:32:13 -08:00
Matt Pharr	7bf9c11822	Add uniform variants of RNG functions to stdlib	2012-03-05 09:56:30 -08:00
Matt Pharr	55b81e35a7	Modify rules for default variability of pointed-to types. Now, the pointed-to type is always uniform by default (if an explicit rate qualifier isn't provided). This rule is easier to remember and seems to work well in more cases than the previous rule from `6d7ff7eba2`.	2012-02-29 14:27:53 -08:00
Matt Pharr	6d7ff7eba2	Update defaults for variability of pointed-to types. Now, if rate qualifiers aren't used to specify otherwise, varying pointers point to uniform types by default. As before, uniform pointers point to varying types by default. float foo; // varying pointer to uniform float float uniform foo; // uniform pointer to varying float These defaults seem to require the least amount of explicit uniform/varying qualifiers for most common cases, though TBD if it would be easier to have a single rule that e.g. the pointed-to type is always uniform by default.	2012-02-21 06:27:34 -08:00
Matt Pharr	c2ecc15b93	Add missing "varying/varying" atomic_compare_exchange_global() functions.	2012-02-03 13:19:15 -08:00
Matt Pharr	83c8650b36	Add support for "local" atomics. Also updated aobench example to use them, which in turn allows using foreach() and thence a much cleaner implementation. Issue #58.	2012-02-03 13:15:21 -08:00
Matt Pharr	b50f6f1730	Fix RNG seed code in stdlib for scalar target.	2012-01-29 13:46:57 -08:00
Matt Pharr	1867b5b317	Use native float/half conversion instructions with the AVX2 target.	2012-01-24 15:33:38 -08:00
Matt Pharr	d805e8b183	Add clock() function to standard library. Also corrected the declaration of num_cores() to return a uniform value.	2012-01-22 13:05:27 -08:00
Matt Pharr	1bba9d4307	Improve atomic_swap_global() to take advantage of associativity. We now do a single atomic hardware swap and then effectively do swaps between the running program instances such that the result is the same as if they had happened to run a particular ordering of hardware swaps themselves. Also cleaned up __atomic_swap_uniform_* built-in implementations to not take the mask, which they weren't using anyway. Finishes Issue #56.	2012-01-20 10:37:33 -08:00
Matt Pharr	3f89295d10	Update RNG code in stdlib to use -> operator where appropriate.	2012-01-19 10:02:47 -07:00
Matt Pharr	f75c94a8f1	Have aos/soa and broadcast/shuffle/rotate functions provided by the target. The SSE/AVX targets use the old versions from util.m4, but these functions are now passed through to the generic targets.	2012-01-04 12:59:03 -08:00
Matt Pharr	848a432640	Fix various small things that were broken with single-bit-per-lane masks. Also small cleanups to declarations, "no captures" added, etc.	2012-01-04 12:59:03 -08:00

1 2

81 Commits