aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Matt Pharr	0c1b206185	Pass log/exp/pow transcendentals through to targets that support them. Currently, this is the generic targets.	2012-05-03 13:49:56 -07:00
Matt Pharr	0c5d7ff8f2	Add rygorous's float->srgb8 conversion routine to the stdlib. Issue #230	2012-04-27 10:03:19 -10:00
Matt Pharr	491fa239bd	Add atomic swap and cmpxchg for void * as well. Issue #232.	2012-04-11 06:12:31 -07:00
Matt Pharr	2aa61007c6	Remove memory_barrier() calls from atomics. This was unnecessary overhead to impose on all callers; the user should handle these as needed on their own. Also added some explanatory text to the documentation that highlights that memory_barrier() is only needed across HW threads/cores, not across program instances in a gang.	2012-04-10 19:37:03 -07:00
Matt Pharr	8878826661	Add non-short-circuiting and(), or(), select() to stdlib.	2012-03-26 09:37:59 -07:00
Matt Pharr	7e18f0e247	Small improvement to float->half function in stdlib. Rewrite things to be able to do a float MINPS, for slightly better code on SSE2 (which has that but not a signed int min). SSE2 code now 23 instructions (vs 21 intrinsics).	2012-03-23 16:09:32 -07:00
Matt Pharr	3bb2dee275	Update float_to_half() with more efficient version from @rygorous	2012-03-22 13:36:26 -07:00
Matt Pharr	10c5ba140c	Much more efficient half_to_float() code, via @rygorous. Also, switch deferred shading example to use it. (Rather than the "fast" half to float that doesn't handle denorms, etc.)	2012-03-21 16:13:04 -07:00
Matt Pharr	989966f81b	Annotate std lib functions with __declspec safe, cost, as appropriate.	2012-03-21 16:12:32 -07:00
Matt Pharr	7dffd65609	Add __foreach_active statement to loop over active prog. instances. For now this has the __ prefix, as an experimental feature currently only used in the standard library implementation. It's probably worth making something along these lines an official feature, but I'm not sure if this in its current form is quite the right thing.	2012-03-20 08:46:00 -07:00
Matt Pharr	3b95452481	Add memcpy(), memmove() and memset() to the standard library. Issue #183.	2012-03-05 16:09:00 -08:00
Matt Pharr	c152ae3c32	Add single-precision asin() and acos() to stdlib. Issue #184.	2012-03-05 13:32:13 -08:00
Matt Pharr	7bf9c11822	Add uniform variants of RNG functions to stdlib	2012-03-05 09:56:30 -08:00
Matt Pharr	55b81e35a7	Modify rules for default variability of pointed-to types. Now, the pointed-to type is always uniform by default (if an explicit rate qualifier isn't provided). This rule is easier to remember and seems to work well in more cases than the previous rule from `6d7ff7eba2`.	2012-02-29 14:27:53 -08:00
Matt Pharr	6d7ff7eba2	Update defaults for variability of pointed-to types. Now, if rate qualifiers aren't used to specify otherwise, varying pointers point to uniform types by default. As before, uniform pointers point to varying types by default: `float *foo;` is a varying pointer to uniform float, and `float * uniform foo;` is a uniform pointer to varying float. These defaults seem to require the least amount of explicit uniform/varying qualifiers for most common cases, though TBD if it would be easier to have a single rule that e.g. the pointed-to type is always uniform by default.	2012-02-21 06:27:34 -08:00
Matt Pharr	c2ecc15b93	Add missing "varying/varying" atomic_compare_exchange_global() functions.	2012-02-03 13:19:15 -08:00
Matt Pharr	83c8650b36	Add support for "local" atomics. Also updated aobench example to use them, which in turn allows using foreach() and thence a much cleaner implementation. Issue #58.	2012-02-03 13:15:21 -08:00
Matt Pharr	b50f6f1730	Fix RNG seed code in stdlib for scalar target.	2012-01-29 13:46:57 -08:00
Matt Pharr	1867b5b317	Use native float/half conversion instructions with the AVX2 target.	2012-01-24 15:33:38 -08:00
Matt Pharr	d805e8b183	Add clock() function to standard library. Also corrected the declaration of num_cores() to return a uniform value.	2012-01-22 13:05:27 -08:00
Matt Pharr	1bba9d4307	Improve atomic_swap_global() to take advantage of associativity. We now do a single atomic hardware swap and then effectively do swaps between the running program instances such that the result is the same as if they had happened to run a particular ordering of hardware swaps themselves. Also cleaned up __atomic_swap_uniform_* built-in implementations to not take the mask, which they weren't using anyway. Finishes Issue #56.	2012-01-20 10:37:33 -08:00
Matt Pharr	3f89295d10	Update RNG code in stdlib to use -> operator where appropriate.	2012-01-19 10:02:47 -07:00
Matt Pharr	f75c94a8f1	Have aos/soa and broadcast/shuffle/rotate functions provided by the target. The SSE/AVX targets use the old versions from util.m4, but these functions are now passed through to the generic targets.	2012-01-04 12:59:03 -08:00
Matt Pharr	848a432640	Fix various small things that were broken with single-bit-per-lane masks. Also small cleanups to declarations, "no captures" added, etc.	2012-01-04 12:59:03 -08:00
Matt Pharr	1d9201fe3d	Add "generic" 4, 8, and 16-wide targets. When used, these targets end up with calls to undefined functions for all of the various special vector stuff ispc needs to compile ispc programs (masked store, gather, min/max, sqrt, etc.). These targets are not yet useful for anything, but are a step toward having an option to generate C++ code with calls out to intrinsics. Reorganized the directory structure a bit and put the LLVM bitcode used to define target-specific stuff (as well as some generic built-ins stuff) into a builtins/ directory. Note that for building on Windows, it's now necessary to set a LLVM_VERSION environment variable (with values like LLVM_2_9, LLVM_3_0, LLVM_3_1svn, etc.)	2011-12-19 13:46:50 -08:00
Matt Pharr	186d0223d2	Fix AoS/SoA stdlib functions to match documentation (i.e. actually remove the old offset parameter stuff now that we can actually pass pointers.)	2011-12-03 22:44:16 -08:00
Matt Pharr	7a2561c429	Add count_{leading,trailing}_zeros() functions to stdlib. (Documentation is still yet to be written.)	2011-11-30 10:12:16 -08:00
Matt Pharr	11547cb950	stdlib updates to take advantage of pointers. The packed_{load,store}_active functions now take a pointer to a location at which to start loading/storing, rather than an array base and a uniform index. Variants of the prefetch functions that take varying pointers are now available. There are now variants of the various atomic functions that take varying pointers (issue #112).	2011-11-29 15:41:38 -08:00
Matt Pharr	975db80ef6	Add support for pointers to the language. Pointers can be either uniform or varying, and behave correspondingly. e.g.: "uniform float * varying" is a varying pointer to uniform float data in memory, and "float * uniform" is a uniform pointer to varying data in memory. Like other types, pointers are varying by default. Pointer-based expressions, & and *, sizeof, ->, pointer arithmetic, and the array/pointer duality all behave as in C. Array arguments to functions are converted to pointers, also like C. There is a built-in NULL for a null pointer value; conversion from compile-time constant 0 values to NULL still needs to be implemented. Other changes: - Syntax for references has been updated to be C++ style; a useful warning is now issued if the "reference" keyword is used. - It is now illegal to pass a varying lvalue as a reference parameter to a function; references are essentially uniform pointers. This case had previously been handled via special case call by value return code. That path has been removed, now that varying pointers are available to handle this use case (and much more). - Some stdlib routines have been updated to take pointers as arguments where appropriate (e.g. prefetch and the atomics). A number of others still need attention. - All of the examples have been updated - Many new tests TODO: documentation	2011-11-27 13:09:59 -08:00
Matt Pharr	3cb0115dce	Add routines to standard library to do efficient AOS/SOA conversions. Currently, we just support 3 and 4-wide variants (i.e. xyzxyz.. and xyzwxyzw..), for int32 and float types.	2011-10-10 10:56:06 -07:00
Matt Pharr	32a0a30cf5	Only allow exact matches for function overload resolution for builtins. The intent is that the code in stdlib.ispc that is calling out to the built-ins should match argument types exactly (using explicit casts as needed), just for maximal clarity/safety.	2011-09-28 17:20:31 -07:00
Matt Pharr	6d39d5fc3e	Small cleanups. Add __num_cores() to the list of symbols to remove from the module at the end. Fix declarations of mask type for 64-bit atomics to silence warnings.	2011-09-28 16:26:35 -07:00
Matt Pharr	c999c8a237	Add num_cores() stdlib routine. Issue #102.	2011-09-28 16:16:58 -07:00
Matt Pharr	aad269fdf4	Added support for 'uniform' global atomics. Issue #93.	2011-09-28 16:06:07 -07:00
Matt Pharr	4f451bd041	More AVX fixes. Fix RNG state initialization for 16-wide targets. Fix a number of bugs in reduce_add builtin implementations for AVX. Fix some tests that had incorrect expected results for the 16-wide case.	2011-09-06 15:53:11 -07:00
Matt Pharr	e144724979	Improve performance of global atomics, taking advantage of associativity. For associative atomic ops (add, and, or, xor), we can take advantage of their associativity to do just a single hardware atomic instruction, rather than one for each of the running program instances (as the previous implementation did.) The basic approach is to locally compute a reduction across the active program instances with the given op and to then issue a single HW atomic with that reduced value as the operand. We then take the old value that was stored in the location that is returned from the HW atomic op and use that to compute the values to return to each of the program instances (conceptually representing the cumulative effect of each of the preceding program instances having performed their atomic operation.) Issue #56.	2011-08-31 05:35:01 -07:00
Matt Pharr	606cbab0d4	Performance improvements for global min/max atomics. Issue #57. Compute a "local" min/max across the active program instances and then do a single atomic memory op. Added a few tests to exercise global min/max atomics (which were previously untested!)	2011-08-26 10:35:24 -07:00
Matt Pharr	f868a63064	Add support for scan operations across program instances (add, and, or).	2011-08-13 20:11:41 +01:00
Matt Pharr	8c534d4d74	Add reduce_equal() function to standard library.	2011-08-10 15:55:55 -07:00
Matt Pharr	0ac4f7b620	Add various prefetch functions to the standard library.	2011-08-03 13:31:45 -07:00
Matt Pharr	467f1e71d7	Add fast versions of the float<-->half conversion routines in the stdlib. These get slightly wrong results for zero and the denorms and also don't handle the Inf/NaN stuff correctly, but are much more efficient than the full versions of these routines.	2011-08-03 15:58:42 +01:00
Matt Pharr	a2996ed5d9	More efficient implementation of frandom() in stdlib	2011-08-03 14:28:06 +01:00
Matt Pharr	165f90357f	Tiny cleanups, doc update re int8/16 performance	2011-07-21 16:04:16 +01:00
Matt Pharr	8ef3df57c5	Add support for in-memory half float data. Fixes issue #10	2011-07-21 15:55:45 +01:00
Matt Pharr	bba7211654	Add support for int8/int16 types. Addresses issues #9 and #42.	2011-07-21 06:57:40 +01:00
Matt Pharr	a535aa586b	Fix issue #2: use zero extend to convert bool->int, not sign extend. This way, we match C/C++ in that casting a bool to an int gives either the value zero or the value one. There is a new stdlib function int sign_extend(bool) that does sign extension for cases where that's desired.	2011-07-12 13:30:05 +01:00
Matt Pharr	aef8c09019	Add support for atomic swap/cmpexchg with float and double types. Addresses issue #60.	2011-07-07 14:07:52 +01:00
Matt Pharr	729f522a01	Fix bug in double-precision version of ldexp() in stdlib.	2011-07-07 13:57:20 +01:00
Matt Pharr	5a53a43ed0	Finish support for 64-bit types in stdlib. Fixes issue #14. Add much more support for doubles and int64 types in the standard library, basically supporting everything for them that is supported for floats and int32s. (The notable exceptions being the approximate rcp() and rsqrt() functions, which don't really have sensible analogs for doubles (or at least not built-in instructions).)	2011-07-07 13:25:55 +01:00
Matt Pharr	5bcc611409	Implement global atomics and a memory barrier in the standard library. This checkin provides the standard set of atomic operations and a memory barrier in the ispc standard library. Both signed and unsigned 32- and 64-bit integer types are supported.	2011-07-04 17:20:42 +01:00
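
Note on e144724979 (single hardware atomic for associative ops): the message describes reducing across the active program instances, issuing one hardware atomic with the reduced value, and then deriving each instance's return value from the old value. The following is a minimal sketch of that idea as a plain Python model, not the stdlib code itself. The function name `gang_atomic_add`, the list-based `memory`, and the use of `None` for inactive lanes are all assumptions made for illustration.

```python
def gang_atomic_add(memory, addr, values, mask):
    """Model a gang-wide atomic add that uses one simulated hardware atomic.

    memory: list standing in for global memory (hypothetical)
    addr:   index into memory shared by every program instance
    values: per-instance operands
    mask:   per-instance booleans, True where the instance is active
    """
    # Step 1: reduce the operands of the active instances locally.
    total = sum(v for v, active in zip(values, mask) if active)

    # Step 2: one simulated hardware atomic: fetch the old value, add the total.
    old = memory[addr]
    memory[addr] = old + total

    # Step 3: give each active instance the value it would have seen if the
    # preceding active instances had already performed their own atomic adds
    # (an exclusive prefix sum offset by the old value).
    results = []
    running = old
    for v, active in zip(values, mask):
        if active:
            results.append(running)
            running += v
        else:
            results.append(None)  # inactive lanes get no meaningful result
    return results
```

For example, with `memory = [10]`, operands `[1, 2, 3, 4]`, and the second instance inactive, the active instances receive 10, 11, and 14, and the memory location ends at 18. The same shape should apply to the other associative ops named in the message (and, or, xor) by swapping the reduction operator.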

55 Commits