aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Dmitry Babokin	8e47273186	Copyright refresh	2015-04-22 16:39:11 +03:00
Vsevolod Livinskiy	f92d351cf0	Some codestyle changes	2015-03-05 18:04:39 +03:00
Vsevolod Livinskiy	a216b2bb9c	New LLVM IR load instruction	2015-03-05 16:00:30 +03:00
Dmitry Babokin	31b95b665b	Copyright update	2014-03-12 20:19:16 +04:00
Dmitry Babokin	f280b32fa4	Merge pull request #736 from egaburov/native_trigonometry Native trigonometry	2014-02-20 19:18:35 +03:00
Vsevolod Livinskij	cef5b2eb04	Some changes in saturation arithmetic	2014-02-10 12:40:53 +04:00
Evghenii	70a9b286e5	added support for native and double precision trigonometry/transcendentals	2014-02-07 15:28:39 +01:00
evghenii	09e8381ec7	change {rsqrt,rcp}_double to {rsqrt,rcp}d_decl	2014-02-05 13:05:04 +01:00
evghenii	3a72e05c3e	+1	2014-02-02 18:16:48 +01:00
Vsevolod Livinskij	07c6f1714a	Some fixes in function names and more tests were added.	2013-12-22 19:28:26 +04:00
Vsevolod Livinskij	4faff1a63c	structural change	2013-11-30 10:48:18 +04:00
Vsevolod Livinskij	4c330bc38b	Add code generation of saturation	2013-11-29 18:40:04 +04:00
Vsevolod Livinskij	42c148bf75	Changes for sse2 and sse4 in saturation	2013-11-29 03:33:40 +04:00
egaburov	efc20c2110	added svml support to all sse/avx modes	2013-09-11 17:07:54 +02:00
egaburov	19379db3b6	svml cleanup	2013-09-11 16:48:56 +02:00
egaburov	320c41ffcf	added svml support. experimental. for some reason all symbols are visible..	2013-09-11 15:16:50 +02:00
james.brodman	8db378b265	Revert "Remove support for using SVML for math lib routines." This reverts commit `d9c38b5c1f`.	2013-09-04 16:01:58 -04:00
Matt Pharr	5b20b06bd9	Add avg_{up,down}_int{8,16} routines to stdlib These compute the average of two given values, rounding up and down, respectively, if the result isn't exact. When possible, these are mapped to target-specific intrinsics (PADD[BW] on IA and VH[R]ADD[US] on NEON.) A subsequent commit will add pattern-matching to generate calls to these intrinsics when the corresponding patterns are detected in the IR.	2013-08-06 08:41:12 -07:00
Matt Pharr	d9c38b5c1f	Remove support for using SVML for math lib routines. This path was poorly maintained and wasn't actually available on most targets.	2013-07-31 06:56:48 -07:00
Matt Pharr	b6df447b55	Add reduce_add() for int8 and int16 types. This maps to specialized instructions (e.g. PSADBW) when available.	2013-07-25 09:46:01 -07:00
Matt Pharr	6412876f64	Remove unused __reduce_add_uint{32,64} target functions. The stdlib code just calls the signed int{32,64} functions, which gives the right result for the unsigned case anyway. The various targets didn't consistently define the unsigned variants in any case.	2012-09-28 05:55:41 -07:00
Jean-Luc Duprat	f0b0618484	Added the following mask tests: __any(), __all(), __none() for all supported targets. This allows for more efficient code generation of KNC.	2012-09-14 11:06:18 -07:00
Matt Pharr	984a68c3a9	Rename gen_gather() macro to gen_gather_factored()	2012-07-13 12:24:12 -07:00
Matt Pharr	19b46be20d	Remove load_and_broadcast from built-ins. Now that we never ever run with the mask all off, we no longer need that logic in a built-in function so that we can check the mask. In the one place where it was used (turning gathers to the same location into a load and broadcast), we now just emit the code for that directly.	2012-06-12 12:30:57 -07:00
Matt Pharr	89a2566e01	Add separate variants of memory built-ins for floats and doubles. Previously, we'd bitcast e.g. a vector of floats to a vector of i32s and then use the i32 variant of masked_load/masked_store/gather/scatter. Now, we have separate float/double variants of each of those.	2012-06-07 14:47:16 -07:00
Matt Pharr	1ac3e03171	Gather/scatter function improvements in builtins. More naming consistency: _i32 rather than i32, now. Also improved the m4 macros to generate these sequences to not require as many parameters.	2012-06-07 14:19:23 -07:00
Matt Pharr	b86d40091a	Improve naming of masked load/store instructions in builtins. Now, use _i32 suffixes, rather than _32, etc. Also cleaned up the m4 macro to generate these functions, using WIDTH to get the target width, etc.	2012-06-07 13:58:31 -07:00
Matt Pharr	91d22d150f	Update load_and_broadcast built-in Change function suffix to "_i32", etc, from "_32" Improve load_and_broadcast macro in util.m4 to grab vector width from WIDTH variable rather than taking it as a parameter.	2012-06-07 13:33:17 -07:00
Matt Pharr	1d29991268	Indentation fixes in builtins/	2012-06-07 13:23:07 -07:00
Matt Pharr	90db01d038	Represent MOVMSK'ed masks with int64s rather than int32s. This allows us to scale up to 64-wide execution.	2012-05-25 11:57:23 -07:00
Matt Pharr	1867b5b317	Use native float/half conversion instructions with the AVX2 target.	2012-01-24 15:33:38 -08:00
Matt Pharr	562d61caff	Added masked load optimization pass. This pass handles the "all on" and "all off" mask cases appropriately. Also renamed load_masked stuff in built-ins to masked_load for consistency with masked_store.	2012-01-04 11:51:26 -08:00
Matt Pharr	1d9201fe3d	Add "generic" 4, 8, and 16-wide targets. When used, these targets end up with calls to undefined functions for all of the various special vector stuff ispc needs to compile ispc programs (masked store, gather, min/max, sqrt, etc.). These targets are not yet useful for anything, but are a step toward having an option to C++ code with calls out to intrinsics. Reorganized the directory structure a bit and put the LLVM bitcode used to define target-specific stuff (as well as some generic built-ins stuff) into a builtins/ directory. Note that for building on Windows, it's now necessary to set a LLVM_VERSION environment variable (with values like LLVM_2_9, LLVM_3_0, LLVM_3_1svn, etc.)	2011-12-19 13:46:50 -08:00

33 Commits