aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Vsevolod Livinskiy	eb61d5df72	Support for cache 2/3 and all targets	2014-10-02 16:25:23 +04:00
Evghenii	688d9c9a82	added support for rsqrtd/rcpd for generic-*.h	2014-02-05 13:20:44 +01:00
Ilia Filippov	15816eb07e	adding __packed_store_active2 to generic targets	2013-12-19 17:50:18 +04:00
Matt Pharr	2b2905b567	Fix (preexisting) bugs in generic-32/64.h with type of "__any", etc. This should be a bool, not a one-wide vector of bools. The equivalent fix was previously made in generic-16.h, but not made here. (Note that many tests are still failing with these targets, but at least they compile properly now.)	2013-08-20 09:05:50 -07:00
Matt Pharr	e7f067d70c	Fix handling of __clock() builtin for "generic" targets.	2013-08-20 09:04:52 -07:00
Matt Pharr	b6df447b55	Add reduce_add() for int8 and int16 types. This maps to specialized instructions (e.g. PSADBW) when available.	2013-07-25 09:46:01 -07:00
Jean-Luc Duprat	f0b0618484	Added the following mask tests: __any(), __all(), __none() for all supported targets. This allows for more efficient code generation of KNC.	2012-09-14 11:06:18 -07:00
Jean-Luc Duprat	aecd6e0878	All the smear(), setzero() and undef() APIs are now templated on the return type. Modified ISPC's internal mangling to pass these through unchanged. Tried hard to make sure this is not going to introduce an ABI change.	2012-07-17 17:06:36 -07:00
Matt Pharr	216ac4b1a4	Stop factoring out constant offsets for gather/scatter if instr is available. For KNC (gather/scatter), it's not helpful to factor base+offsets gathers and scatters into base_ptr + {1/2/4/8} * varying_offsets + const_offsets. Now, if a HW instruction is available for gather/scatter, we just factor into base + {1/2/4/8} * offsets (if possible). Not only is this simpler, but it's also what we need to pass a value along to the scale by 2/4/8 available directly in those instructions. Finishes issue #325.	2012-07-11 14:52:29 -07:00
Matt Pharr	ec0280be11	Rename gather/scatter_base_offsets functions to factored_based_offsets. No functional change; just preparation for having a path that doesn't factor the offsets into constant and varying parts, which will be better for AVX2 and KNC.	2012-07-11 14:16:39 -07:00
Jean-Luc Duprat	bea88ab122	Integrated changes from mmp/and-fold-opt: Add peephole optimization to eliminate some mask AND operations. On KNC, the various vector comparison instructions can optionally be masked; if a mask is provided, the result is effectively that the value returned is the AND of the mask with the result of the comparison. This change adds an optimization pass to the C++ backend that looks for vector ANDs where one operand is a comparison and rewrites them--e.g. "__and(__equal_float(a, b), c)" is changed to "__equal_float_and_mask(a, b, c)", saving an instruction in the end. Issue #319. Merge commit '8ef6bc16364d4c08aa5972141748110160613087' Conflicts: examples/intrinsics/knc.h examples/intrinsics/sse4.h	2012-07-10 10:33:24 -07:00
Matt Pharr	bc7775aef2	Fix __ordered and __unordered floating point functions for C++ target. Fixes include adding "_float" and "_double" suffixes as appropriate as well as providing a number of missing implementations. This fixes a number of failures in the half* tests.	2012-07-09 14:35:51 -07:00
Jean-Luc Duprat	516ba85abd	Merge pull request #322 from mmp/vector-constants Vector constants	2012-07-09 09:28:26 -07:00
Jean-Luc Duprat	098277b4f0	Merge pull request #321 from mmp/setzero More varied support for constant vectors from C++ backend.	2012-07-09 08:57:05 -07:00
Matt Pharr	8ef6bc1636	Add peephole optimization to eliminate some mask AND operations. On KNC, the various vector comparison instructions can optionally be masked; if a mask is provided, the result is effectively that the value returned is the AND of the mask with the result of the comparison. This change adds an optimization pass to the C++ backend that looks for vector ANDs where one operand is a comparison and rewrites them--e.g. "__and(__equal_float(a, b), c)" is changed to "__equal_float_and_mask(a, b, c)", saving an instruction in the end. Issue #319.	2012-07-07 08:35:38 -07:00
Matt Pharr	974b40c8af	Add type suffix to comparison ops in C++ output. e.g. "__equal()" -> "__equal_float()", etc. No functional change; this is necessary groundwork for a forthcoming peephole optimization that eliminates ANDs of masks in some cases.	2012-07-07 07:50:59 -07:00
Matt Pharr	e5fe0eabdc	Update __load() builtins to take const pointers.	2012-07-06 08:47:47 -07:00
Matt Pharr	0d3993fa25	More varied support for constant vectors from C++ backend. If we have a vector of all zeros, a __setzero_* function call is emitted, permitting calling specialized intrinsics for this. Undefined values are reflected with an __undef_* call, which similarly allows passing that information along. This change also includes a cleanup to the signature of the __smear_* functions; since they already have different names depending on the scalar value type, we don't need to use the trick of passing an undefined value of the return vector type as the first parameter as an indirect way to overload by return value. Issue #317.	2012-07-05 20:19:11 -07:00
Jean-Luc Duprat	e431b07e04	Changed the C API to use templates to indicate memory alignment to the C compiler. This should help with performance of the generated code. Updated the relevant header files (sse4.h, generic-16.h, generic-32.h, generic-64.h). Updated generic-32.h and generic-64.h to the new memory API.	2012-06-28 09:29:15 -07:00
Matt Pharr	21c43737fe	Fix bug in examples/intrinsics/generic-32.h	2012-05-25 14:27:30 -07:00
Matt Pharr	7a2142075c	Add examples/intrinsics/generic-32.h implementation. Roughly 100 tests fail with this; all the tests need to be audited for assumptions that 16 is the widest width possible…	2012-05-25 12:37:59 -07:00

21 Commits