Commit Graph

98 Commits

Author SHA1 Message Date
james.brodman
9f933b500b Add missing __cast_sext(__vec16_i32,__vec16_i1) 2013-12-20 16:45:27 -05:00
Ilia Filippov
15816eb07e adding __packed_store_active2 to generic targets 2013-12-19 17:50:18 +04:00
evghenii
32cfdd52d3 Merge branch 'master' into knc-fix 2013-11-05 15:46:54 +01:00
evghenii
015af03bdc changed back to #define ISPC_FORCE_ALIGNED_MEMORY aligned_ld/st #else unaligned ld/st #endif. However load<64>/store<64> will still be unaligned w/o this define because of failures related to issue #632 2013-11-05 15:41:14 +01:00
Dmitry Babokin
6585a925be Merge pull request #641 from jbrodman/stdlibshift
Add a "shift" operator to the stdlib.
2013-10-28 14:18:31 -07:00
james.brodman
e682b19eda Remove zero initialization for __vec4_i32 2013-10-28 17:13:07 -04:00
james.brodman
02681d531e Minor tweak for interface. 2013-10-28 12:56:43 -04:00
james.brodman
641d882ea6 Add shift support for knc targets. This is not optimized. 2013-10-28 12:43:42 -04:00
james.brodman
1e80b3b0d7 Add shift support for generic-16 target. 2013-10-28 12:20:32 -04:00
james.brodman
d2b89e0e37 Tweak generic target. 2013-10-23 18:01:01 -04:00
james.brodman
c4ad8f6ed4 Add docs/generic impls 2013-10-23 15:51:59 -04:00
evghenii
fb1a2a0a40 __masked_store_* uses vscatter now, and is thread-safe 2013-10-15 17:10:46 +03:00
evghenii
3da152a150 fixed zmm __mul for i64 with icc < 14.0.0, 4 knc::fails left, but I doubt these are due to this include.. 2013-10-07 18:30:22 +03:00
evghenii
4222605f87 fixed lshr/ashr/shl shifts. __mul i64 vector version for icc < 14.0.0 works only on signed, so commented it out in favour of sequential 2013-10-07 14:24:27 +03:00
evghenii
1b196520f6 knc-i1x16.h is cleaned: int32,float,double are complete, int64 is partially complete 2013-10-05 22:10:05 +03:00
evghenii
10223cfac3 working on shuffle/rotate for double, there seems to be a bug in cvt2zmm cvt2hilo 2013-10-05 15:23:55 +03:00
evghenii
8b0fc558cb complete cleaning 2013-10-05 14:15:33 +03:00
evghenii
8a6789ef61 cleaned float added fails info 2013-10-04 14:11:09 +03:00
evghenii
57f019a6e0 cleaned int64 added fails info 2013-10-04 13:39:15 +03:00
evghenii
32c77be2f3 cleaned mask & int32, only test141 fails 2013-10-04 11:42:52 +03:00
james.brodman
dc8895352a Adding missing typecasts and guarding i64 __mul with compiler version check 2013-10-01 11:53:56 -04:00
evghenii
019043f55e patched half2float & float2half to pass the tests. Now only test-141 fails, but that seems to be related to the test rather than to knc-i1x16.h 2013-09-23 09:55:55 +03:00
evghenii
ddecdeb834 move remaining int64 from knc.h; some fail to pass tests, grep for evghenii::fails to find out which functions fail and on what tests 2013-09-20 14:55:15 +03:00
evghenii
5cabf0bef0 adding int64 support from knc.h, phase 1. bugs: __lshr & __ashr fail idiv.ispc test, __equal_i64 & __equal_i64_and_mask fail reduce_equal_8.ispc test 2013-09-20 14:13:40 +03:00
evghenii
0ed89e93fa added fails info 2013-09-19 16:34:06 +03:00
evghenii
0c274212c2 performance tuning for knc-i1x8.h. this gives good enough performance for double only. float performance is terrible 2013-09-19 16:07:22 +03:00
evghenii
dbef4fd7d7 fixed notation 2013-09-19 14:52:22 +03:00
evghenii
6a21218c13 fix warning and add KNC 1 2013-09-19 13:45:31 +03:00
evghenii
3cf63362a4 small tuning 2013-09-18 20:03:08 +03:00
evghenii
e4b1f58595 performance fix.. still some issues left with equal_i1 for __vec8_i1 2013-09-18 19:14:41 +03:00
evghenii
4b1a0b4bc4 added fails 2013-09-18 18:41:22 +03:00
evghenii
922edb1128 completed knc-i1x16.h and added knc-i1x8.h with knc-i1x8unsafe_fast.h that doesn't pass several tests.. 2013-09-18 18:14:07 +03:00
Matt Pharr
2b2905b567 Fix (preexisting) bugs in generic-32/64.h with type of "__any", etc.
This should be a bool, not a one-wide vector of bools.  The equivalent
fix was previously made in generic-16.h, but not made here.  (Note that
many tests are still failing with these targets, but at least they
compile properly now.)
2013-08-20 09:05:50 -07:00
Matt Pharr
e7f067d70c Fix handling of __clock() builtin for "generic" targets. 2013-08-20 09:04:52 -07:00
Matt Pharr
7ab4c5391c Fix build with LLVM 3.2 and generic-4 / examples/sse4.h target. 2013-08-09 19:56:43 -07:00
Matt Pharr
b6df447b55 Add reduce_add() for int8 and int16 types.
This maps to specialized instructions (e.g. PSADBW) when available.
2013-07-25 09:46:01 -07:00
james.brodman
6211966c55 Change mask to use __mmask16 instead of a struct. 2013-05-30 16:04:44 -04:00
james.brodman
7b2eaf63af knc.h cleanup 2013-05-10 13:36:18 -04:00
Dmitry Babokin
1069a3c77e Removing some sources of warnings sse4.h and trailing spaces 2013-04-25 03:40:32 +04:00
james.brodman
52dcbf087a Implemented 3 more intrinsics on double precision vectors 2013-03-28 11:55:53 -04:00
james.brodman
ef1af547e2 Change sse4.h to enable inlining. 2013-03-13 10:55:53 -04:00
Jean-Luc Duprat
24087ff3cc Expose none() in the ISPC standard library.
On KNC: all(), any() and none() do not generate a redundant movmsk instruction.
2012-11-27 13:38:28 -08:00
Jean-Luc Duprat
2129b1e27d knc.h: Fixed __rsqrt_varying_float() to use _mm512_invsqrt_ps() instead of _mm512_invsqrt_pd()
This was a typo.
2012-11-21 15:40:35 -08:00
Jean-Luc Duprat
d3b86dcc90 KNC: fix implementation of __all() to use KNCni mask test instructions... 2012-11-14 09:24:01 -08:00
Jean-Luc Duprat
b601331362 Approximation for inverse sqrt and reciprocal provided in fast math mode.
RCP was actually slow in fast math mode
Inverse sqrt did not expose fast approximation
2012-11-13 14:01:35 -08:00
james.brodman
97ddc1ed10 Fixed =/== error in __all() 2012-11-08 16:30:12 -05:00
jbrodman
e323b1d0ad Fixed compile error: == instead of = 2012-10-26 16:55:28 -04:00
Matt Pharr
406fbab40e Fix bugs in declarations of __any, __all, and __none in examples/intrinsics.
They return bool, not vector of bool.
2012-10-17 10:55:50 -07:00
Jean-Luc Duprat
3dd9ff3d84 knc.h:
	Properly pick up on ISPC_FORCE_ALIGNED_MEMORY when --opt=force-aligned-memory is used
	Fixed usage of loadunpack and packstore to use proper memory offset
	Fixed implementation of __masked_load_*() __masked_store_*() incorrectly (un)packing the lanes loaded
	Cleaned up usage of _mm512_undefined_*(), it is now mostly confined to constructor
	Minor cleanups

knc2x.h:
	Fixed usage of loadunpack and packstore to use proper memory offset
	Fixed implementation of __masked_load_*() __masked_store_*() incorrectly (un)packing the lanes loaded
	Properly pick up on ISPC_FORCE_ALIGNED_MEMORY when --opt=force-aligned-memory is used
	__any() and __none() speedups.
	Cleaned up usage of _mm512_undefined_*(), it is now mostly confined to constructor
2012-09-19 17:11:04 -07:00
Ingo Wald
7f386923b0 Merge branch 'master' of https://github.com/ispc/ispc 2012-09-17 15:54:25 +02:00