Matt Pharr
d9c38b5c1f
Remove support for using SVML for math lib routines.
...
This path was poorly maintained and wasn't actually available on most
targets.
2013-07-31 06:56:48 -07:00
Matt Pharr
b6df447b55
Add reduce_add() for int8 and int16 types.
...
This maps to specialized instructions (e.g. PSADBW) when available.
2013-07-25 09:46:01 -07:00
Matt Pharr
2d063925a1
Explicitly call the PBLENDVB intrinsic for i8 blending with sse4-8.
...
This is slightly cleaner than truncating the i8 mask to i1 and using
a vector select. (It is also probably safer in terms of generating good code.)
2013-07-25 09:46:01 -07:00
Matt Pharr
53414f12e6
Add SSE4 target optimized for computation with 8-bit datatypes.
...
This change adds a new 'sse4-8' target, where programCount is 16 and
the mask element size is 8 bits (i.e. the most appropriate sizing of
the mask for SIMD computation with 8-bit datatypes).
2013-07-23 17:30:32 -07:00