Commit Graph

5 Commits

Matt Pharr
a9540b7c18 Update the implementations of the masked load/store builtins for the AVX target to actually use the corresponding AVX masked load/store intrinsics. (As always, not yet tested, pending fuller LLVM AVX support.) 2011-07-01 16:27:49 +01:00
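
As background, a minimal sketch of the AVX masked load/store operations this commit refers to, written with the C intrinsics rather than the LLVM-level builtins ispc actually emits; the wrapper names here are hypothetical:

```c
#include <immintrin.h>

/* Sketch: AVX masked load/store. Only lanes whose mask sign bit is set
   are read or written; the hardware skips inactive lanes, so they
   neither fault nor disturb memory. */
static __m256 gang_masked_load(const float *ptr, __m256i mask) {
    return _mm256_maskload_ps(ptr, mask);   /* inactive lanes read as 0.0f */
}

static void gang_masked_store(float *ptr, __m256i mask, __m256 value) {
    _mm256_maskstore_ps(ptr, mask, value);  /* inactive lanes untouched */
}
```

Compared with serializing over the lanes, the hardware masked forms execute as single instructions and avoid faults on inaccessible memory behind inactive lanes.
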
Matt Pharr
c6bbfe8b54 Many fixes to the AVX builtins implementations. (Found by inspection; still not working pending further LLVM support for AVX.)
- Call the SSE versions for all the various scalar intrinsics
- Fix the names of many (all?) AVX intrinsics: all were missing the .256 suffix, and some had additional issues.
2011-07-01 16:20:03 +01:00
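
For context on the .256 suffix: LLVM distinguishes the 256-bit AVX forms of its x86 intrinsics from the 128-bit SSE forms by that suffix, so a builtin declared without it resolves to a different (or nonexistent) intrinsic. A small, hypothetical C illustration of the convention:

```c
#include <immintrin.h>

/* Sketch: the same operation at two widths. The 8-wide call lowers to
   @llvm.x86.avx.max.ps.256 in LLVM IR; the 4-wide call lowers to
   @llvm.x86.sse.max.ps -- note the .256 suffix on the AVX form. */
static __m256 max8(__m256 a, __m256 b) { return _mm256_max_ps(a, b); }
static __m128 max4(__m128 a, __m128 b) { return _mm_max_ps(a, b); }
```
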
Matt Pharr
86de910ecd Improve the implementation of __masked_store_blend_64() for the AVX target by doing two 8-wide 32-bit blends rather than serializing. Fixes issue #29. 2011-06-28 20:52:06 -07:00
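
The two-blend idea can be sketched with C intrinsics as follows; this is an illustration under stated assumptions (8 x 64-bit values, mask lanes all-ones or all-zero), not the actual builtins code, and the function name is hypothetical:

```c
#include <immintrin.h>
#include <stdint.h>

/* Sketch: store 8 x 64-bit values under a per-lane mask with two 8-wide
   32-bit blends instead of a scalar per-lane loop. Note this is a
   read-modify-write of all 64 bytes, which is what the "blend" flavor
   of masked store implies. */
static void masked_store_blend_64_sketch(double *ptr, const double *val,
                                         const int64_t *mask64) {
    for (int half = 0; half < 2; ++half) {
        __m256 old_v = _mm256_loadu_ps((const float *)(ptr + 4 * half));
        __m256 new_v = _mm256_loadu_ps((const float *)(val + 4 * half));
        __m256 m     = _mm256_loadu_ps((const float *)(mask64 + 4 * half));
        /* blendv takes new_v where the 32-bit sign bit is set; an all-ones
           64-bit mask lane sets the sign bit of both of its halves. */
        _mm256_storeu_ps((float *)(ptr + 4 * half),
                         _mm256_blendv_ps(old_v, new_v, m));
    }
}
```

Serializing would branch on each lane in turn; the blend form is a fixed handful of vector instructions regardless of the mask.
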
Matt Pharr
b84167dddd Fixed a number of issues related to memory alignment: several places
expected vector-width-aligned pointers when, in fact, there is no
guarantee of that alignment in general.

Removed the aligned memory allocation routines from some of the examples;
they're no longer needed.

No performance difference on Core 2/Core i5 CPUs; older CPUs may see some
regressions.

Still need to update the documentation for this change and finish reviewing
alignment issues in the load/store instructions generated by .cpp files.
2011-06-23 18:18:33 -07:00
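
The alignment point generalizes: aligned vector loads and stores fault on misaligned addresses, so code that cannot prove a pointer is vector-width aligned must use the unaligned forms. A minimal C sketch (function name hypothetical):

```c
#include <immintrin.h>

/* Sketch: p may come from plain malloc() or point into the middle of an
   array, so there is no 32-byte alignment guarantee. */
static void scale8(float *p, float s) {
    __m256 v = _mm256_loadu_ps(p);   /* safe at any alignment */
    /* _mm256_load_ps(p) would fault whenever p is not 32-byte aligned */
    _mm256_storeu_ps(p, _mm256_mul_ps(v, _mm256_set1_ps(s)));
}
```

On recent CPUs the unaligned forms cost the same as the aligned ones when the data happens to be aligned, consistent with the commit's note of no performance difference on Core 2/Core i5.
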
Matt Pharr
18af5226ba Initial commit. 2011-06-21 12:48:50 -07:00