Commit Graph

4 Commits

Author SHA1 Message Date
Matt Pharr
e156651190 Fix __load_masked_{32,64} to properly obey the mask. Fixes issue #28.
Fixed the implementations of these builtin functions for targets that don't have native masked load instructions so that they do no loads if the vector mask is all off, and only do an (unaligned) vector load if both the first and last element of the mask are on.  Otherwise they serialize and do scalar loads for only the active lanes.  This fixes a number of potential sources of crashes due to accessing invalid memory.
2011-07-08 11:21:11 +01:00
Matt Pharr
5a53a43ed0 Finish support for 64-bit types in stdlib. Fixes issue #14.
Add much more suppport for doubles and in64 types in the standard library, basically supporting everything for them that are supported for floats and int32s.  (The notable exceptions being the approximate rcp() and rsqrt() functions, which don't really have sensible analogs for doubles (or at least not built-in instructions).)
2011-07-07 13:25:55 +01:00
Matt Pharr
b84167dddd Fixed a number of issues related to memory alignment; a number of places
were expecting vector-width-aligned pointers where in point of fact,
there's no guarantee that they would have been in general.

Removed the aligned memory allocation routines from some of the examples;
they're no longer needed.

No perf. difference on Core2/Core i5 CPUs; older CPUs may see some
regressions.

Still need to update the documentation for this change and finish reviewing
alignment issues in Load/Store instructions generated by .cpp files.
2011-06-23 18:18:33 -07:00
Matt Pharr
18af5226ba Initial commit. 2011-06-21 12:48:50 -07:00