Commit Graph

12 Commits

Author SHA1 Message Date
Matt Pharr
c86128e8ee AVX: go back to using blend (vs. masked store) when possible.
All of the masked store calls were inhibiting putting values into
registers, which in turn led to a lot of unnecessary stack traffic.
This approach seems to give better code in the end.
2011-09-07 11:26:49 -07:00
Matt Pharr
4f451bd041 More AVX fixes
Fix RNG state initialization for 16-wide targets
Fix a number of bugs in reduce_add builtin implementations for AVX.
Fix some tests that had incorrect expected results for the 16-wide
  case.
2011-09-06 15:53:11 -07:00
Matt Pharr
08cad7a665 AVX bugfixes 2011-09-01 14:23:10 -07:00
Matt Pharr
6de494cfdb Fix AVX bug introduced in 4ab982bc16 2011-08-29 16:50:59 -07:00
Matt Pharr
4ab982bc16 Various AVX fixes (found by inspection).
Emit calls to masked_store, not masked_store_blend, when handling
  masked stores emitted by the frontend.
Fix bug in binary8to16 macro in builtins.m4
Fix bug in 16-wide version of __reduce_add_float
Remove blend function implementations for masked_store_blend for
  AVX; just forward those on to the corresponding real masked store
  functions.
2011-08-26 12:58:02 -07:00
Matt Pharr
34301e09f5 Fix incorrect comment in builtins definitions files.
(And all of the places it was cut and pasted to. :-( ).
2011-08-26 10:44:46 -07:00
Matt Pharr
7756265503 Add double-pumped AVX target (i.e., run 16-wide). Not yet tested. 2011-08-20 11:28:22 +01:00
Matt Pharr
f841b775c3 Small bugfixes in AVX builtins 2011-08-20 09:09:55 +01:00
Matt Pharr
f868a63064 Add support for scan operations across program instances (add, and, or). 2011-08-13 20:11:41 +01:00
Matt Pharr
8c534d4d74 Add reduce_equal() function to standard library. 2011-08-10 15:55:55 -07:00
Matt Pharr
d821a11c7c Fix min/max for integer types with AVX. 2011-08-04 06:24:20 -07:00
Matt Pharr
a552927a6a Cleanup implementation of target builtins code.
- Renamed stdlib-sse.ll to builtins-sse.ll (etc.) in an attempt to better indicate
the fact that the stuff in those files has a role beyond implementing stuff for
the standard library.
- Moved declarations of the various __pseudo_* functions from being done with LLVM
API calls in builtins.cpp to just straight up declarations in LLVM assembly
language in builtins.m4.  (Much less code to do it this way, and more clear what's
going on.)
2011-08-01 05:58:43 +01:00