Fix a number of tests that didn't handle the programCount == 1 case correctly.
Fix RNG state initialization for 16-wide targets.
Fix a number of bugs in reduce_add builtin implementations for AVX.
Fix some tests that had incorrect expected results for the 16-wide case.