Introduced knc2x.h, which supports 2x interleaved code generation for KNC (use the target generic-32).
This implementation is even more experimental and incomplete than knc.h, but it is already useful (mandelbrot works, for example).

knc.h:
Switched to the new intrinsic names: _mm512_set_1to16_epi32() -> _mm512_set1_epi32(), etc.
Fixed the declaration of the unspecialized template for __smear_*(), __setzero_*(), __undef_*().
Explicitly marked a few vectors in __load<>() as undefined using _mm512_undefined_*().
Fixed some implementations of __smear_*(), __setzero_*(), __undef_*() to remove unnecessary dependent instructions.
Implemented the ISPC reductions by simply calling the existing intrinsic reductions, which are slightly more efficient than our previous implementation.
Also added reductions for double types.
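
For readers unfamiliar with the pattern, the following is a minimal, portable sketch of what "unspecialized template plus specializations" and "reduction by delegating to an existing primitive" can look like. It is not the code from knc.h: the types vec16_i32, smear_i32, setzero_i32 and reduce_add_i32 are hypothetical stand-ins built on std::array so the example compiles without KNC hardware. On KNC the bodies would presumably call intrinsics such as _mm512_set1_epi32() and a reduction like _mm512_reduce_add_epi32(); that mapping is an assumption, noted in the comments.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <numeric>

// Hypothetical stand-in for a 16-lane int32 vector (knc.h wraps a 512-bit register).
struct vec16_i32 {
    std::array<int32_t, 16> v;
};

// Unspecialized template: declared but never defined, so using an
// unsupported return type fails at link time instead of silently compiling.
template <typename RetVecType> RetVecType smear_i32(int32_t x);
template <typename RetVecType> RetVecType setzero_i32();

// Specialization for the 16-lane type: broadcast one scalar to every lane.
// Assumption: real KNC code would use _mm512_set1_epi32(x) here.
template <> inline vec16_i32 smear_i32<vec16_i32>(int32_t x) {
    vec16_i32 r;
    r.v.fill(x);
    return r;
}

// Specialization producing an all-zero vector directly, without depending
// on the result of another operation.
template <> inline vec16_i32 setzero_i32<vec16_i32>() {
    vec16_i32 r;
    r.v.fill(0);
    return r;
}

// Reduction that delegates to an existing primitive rather than a
// hand-written loop. Assumption: on KNC this would be an intrinsic
// reduction such as _mm512_reduce_add_epi32().
inline int32_t reduce_add_i32(const vec16_i32 &a) {
    return std::accumulate(a.v.begin(), a.v.end(), int32_t{0});
}

// Small self-check: 16 lanes of 3 sum to 48, and a zero vector sums to 0.
inline void self_check() {
    assert(reduce_add_i32(smear_i32<vec16_i32>(3)) == 48);
    assert(reduce_add_i32(setzero_i32<vec16_i32>()) == 0);
}
```

Leaving the primary template undefined is one common way to make misuse visible early; whether knc.h does exactly this is not stated in this message.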