If a flag such as "--target=sse4,avx-x2" is given on the command line,
then the program is compiled for each of the given targets, with a separate
output file generated for each one. In addition, an output file is created
with dispatch functions that check the current system's CPU and then choose
the best available variant.
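
As a rough illustration of the dispatch idea only (not the code the compiler
actually emits), the selection step can be sketched in C as below. All names
here (isa_level, kernel_fn, kernel_sse4, kernel_avx, select_variant) are
hypothetical, and CPU detection is reduced to a parameter; a real dispatcher
would presumably query the CPU itself, e.g. via CPUID.

```c
#include <stddef.h>

/* Hypothetical ISA levels, ordered from least to most capable. */
enum isa_level { ISA_SSE2, ISA_SSE4, ISA_AVX };

/* Common signature shared by every per-target copy of one function. */
typedef int (*kernel_fn)(void);

/* Stand-ins for the per-target variants; each just reports its target. */
static int kernel_sse4(void) { return ISA_SSE4; }
static int kernel_avx(void)  { return ISA_AVX; }

struct variant {
    enum isa_level required; /* minimum ISA the variant needs */
    kernel_fn fn;
};

/* Listed best-first so that the first supported entry wins. */
static const struct variant variants[] = {
    { ISA_AVX,  kernel_avx  },
    { ISA_SSE4, kernel_sse4 },
};

/* Return the best variant the given CPU level can run, or NULL if the
 * CPU supports none of the compiled targets. */
static kernel_fn select_variant(enum isa_level cpu)
{
    for (size_t i = 0; i < sizeof variants / sizeof variants[0]; ++i) {
        if (cpu >= variants[i].required)
            return variants[i].fn;
    }
    return NULL;
}
```

With this sketch, select_variant(ISA_AVX) yields the AVX variant,
select_variant(ISA_SSE4) falls back to the SSE4 one, and a CPU below SSE4
gets NULL, which the caller has to handle.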
Issue #11.
Old run_tests.sh still lives (for now).
Changes include:
- Tests are run in parallel across all of the available CPU cores
- Option to create a statically-linked executable for each test
(rather than using the LLVM JIT). This is particularly useful
for AVX, which doesn't have good JIT support yet.
- Static executables also make it possible to test x86 codegen,
not just x86-64.
- Fixed a number of tests in failing_tests that were actually
failing only because the expected function signature of tests
had changed.