72 Commits

Author SHA1 Message Date
Dmitry Babokin
c7606bb93b Merge pull request #623 from ifilippov/testing
adding --extra option and correcting paths to ispc compiler
2013-10-10 04:52:22 -07:00
Ilia Filippov
0d9594354a adding --extra option and correcting paths to ispc compiler 2013-10-10 15:38:08 +04:00
Dmitry Babokin
959a741c14 Merge pull request #612 from dbabokin/missing_target
Adding support for avx1-i64x4 target in test system.
2013-10-01 07:43:26 -07:00
Dmitry Babokin
4a908cf38d Merge pull request #614 from ifilippov/testing
pipe correction and some other small changes in test system (fixes problems with alloy.py on MacOS)
2013-10-01 07:35:48 -07:00
Ilia Filippov
b2cf0209b1 pipe correction and some other small changes in test system 2013-10-01 18:01:29 +04:00
Dmitry Babokin
758efebb3c Add missing testing support for avx1-i64x4 target 2013-09-30 17:54:59 +04:00
jbrodman
39c2274f1a Merge pull request #588 from egaburov/knc-modes
Added knc-i1x16.h , knc-i1x8.h and knc-i1x8unsafe_fast.h
2013-09-27 11:20:56 -07:00
Ilia Filippov
1c858c34f7 correction of test system 2013-09-26 14:54:15 +04:00
Ilia Filippov
af5da885a5 small corrections of test system 2013-09-23 18:18:48 +04:00
egaburov
d68dbbc7bc Merge remote-tracking branch 'upstream/master' into knc-modes 2013-09-19 15:08:17 +02:00
Ilia Filippov
00cd90c6b0 test system 2013-09-19 12:26:57 +04:00
evghenii
922edb1128 completed knc-i1x16.h and added knc-i1x8.h with knc-i1x8unsafe_fast.h that doesn't pass several tests. 2013-09-18 18:14:07 +03:00
Dmitry Babokin
3f2217646e Merge pull request #562 from mmp/arm
New target naming scheme, new targets (SSE4-i8x16 and SSE4-i16x8), plus some cleanup and improvements.
2013-08-22 08:33:25 -07:00
Dmitry Babokin
f31a31478b Moving time calculation earlier 2013-08-22 12:41:57 +04:00
Dmitry Babokin
5fb30939be Fix for #564, using wrong ispc in run_tests.py 2013-08-21 19:46:18 +04:00
Dmitry Babokin
60b413a9cb Adding --non-interactive switch to run_tests.py 2013-08-21 19:25:30 +04:00
Matt Pharr
0c5742b6f8 Implement new naming scheme for --target.
Now targets are named like "<isa>-i<mask size>x<gang size>", e.g.
"sse4-i8x16", or "avx2-i32x16".

The old target names are still supported.
2013-08-08 19:23:44 -07:00
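
As an illustration of the naming scheme described in the commit above, here is a minimal Python sketch that splits a new-style target name into its three parts. The `Target` tuple, the `parse_target` helper, and the regular expression are hypothetical names introduced for this example and are not taken from ispc or run_tests.py. The old target names, which the commit says are still supported, are deliberately not handled here.

```python
import re
from typing import NamedTuple


class Target(NamedTuple):
    """Hypothetical container for the parts of a new-style target name."""
    isa: str
    mask_bits: int
    gang_size: int


# Pattern for "<isa>-i<mask size>x<gang size>", e.g. "sse4-i8x16".
# Assumption: the ISA part consists only of letters and digits.
_TARGET_RE = re.compile(r"^(?P<isa>[a-z0-9]+)-i(?P<mask>\d+)x(?P<gang>\d+)$")


def parse_target(name: str) -> Target:
    """Split a new-style target name into ISA, mask size, and gang size."""
    match = _TARGET_RE.match(name.lower())
    if match is None:
        # Old-style names are outside the scope of this sketch.
        raise ValueError(f"not a new-style target name: {name!r}")
    return Target(match["isa"], int(match["mask"]), int(match["gang"]))


if __name__ == "__main__":
    for example in ("sse4-i8x16", "avx2-i32x16", "avx1-i64x4"):
        print(example, "->", parse_target(example))
```

Under these assumptions, `parse_target("sse4-i8x16")` yields the ISA `sse4`, an 8-bit mask, and a gang size of 16.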
Matt Pharr
ab3b633733 Add 8-bit and 16-bit specialized NEON targets.
Like SSE4-8 and SSE4-16, these use 8-bit and 16-bit values for mask
elements, respectively, and thus should generate the best code when used
for computation with datatypes of those sizes.
2013-07-30 08:44:16 -07:00
Matt Pharr
780b0dfe47 Add SSE4-16 target.
Along the lines of sse4-8, this is an 8-wide target for SSE4, using
16-bit elements for the mask.  It's thus (in principle) the best
target for SIMD computation with 16-bit datatypes.
2013-07-25 09:46:01 -07:00
Matt Pharr
f7f281a256 Choose type for integer literals to match the target mask size (if possible).
On a target with a 16-bit mask (for example), we would choose the type
of an integer literal "1024" to be an int16.  Previously, we used an int32,
which is a worse fit and leads to less efficient code than an int16
on a 16-bit mask target.  (However, we'd still give an integer literal
1000000 the type int32, even in a 16-bit target.)

Updated the tests to still pass with 8 and 16-bit targets, given this
change.
2013-07-23 17:24:50 -07:00
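
The literal-typing rule in the commit above can be sketched in Python as follows. This is an illustration only, not the compiler's implementation: the `literal_type` helper is a hypothetical name, and the fallback order (try the mask-sized signed type first, then each wider signed type) is an assumption. The commit message itself only states the two cases for a 16-bit mask.

```python
def literal_type(value: int, mask_bits: int) -> str:
    """Pick a signed integer type name for a literal on a given mask width.

    Assumption: the mask-sized type is tried first, then wider types,
    in the order int8, int16, int32, int64.
    """
    widths = (8, 16, 32, 64)
    if mask_bits not in widths:
        raise ValueError(f"unsupported mask width for this sketch: {mask_bits}")
    for width in widths:
        if width < mask_bits:
            continue  # never pick a type narrower than the mask
        low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
        if low <= value <= high:
            return f"int{width}"
    raise OverflowError(f"literal {value} does not fit in 64 bits")


if __name__ == "__main__":
    print(literal_type(1024, 16))     # the commit's first example
    print(literal_type(1000000, 16))  # the commit's second example
```

With a 16-bit mask, this sketch gives `int16` for 1024 and `int32` for 1000000, matching the two cases the commit describes.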
Matt Pharr
d7b0c5794e Add support for ARM NEON targets.
Initial support for ARM NEON on Cortex-A9 and A15 CPUs.  All but ~10 tests
pass, and all examples compile and run correctly.  Most of the examples
show a ~2x speedup on a single A15 core versus scalar code.

Current open issues/TODOs
- Code quality looks decent, but hasn't been carefully examined.  Known
  issues/opportunities for improvement include:
  - fp32 vector divide is done as a series of scalar divides rather than
    a vector divide (which I believe exists, but I may be mistaken.)
    This is particularly harmful to examples/rt, which only runs ~1.5x
    faster with ispc, likely due to long chains of scalar divides.
  - The compiler isn't generating a vmin.f32 for e.g. the final scalar
    min in reduce_min(); instead it's generating a compare and then a
    select instruction (and similarly elsewhere).
  - There are some additional FIXMEs in builtins/target-neon.ll that
    include both a few pieces of missing functionality (e.g. rounding
    doubles) as well as places that deserve attention for possible
    code quality improvements.

- Currently only the "cortex-a9" and "cortex-a15" CPU targets are
  supported; LLVM supports many other ARM CPUs and ispc should provide
  access to all of the ones that have NEON support (and aren't too
  obscure.)

- ~5 of the reduce-* tests hit an assertion inside LLVM (unfortunately
   only when the compiler runs on an ARM host, though).

- The Windows build hasn't been tested (though I've tried to update
  ispc.vcxproj appropriately).  It may just work, but will more likely
  have various small issues.

- Anything related to 64-bit ARM has seen no attention.
2013-07-19 23:07:24 -07:00
Ilia Filippov
cc32d913a0 replacement of qsize due to its failures on MacOS 2013-06-25 16:27:25 +04:00
Ilia Filippov
8642b4d89f changing run_tests to support skipping tests and time 2013-06-13 19:25:34 +04:00
Ilia Filippov
6fb70c307d changing run_tests to support skipping tests and time 2013-06-13 19:00:02 +04:00
Ilia Filippov
d08346fbcf changes to support skipping tests 2013-06-13 16:47:10 +04:00
jbrodman
2027a6ac12 Merge pull request #483 from dbabokin/win_rm_files
Fix for removing temp files on Windows
2013-04-30 11:03:48 -07:00
Dmitry Babokin
9cd84aeea9 Fix for removing temp files on Windows 2013-04-25 22:50:37 +02:00
Dmitry Babokin
cbb0d6ce06 Don't run many threads when only one test is specified 2013-04-25 21:12:16 +04:00
Dmitry Babokin
14fe987956 More portable way of doing print in run_tests.py 2013-04-24 17:11:30 +02:00
Dmitry Babokin
c5acf239f2 Pass lock as a parameter to subprocesses to make task counter work on Windows 2013-04-24 01:14:46 +02:00
Dmitry Babokin
a02500b112 Make update look good in standard 80 char terminal (print in single line).
Issue a message about used compiler only once on Windows.
2013-04-24 00:21:18 +02:00
Dmitry Babokin
8fea85a85c Pass total number of tests as explicit parameter to subprocesses, so it works on Windows 2013-04-23 23:47:21 +02:00
Dmitry Babokin
6f42bfc640 Fixing native testing on Windows
All temporary files are stored in tmp* directories, including generic targets
Generic targets are handled correctly on Windows now (still fail for different reasons)
2013-04-23 22:47:38 +02:00
Dmitry Babokin
f76eb2b7f5 Merge pull request #460 from Vsevolod-Livinskij/master
Fix for issue #453
2013-04-04 08:59:49 -07:00
Vsevolod Livinskij
4ea08116b8 Issue #453: now run_tests.py checks ispc_exe availability 2013-04-03 18:04:11 +04:00
Vsevolod Livinskij
78e03c6402 Issue #453: now run_tests.py checks ispc_exe availability otherwise prints an error message and exits 2013-04-03 16:51:16 +04:00
Vsevolod Livinskij
4ab89de343 Issue #453: now run_tests.py checks ispc_exe availability otherwise prints an error message 2013-04-03 02:58:16 +04:00
Vsevolod Livinskij
6db460fb81 Issue #453: now run_tests.py checks ispc_exe availability otherwise prints an error message 2013-04-03 02:34:42 +04:00
Dmitry Babokin
be859df51e Fix for #457 - issue with compiler Unicode output 2013-04-03 02:23:06 +04:00
Vsevolod Livinskij
9e0425e824 Checks the required compiler otherwise prints an error message and exits program 2013-03-29 18:42:02 +04:00
Vsevolod Livinskij
2960479095 Issue 2013-03-29 18:30:23 +04:00
Matt Pharr
9a1932eaf7 Only set gcc's "-msse4.2", etc, option when compiling for generic targets.
We don't need it when ispc is just generating an object file directly, and gcc
on OS X doesn't recognize -mavx.
2012-07-13 12:02:05 -07:00
Jean-Luc Duprat
3c070e5e20 run_tests.py will only attempt to use the -mmic flag when the knc.h header is used 2012-07-10 17:07:56 -07:00
Jean-Luc Duprat
dde599f48f run_tests.py now picks the ISA via a -m flag based on the target selected, rather than always picking -msse4.2;
this is needed because -msse4.2 is not supported on KNC.
2012-07-10 16:39:18 -07:00
Matt Pharr
7a2142075c Add examples/intrinsics/generic-32.h implementation.
Roughly 100 tests fail with this; all the tests need to be audited
for assumptions that 16 is the widest width possible…
2012-05-25 12:37:59 -07:00
Matt Pharr
15ea0af687 Add -f option to run_tests.py
This allows providing additional command-line arguments to ispc,
e.g. to force compilation with -O1, -g, etc.
2012-05-05 15:47:24 -07:00
Matt Pharr
d99bd279e8 Add generic-32 target. 2012-05-03 11:11:06 -07:00
Matt Pharr
e4b3d03da5 When available, use ANSI escapes to colorize diagnostic output.
Issue #245.
2012-04-19 11:36:28 -07:00
Matt Pharr
0575b1f38d Update run_tests and examples makefile for scalar target.
Fixed a number of tests that didn't handle the programCount == 1
case correctly.
2012-01-29 16:22:25 -08:00
Matt Pharr
1acf4032c2 Merge branch 'master' of https://github.com/jduprat/ispc 2012-01-26 14:18:25 -08:00