Old run_tests.sh still lives (for now).
Changes include:
- Tests are run in parallel across all of the available CPU cores
- Option to create a statically-linked executable for each test
(rather than using the LLVM JIT). This is particularly useful
for AVX, which doesn't have good JIT support yet.
- Static executables also make it possible to test x86 codegen,
not just x86-64.
- Fixed a number of tests in failing_tests that were actually
failing because the expected function signature for tests had
changed.
Emit calls to masked_store, not masked_store_blend, when handling
masked stores emitted by the frontend.
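For context, here's a minimal ISPC-style sketch (the function and variable
names are made up) of the kind of code that makes the frontend emit a masked
store: a store performed under a varying condition, where only the active
program instances may write their lanes. A real masked store writes just
those lanes, whereas the blend-based masked_store_blend path presumably does
a read-modify-write of the full vector.

    // Hypothetical example; the store under the varying condition must only
    // write the lanes of the active program instances, so the compiler
    // lowers it to a masked store rather than a full-width store.
    export void clampNegative(uniform float vals[], uniform int count) {
        foreach (i = 0 ... count) {
            if (vals[i] < 0.0f)   // varying condition -> partial execution mask
                vals[i] = 0.0f;   // frontend emits a masked store here
        }
    }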
Fix bug in binary8to16 macro in builtins.m4
Fix bug in 16-wide version of __reduce_add_float
Remove the blend-based implementations of masked_store_blend for
AVX; just forward those calls on to the corresponding real masked
store functions.
Compute a "local" min/max across the active program instances and
then do a single atomic memory op.
Added a few tests to exercise global min/max atomics (which were
previously untested!).
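A hedged sketch of the pattern at the ISPC level, using reduce_min() and
atomic_min_global() from the standard library (the uniform-value overload
of the atomic is assumed here, and the function and variable names are
illustrative):

    // One atomic per gang rather than one per program instance.
    export void updateGlobalMin(uniform int seen[], uniform int count,
                                uniform int * uniform globalMin) {
        foreach (i = 0 ... count) {
            // "local" minimum across the currently active program instances
            uniform int localMin = reduce_min(seen[i]);
            // single atomic memory op for the whole gang
            atomic_min_global(globalMin, localMin);
        }
    }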
We had been prohibiting Windows users from providing #defines on the command
line, which has been the wrong thing to do ever since we switched to using the
clang preprocessor.
If no CPU is specified, use the host CPU type rather than defaulting to "nehalem".
Provide better feature strings to the LLVM target machinery.
-> This ensures that LLVM doesn't generate post-SSE2 instructions for the SSE2
target (fixes issue #82).
-> Slight improvements to the generated code, which now uses cmovs.
Use the LLVM popcnt intrinsic for the SSE2 target now (it generates code
that doesn't use the popcnt instruction, now that we properly tell LLVM
which instructions are and aren't available for SSE2).
- Only have a single copy of all of the tasks_*.cpp sample implementations,
stored in examples/.
- Reduce dynamic storage allocation and locking in task launch code paths.
- Remove the hard limit on the number of tasks that can be launched on
Windows (fixes issue #85); see the launch sketch after this list.
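For reference, a minimal hypothetical example of the language-level launch
construct that these code paths serve (the chunking scheme and all names here
are made up, and n is assumed to be an exact multiple of span):

    // Each chunk becomes one task; there's no longer a fixed cap on how
    // many tasks a program can launch on Windows.
    task void scaleChunk(uniform float data[], uniform int span) {
        uniform int start = taskIndex * span;
        foreach (i = start ... start + span)
            data[i] *= 2.0f;
    }

    export void scaleAll(uniform float data[], uniform int n, uniform int span) {
        launch[n / span] scaleChunk(data, span);
        sync;
    }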
Modified this example to use reduce_equal() to see if all of the program
instances want to load the 8 sample values around the same voxel. When
this is the case, we can just do 8 scalar loads, rather than needing to
do a fully general gather. Once this check fails, it isn't done again,
since it's not likely to start succeeding in the future. This gives
a ~10% speedup with the low-res data set, and basically no performance
difference with the high-res one. (It makes sense that the lower-resolution
the voxel sampling, the longer all of the rays will stay in the same set
of voxels.)
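A hedged sketch of that check, assuming the two-argument reduce_equal()
overload from the standard library that also returns the common value
(the names here are illustrative, and only one of the 8 loads is shown):

    // If all active program instances want the same sample, a uniform
    // (scalar) load suffices; otherwise fall back to a general gather
    // and stop checking, since the rays are unlikely to reconverge.
    float lookupDensity(uniform float density[], int voxelIndex,
                        uniform bool * uniform checkCoherence) {
        if (*checkCoherence) {
            uniform int uniformIndex;
            if (reduce_equal(voxelIndex, &uniformIndex))
                return density[uniformIndex];   // scalar load
            *checkCoherence = false;            // divergence: don't re-check
        }
        return density[voxelIndex];             // fully general gather
    }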
Set the Module's target appropriately when it's first created.
Compile separate 32- and 64-bit versions of the builtins-c bitcode
and load the appropriate one based on the target we're compiling
for.
Just pulling out the elements and doing a set of scalar equality tests
is the best approach for those (nearly 2x better than the rotate and
vector equality check that we use for 32-bit values).
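Roughly, the two strategies look like this at the ISPC level, assuming
extract(), rotate(), and all() from the standard library and that "those"
refers to 64-bit values (as the contrast with 32-bit suggests); the real
implementation lives in the builtins:

    // Pull out each element and compare scalars (better for 64-bit values).
    uniform bool allEqualScalar(int64 v) {
        uniform int64 first = extract(v, 0);
        for (uniform int i = 1; i < programCount; ++i)
            if (extract(v, i) != first)
                return false;
        return true;
    }

    // Rotate-and-compare vector check (used for the 32-bit case).
    uniform bool allEqualRotate(int32 v) {
        return all(v == rotate(v, 1));
    }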
These give slightly wrong results for zero and denormals and also
don't handle Inf/NaN correctly, but they are much more efficient
than the full versions of these routines.