aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Ilia Filippov	f620cdbaa1	Changes in perf.py functionality, unification of examples, correction of build warnings	2013-08-26 14:04:59 +04:00
Matt Pharr	d7b0c5794e	Add support for ARM NEON targets. Initial support for ARM NEON on Cortex-A9 and A15 CPUs. All but ~10 tests pass, and all examples compile and run correctly. Most of the examples show a ~2x speedup on a single A15 core versus scalar code. Current open issues/TODOs - Code quality looks decent, but hasn't been carefully examined. Known issues/opportunities for improvement include: - fp32 vector divide is done as a series of scalar divides rather than a vector divide (which I believe exists, but I may be mistaken.) This is particularly harmful to examples/rt, which only runs ~1.5x faster with ispc, likely due to long chains of scalar divides. - The compiler isn't generating a vmin.f32 for e.g. the final scalar min in reduce_min(); instead it's generating a compare and then a select instruction (and similarly elsewhere). - There are some additional FIXMEs in builtins/target-neon.ll that include both a few pieces of missing functionality (e.g. rounding doubles) as well as places that deserve attention for possible code quality improvements. - Currently only the "cortex-a9" and "cortex-a15" CPU targets are supported; LLVM supports many other ARM CPUs and ispc should provide access to all of the ones that have NEON support (and aren't too obscure.) - ~5 of the reduce-* tests hit an assertion inside LLVM (unfortunately only when the compiler runs on an ARM host, though). - The Windows build hasn't been tested (though I've tried to update ispc.vcxproj appropriately). It may just work, but will more likely have various small issues. - Anything related to 64-bit ARM has seen no attention.	2013-07-19 23:07:24 -07:00
Matt Pharr	54459255d4	Add unmasked { } statement. This reestablishes an "all on" execution mask for the gang, which can be useful for nested parallelism.	2012-06-22 14:30:58 -07:00
Matt Pharr	e3341176c5	Redo makefiles for the examples. They're all based off a common examples/common.mk file, so that individual makefiles are quite simple now. The common.mk file also provides targets to build the examples using C++ output with the generic-16h or sse4.h files. These targets don't run by default, but do run if 'make all' is run.	2012-01-04 12:59:03 -08:00
Matt Pharr	8bc7367109	Add foreach and foreach_tiled looping constructs. These make it easier to iterate over arbitrary amounts of data elements; specifically, they automatically handle the "ragged extra bits" that come up when the number of elements to be processed isn't evenly divided by programCount. TODO: documentation	2011-11-30 13:17:31 -08:00
Matt Pharr	975db80ef6	Add support for pointers to the language. Pointers can be either uniform or varying, and behave correspondingly. e.g.: "uniform float * varying" is a varying pointer to uniform float data in memory, and "float * uniform" is a uniform pointer to varying data in memory. Like other types, pointers are varying by default. Pointer-based expressions, & and *, sizeof, ->, pointer arithmetic, and the array/pointer duality all behave as in C. Array arguments to functions are converted to pointers, also like C. There is a built-in NULL for a null pointer value; conversion from compile-time constant 0 values to NULL still needs to be implemented. Other changes: - Syntax for references has been updated to be C++ style; a useful warning is now issued if the "reference" keyword is used. - It is now illegal to pass a varying lvalue as a reference parameter to a function; references are essentially uniform pointers. This case had previously been handled via special case call by value return code. That path has been removed, now that varying pointers are available to handle this use case (and much more). - Some stdlib routines have been updated to take pointers as arguments where appropriate (e.g. prefetch and the atomics). A number of others still need attention. - All of the examples have been updated - Many new tests TODO: documentation	2011-11-27 13:09:59 -08:00
Matt Pharr	ce7355f9ed	Windows: fix examples build to look for ispc.exe in ../.. as well	2011-10-09 07:40:18 -07:00
Matt Pharr	bedaec2295	Update examples for multi-target compilation. Makefile and vcxproj file updates. Also modified vcxproj files so that the various files ispc generates go into $(TargetDir), not the current directory. Modified the ray tracer example to not have uniform short-vector types in its app-visible datatypes (these are laid out differently on SSE vs AVX); there was an existing lurking bug in the way this was done before.	2011-10-04 16:01:56 -07:00
Matt Pharr	880cbb18cc	Remove checks to see if system's processor matches the target the code was compiled for. (Preparation for multi-target output.)	2011-10-04 16:01:55 -07:00
Matt Pharr	cb7976bbf6	Added updated task launch implementation that now tracks task groups. Within each function that launches tasks, we now can easily track which tasks that function launched, so that the sync at the end of the function can just sync on the tasks launched by that function (not all tasks launched by all functions.) Implementing this led to a rework of the task system API that ispc generates code to call; the example task systems in examples/tasksys.cpp have been updated to conform to this API. (The updated API is also documented in the ispc user's guide.) As part of this, "launch[n]" syntax was added to launch a number of tasks in a single launch statement, rather than requiring a loop over 'n' to launch n tasks. This commit thus fixes issue #84 (enhancement to launch multiple tasks from a single launch statement) as well as issue #105 (recursive task launches were broken).	2011-09-30 11:20:53 -07:00
Matt Pharr	99221f7d17	Fix a few places in examples where C reference implementation had a double-precision fp constant undesirably causing computation to be done in double precision. Makes C scalar versions of the options pricing models, rt, and aobench 3-5% faster. Makes scalar version of noise about 15% faster. Others are unchanged.	2011-09-01 16:31:22 -07:00
Matt Pharr	96af08e789	Print notices about image files being written	2011-08-16 06:31:26 +01:00
Matt Pharr	6b0a6c0124	Fix issue #67: don't crash ungracefully if target ISA not supported on system. - In the ispc-generated header files, a #define now indicates which compilation target was used. - The examples use utility routines from the new file examples/cpuid.h to check the system's CPU's capabilities to see if it supports the ISA that was used for compiling the example code and print error messages if things aren't going to work...	2011-07-18 12:29:43 +01:00
Matt Pharr	46ccc251c8	Added C preprocessor support for Windows. Link the appropriate clang libraries to make the preprocessor stuff work on Windows builds. Also updated the solution files for the examples to stop using cl.exe for preprocessing but to just call ispc directly. Finishes fixes for issue #32.	2011-07-04 05:01:04 -07:00
Matt Pharr	bcae21dbca	Update examples to use fpmath:fast and to enable intrinsics on Windows	2011-06-30 13:17:14 -07:00
Andreas Wendleder	39542f420a	Ignore built files.	2011-06-23 16:06:38 -07:00
Matt Pharr	e5bc6cd67c	Update examples/ Makefiles to make x86-64 explicit in compiler flags	2011-06-23 10:00:07 -07:00
Matt Pharr	18af5226ba	Initial commit.	2011-06-21 12:48:50 -07:00

18 Commits