aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Matt Pharr	82aa6efd12	Checkpoint user's guide edits	2011-12-01 13:38:17 -08:00
Matt Pharr	8bc7367109	Add foreach and foreach_tiled looping constructs These make it easier to iterate over arbitrary amounts of data elements; specifically, they automatically handle the "ragged extra bits" that come up when the number of elements to be processed isn't evenly divided by programCount. TODO: documentation	2011-11-30 13:17:31 -08:00
Matt Pharr	11547cb950	stdlib updates to take advantage of pointers The packed_{load,store}_active functions now take a pointer to a location at which to start loading/storing, rather than an array base and a uniform index. Variants of the prefetch functions that take varying pointers are now available. There are now variants of the various atomic functions that take varying pointers (issue #112).	2011-11-29 15:41:38 -08:00
Matt Pharr	975db80ef6	Add support for pointers to the language. Pointers can be either uniform or varying, and behave correspondingly. e.g.: "uniform float * varying" is a varying pointer to uniform float data in memory, and "float * uniform" is a uniform pointer to varying data in memory. Like other types, pointers are varying by default. Pointer-based expressions, & and *, sizeof, ->, pointer arithmetic, and the array/pointer duality all behave as in C. Array arguments to functions are converted to pointers, also like C. There is a built-in NULL for a null pointer value; conversion from compile-time constant 0 values to NULL still needs to be implemented. Other changes: - Syntax for references has been updated to be C++ style; a useful warning is now issued if the "reference" keyword is used. - It is now illegal to pass a varying lvalue as a reference parameter to a function; references are essentially uniform pointers. This case had previously been handled via special case call by value return code. That path has been removed, now that varying pointers are available to handle this use case (and much more). - Some stdlib routines have been updated to take pointers as arguments where appropriate (e.g. prefetch and the atomics). A number of others still need attention. - All of the examples have been updated - Many new tests TODO: documentation	2011-11-27 13:09:59 -08:00
Matt Pharr	ce7355f9ed	Windows: fix examples build to look for ispc.exe in ../.. as well	2011-10-09 07:40:18 -07:00
Matt Pharr	bedaec2295	Update examples for multi-target compilation. Makefile and vcxproj file updates. Also modified vcxproj files so that the various files ispc generates go into $(TargetDir), not the current directory. Modified the ray tracer example to not have uniform short-vector types in its app-visible datatypes (these are laid out differently on SSE vs AVX); there was an existing lurking bug in the way this was done before.	2011-10-04 16:01:56 -07:00
Matt Pharr	880cbb18cc	Remove checks to see if system's processor matches the target the code was compiled for. (Preparation for multi-target output.)	2011-10-04 16:01:55 -07:00
Matt Pharr	9b7f55a28e	Add buildall.bat script for Windows. Also various example build fixes for Windows	2011-10-04 11:42:04 -07:00
Matt Pharr	e4d224a0f1	Use __cilk to detect Cilk support	2011-10-04 11:16:42 -07:00
Matt Pharr	0933a77c1b	Improve task decomposition in ray tracing example. Specifically, launch all of the tasks in one statement, rather than still looping over spans in y and launching a collection of tasks across x for each span. This seems to give a few percent better performance.	2011-10-04 09:33:59 -07:00
Matt Pharr	5f78edf07a	Fix bug with screen decomposition in volume rendering example	2011-10-04 09:30:02 -07:00
Matt Pharr	0b02f94988	Task system performance tweaks. Switch back to GCD on OSX. Increase TaskInfo allocation count. This fixes the regression with deferred on AVX (from 17x to 25x again with 4 cores.)	2011-10-01 08:04:09 -07:00
Matt Pharr	65c50b60fc	Cleanups to deferred shading workload	2011-09-30 20:35:42 -07:00
Matt Pharr	f8f25a11b6	Added deferred shading workload	2011-09-30 19:42:14 -07:00
Matt Pharr	cb7976bbf6	Added updated task launch implementation that now tracks task groups. Within each function that launches tasks, we now can easily track which tasks that function launched, so that the sync at the end of the function can just sync on the tasks launched by that function (not all tasks launched by all functions.) Implementing this led to a rework of the task system API that ispc generates code to call; the example task systems in examples/tasksys.cpp have been updated to conform to this API. (The updated API is also documented in the ispc user's guide.) As part of this, "launch[n]" syntax was added to launch a number of tasks in a single launch statement, rather than requiring a loop over 'n' to launch n tasks. This commit thus fixes issue #84 (enhancement to launch multiple tasks from a single launch statement) as well as issue #105 (recursive task launches were broken).	2011-09-30 11:20:53 -07:00
Matt Pharr	8f3e46f67e	Use InterlockedExchangeAdd on Windows	2011-09-29 16:19:59 -07:00
Matt Pharr	d45c536c47	Fix Windows debug build of simple example	2011-09-28 14:11:32 -07:00
Matt Pharr	9052d4b10b	Linux build fixes	2011-09-17 13:42:46 -07:00
Matt Pharr	2405dae8e6	Use malloc() to get space for task arguments when compiling to AVX. This is to work around the LLVM bug/limitation discussed in LLVM bug 10841 (http://llvm.org/bugs/show_bug.cgi?id=10841).	2011-09-17 13:38:51 -07:00
Matt Pharr	a501ab1aa6	Fix parenthesization bugs in cost estimates. Also added the debugging print that helped find these issues. Revert inlining some functions in examples	2011-09-16 19:07:07 -07:00
Matt Pharr	cdc850f98c	Inline some functions in examples	2011-09-16 17:02:21 -07:00
Matt Pharr	30f9dcd4f5	Unroll loops by default, add --opt=disable-loop-unroll to disable. Issue #78.	2011-09-13 15:37:18 -07:00
Matt Pharr	0c344b6755	Fix Linux build of mandelbrot_tasks example	2011-09-13 15:17:30 -07:00
Matt Pharr	5dedb6f836	Add --scale command line argument to mandelbrot and rt examples. This applies a floating-point scale factor to the image resolution; it's useful for experiments with many-core systems where the base image resolution may not give enough work for good load-balancing with tasks.	2011-09-07 20:07:51 -07:00
Matt Pharr	2ea6d249d5	Fix mapping to 8, 16 program instances in AO bench example. With this, we now compute a correct image with AVX.	2011-09-07 11:34:24 -07:00
Matt Pharr	375f1cb8e8	Make octaves and octaves loop uniform in noise example	2011-09-07 10:34:23 -07:00
Matt Pharr	effe901890	Add task-parallel version of aobench	2011-09-07 05:43:21 -07:00
Matt Pharr	b5bfa43e92	Fix error with float suffixes	2011-09-02 13:09:25 -07:00
Matt Pharr	99221f7d17	Fix a few places in examples where C reference implementation had a double-precision fp constant undesirably causing computation to be done in double precision. Makes C scalar versions of the options pricing models, rt, and aobench 3-5% faster. Makes scalar version of noise about 15% faster. Others are unchanged.	2011-09-01 16:31:22 -07:00
Matt Pharr	a94cabc692	Modify stencil example to do separate runs with and without task parallelism.	2011-08-30 05:08:21 -07:00
Matt Pharr	33feeffe5d	Update timing header so it works with C code	2011-08-29 11:23:43 -07:00
Matt Pharr	74c2c8ae07	Linux build fixes	2011-08-17 07:08:44 -07:00
Matt Pharr	206c851146	Various improvements to example task systems in examples/. - Only have a single copy of all of the tasks_*.cpp sample implementations, stored in examples/. - Reduce dynamic storage allocation and locking in task launch code paths. - Don't have a hard limit of the number of tasks that can be launched on Windows (fix issue #85).	2011-08-17 14:31:45 +01:00
Matt Pharr	60bdf1ef8a	Modify rt example to also do a set of runs with tasks + SPMD together.	2011-08-17 13:14:32 +01:00
Matt Pharr	d7662b3eb9	Use reduce_equal() in volume rendering example to avoid some gathers. Modified this example to use reduce_equal() to see if all of the program instances want to load the 8 sample values around the same voxel. When this is the case, we can just do 8 scalar loads, rather than needing to do a fully general gather. Once this check fails, it isn't done again, since it's not likely to start succeeding in the future. This gives a ~10% speedup with the low-res data set, and basically no performance difference with the high-res one. (It makes sense that the lower-resolution the voxel sampling, the longer all of the rays will stay in the same set of voxels.)	2011-08-17 12:37:07 +01:00
Matt Pharr	ecaa57c7c6	Add volume rendering example. (~2.3x speedup from SIMD vs serial code.)	2011-08-17 12:05:37 +01:00
Matt Pharr	fce183c244	Merge branch 'master' of github.com:ispc/ispc	2011-08-17 10:32:49 +01:00
Matt Pharr	7a92f8b3f9	Add MSVC build support for stencil example	2011-08-17 02:28:49 -07:00
Matt Pharr	96af08e789	Print notices about image files being written	2011-08-16 06:31:26 +01:00
Matt Pharr	c570108026	Fix linux build of stencil example	2011-08-15 04:44:17 -07:00
Matt Pharr	137ea7bde6	Rename semaphore filename to be more generic	2011-08-04 05:28:00 -07:00
Matt Pharr	e05b3981d9	Add stencil example	2011-08-03 13:49:02 -07:00
Matt Pharr	a4bb6b5520	Add new example with implementation of Perlin Noise ~4.2x speedup versus serial on OSX / gcc. ~2.9x speedup versus serial on Windows / MSVC.	2011-08-01 10:33:18 +01:00
Matt Pharr	80ca02af58	Add missing #include, fix Linux build. Fixes issue #75 .	2011-07-27 10:51:13 +01:00
Matt Pharr	bba7211654	Add support for int8/int16 types. Addresses issues #9 and #42 .	2011-07-21 06:57:40 +01:00
Matt Pharr	6b0a6c0124	Fix issue #67 : don't crash ungracefully if target ISA not supported on system. - In the ispc-generated header files, a #define now indicates which compilation target was used. - The examples use utility routines from the new file examples/cpuid.h to check the system's CPU's capabilities to see if it supports the ISA that was used for compiling the example code and print error messages if things aren't going to work...	2011-07-18 12:29:43 +01:00
Matt Pharr	213c3a9666	Release notes, bump doxygen version # for next release. Add more .gitignore stuff.	2011-07-17 16:52:36 +02:00
Matt Pharr	6e4c165c7e	Use malloc to allocate storage for task parameters on Windows. Fixes bug #55. A number of tests were crashing on Windows due to the task launch code using alloca to allocate space for the tasks' parameters. On Windows, the stack isn't generally big enough for this to be a good idea. Also added an alignment parameter to ISPCMalloc() to pass the alignment requirement along.	2011-07-06 05:53:25 -07:00
Matt Pharr	46ccc251c8	Added C preprocessor support for Windows. Link the appropriate clang libraries to make the preprocessor stuff work on Windows builds. Also updated the solution files for the examples to stop using cl.exe for preprocessing but to just call ispc directly. Finishes fixes for issue #32.	2011-07-04 05:01:04 -07:00
Matt Pharr	6ed6961958	Add checks to sample task systems to ensure that TasksInit has been called; if not, print an informative error message.	2011-07-01 14:11:16 +01:00

258 Commits