aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Matt Pharr	1ab05c0351	Set a preprocessor #define based on the target ISA. For example, ISPC_TARGET_SSE4 is #defined for the sse4 targets, etc.	2011-10-15 12:00:42 -07:00
Matt Pharr	c21e704a5c	Fix LLVM 2.9 build. Issue #114	2011-10-15 06:48:20 -07:00
Matt Pharr	9f2aa8d92a	Handle ConstantExpressions when computing address+offset vectors for scatter/gather. In particular, this fixes issue #81, where a global variable access was leading to ConstantExpressions showing up in this code, which it wasn't previously expecting.	2011-10-14 11:20:08 -07:00
Matt Pharr	2460fa5c83	Improve gather/scatter optimization passes to handle loops better. Specifically, now we can work through phi nodes in the IR to detect cases where an index value is actually the same across lanes or is linear across the lanes. For example, this is a loop that used to require gathers but is now turned into vector loads: for (int i = programIndex; i < 16; i += programCount) sum += a[i]; Fixes issue #107.	2011-10-13 17:01:25 -07:00
Matt Pharr	dce25249ce	Use the "avoid masked assignments when possible" tricks for pre/post decrement exprs. Also, call out to the subroutine that handles this logic for dealing with call-by-value-return stuff in function calls.	2011-10-13 16:46:30 -07:00
Matt Pharr	61adc74072	Add missing builtins-sse4-common.ll file	2011-10-11 19:40:37 -07:00
Matt Pharr	88e317f1a9	These tests now pass with LLVM ToT	2011-10-11 16:17:50 -07:00
Matt Pharr	49454bc207	Fix silly bug in 16-wide AOS-SOA 3-vector routine	2011-10-11 16:16:56 -07:00
Matt Pharr	286c23426e	Add "double-wide" sse2-x2 target. i.e. run 8 program instances together, along the lines of the double-pumped sse4-x2 target.	2011-10-11 15:17:31 -07:00
Matt Pharr	1198520029	Improve gather->vector load optimization to detect <linear sequence>-<uniform> case. Previously, we didn't handle subtraction ops when deciphering offsets in order to try to change gathers to vector loads.	2011-10-11 13:24:40 -07:00
Matt Pharr	06d70376ea	Fix to build with LLVM TOT after LLVM API change	2011-10-11 09:26:45 -07:00
Matt Pharr	7cd7ca82d6	Fix some crashes from malformed programs	2011-10-11 08:28:50 -07:00
Matt Pharr	ecda4561bd	Move some tests that now pass with LLVM 3.0 from failing_tests to tests/	2011-10-10 11:51:47 -07:00
Matt Pharr	a89e26d725	Improvements to mask management code; removes a number of unnecessary blends. We now maintain the distinction between the value of the mask passed into a function and the "internal" mask within the function that only accounts for varying control flow within the function. The full mask (the AND of the function mask and the internal mask) must be used for assignments to static and global variables, and reference function parameters. Further, it is the appropriate mask to use for making decisions about varying control flow. However, we can use the internal mask for assignments to variables declared in the current function (including the return value and non-reference parameters to the function). Doing so allows us to catch a few more cases where the internal mask is all on, even if the mask coming into the function wasn't all on, and thence use moves rather than blends for those assignments. (Which in turn can allow additional optimizations to happen.) Fixes issue #23.	2011-10-10 11:47:19 -07:00
Matt Pharr	3cb0115dce	Add routines to standard library to do efficient AOS/SOA conversions. Currently, we just support 3 and 4-wide variants (i.e. xyzxyz.. and xyzwxyzw..), for int32 and float types.	2011-10-10 10:56:06 -07:00
Matt Pharr	f5391747b9	Remove suggestions from parser that we support "char"	2011-10-10 10:54:06 -07:00
Matt Pharr	b8768ffdfa	Merge branch 'master' of github.com:ispc/ispc	2011-10-07 20:39:44 -07:00
Matt Pharr	6009608bc6	Mark inlined functions as having static linkage	2011-10-07 16:06:14 -07:00
Matt Pharr	ce7355f9ed	Windows: fix examples build to look for ispc.exe in ../.. as well	2011-10-09 07:40:18 -07:00
Matt Pharr	6b4459d402	Windows: fix some compiler warnings during build	2011-10-09 07:40:17 -07:00
Matt Pharr	790dba2558	Doxygen bump and release notes for v1.0.11 v1.0.11	2011-10-07 09:57:55 -07:00
Matt Pharr	4a2cbf2c4e	Fix regression from AST checkin that caused perf. warnings to be issued for stdlib code.	2011-10-07 09:20:48 -07:00
Matt Pharr	53dd65fa2e	Add ispc_test to buildall.bat script	2011-10-08 17:17:05 -07:00
Matt Pharr	f5afa52fd9	Add missing header	2011-10-06 17:10:30 -07:00
Matt Pharr	f9c67ff806	Explicit representation of ASTs for all the functions in a compile unit. Added AST and Function classes. Now, we parse the whole file and build up the AST for all of the functions in the Module before we emit IR for the functions (vs. before, when we generated IR along the way as we parsed the source file.)	2011-10-06 15:35:27 -07:00
Matt Pharr	ec5e627e56	Mark internal stdlib functions as "internal" linkage, not "private". This fixes print() statements on OSX. (http://llvm.org/bugs/show_bug.cgi?id=11080)	2011-10-06 13:32:20 -07:00
Matt Pharr	ff2a43ac19	Run the CFG simplification pass even when optimization is disabled. This fixes an issue with undefined SVML symbols with code that called transcendental functions in the standard library, even when the SVML math library hadn't been selected.	2011-10-06 09:20:50 -07:00
Matt Pharr	9feea32471	Fix errors in documentation for some of the reduce_* stdlib functions	2011-10-06 07:52:10 -07:00
Matt Pharr	bedaec2295	Update examples for multi-target compilation. Makefile and vcxproj file updates. Also modified vcxproj files so that the various files ispc generates go into $(TargetDir), not the current directory. Modified the ray tracer example to not have uniform short-vector types in its app-visible datatypes (these are laid out differently on SSE vs AVX); there was an existing lurking bug in the way this was done before.	2011-10-04 16:01:56 -07:00
Matt Pharr	a68d137df6	Documentation update for multi-target compilation.	2011-10-04 16:01:56 -07:00
Matt Pharr	59caa3d4e1	Various small Windows fixes. Also fixed some tabs/spaces and compiler warning issues.	2011-10-04 16:01:56 -07:00
Matt Pharr	06975bc7ab	Add support for compiling to multiple targets. If a flag along the lines of "--target=sse4,avx-x2" is provided on the command-line, then the program will be compiled for each of the given targets, with a separate output file generated for each one. Further, an output file with dispatch functions that check the current system's CPU and then choose the best available variant is also created. Issue #11.	2011-10-04 16:01:55 -07:00
Matt Pharr	880cbb18cc	Remove checks to see if system's processor matches the target the code was compiled for. (Preparation for multi-target output.)	2011-10-04 16:01:55 -07:00
Matt Pharr	686d9975b6	Add Symbol::exportedFunction member to hold llvm::Function * for app-callable version of function.	2011-10-04 15:56:54 -07:00
Matt Pharr	9b7f55a28e	Add buildall.bat script for Windows. Also various example build fixes for Windows	2011-10-04 11:42:04 -07:00
Matt Pharr	e4d224a0f1	Use __cilk to detect Cilk support	2011-10-04 11:16:42 -07:00
Matt Pharr	0933a77c1b	Improve task decomposition in ray tracing example. Specifically, launch all of the tasks in one statement, rather than still looping over spans in y and launching a collection of tasks across x for each span. This seems to give a few percent better performance.	2011-10-04 09:33:59 -07:00
Matt Pharr	5f78edf07a	Fix bug with screen decomposition in volume rendering example	2011-10-04 09:30:02 -07:00
Matt Pharr	a6fc657b40	Remove 'externGlobals' member from Module; instead find them when needed via new SymbolTable::GetMatchingVariables method.	2011-10-04 06:36:31 -07:00
Matt Pharr	fa5050d5c7	Error reporting improvements. Don't print more than 3 lines of source file context with errors. (Any more than that is almost certainly not the Right Thing to do.) Make some parsing error messages more clear.	2011-10-03 21:09:04 -07:00
Matt Pharr	d5a48d9a1e	Fix incorrect LLVM_3_0svn #ifdefs	2011-10-03 08:29:19 -07:00
Matt Pharr	2df9da2524	Be careful to not inadvertently match NULL functions in optimization passes.	2011-10-01 08:34:11 -07:00
Matt Pharr	0b02f94988	Task system performance tweaks. Switch back to GCD on OSX. Increase TaskInfo allocation count. This fixes the regression with deferred on AVX (from 17x to 25x again with 4 cores.)	2011-10-01 08:04:09 -07:00
Matt Pharr	65c50b60fc	Cleanups to deferred shading workload	2011-09-30 20:35:42 -07:00
Matt Pharr	9de34eb22c	Release notes and doxygen bump for v1.0.10	2011-09-30 19:42:14 -07:00
Matt Pharr	f8f25a11b6	Added deferred shading workload	2011-09-30 19:42:14 -07:00
Matt Pharr	cb7976bbf6	Added updated task launch implementation that now tracks task groups. Within each function that launches tasks, we now can easily track which tasks that function launched, so that the sync at the end of the function can just sync on the tasks launched by that function (not all tasks launched by all functions.) Implementing this led to a rework of the task system API that ispc generates code to call; the example task systems in examples/tasksys.cpp have been updated to conform to this API. (The updated API is also documented in the ispc user's guide.) As part of this, "launch[n]" syntax was added to launch a number of tasks in a single launch statement, rather than requiring a loop over 'n' to launch n tasks. This commit thus fixes issue #84 (enhancement to launch multiple tasks from a single launch statement) as well as issue #105 (recursive task launches were broken).	2011-09-30 11:20:53 -07:00
Matt Pharr	5ee4d7fce8	Add comment	2011-09-30 11:11:52 -07:00
Matt Pharr	8f3e46f67e	Use InterlockedExchangeAdd on Windows	2011-09-29 16:19:59 -07:00
Matt Pharr	9ed07ff2b5	Fix __num_cores() definition on Windows to not cause unresolved symbols	2011-09-29 13:35:50 -07:00
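
Commit 2460fa5c83 above mentions a loop, for (int i = programIndex; i < 16; i += programCount), whose index is "linear across the lanes". The sketch below is only a numeric illustration of that property in plain C++, not the compiler's optimization pass (which, per the commit message, works through phi nodes in the LLVM IR). The names kProgramCount, laneIndices, and isLinearAcrossLanes are hypothetical, and a width of 4 program instances is an assumption.

```cpp
#include <array>

// Hypothetical stand-in for ispc's programCount (assumed width: 4 instances).
constexpr int kProgramCount = 4;

// Returns true when the per-lane indices are base, base+1, base+2, ...
// That is the case in which a gather over these indices reads one
// contiguous block of memory and could be served by a single vector load.
bool isLinearAcrossLanes(const std::array<int, kProgramCount>& idx) {
    for (int lane = 1; lane < kProgramCount; ++lane) {
        if (idx[lane] != idx[0] + lane) return false;
    }
    return true;
}

// Per-lane values of `i` in the loop
//   for (int i = programIndex; i < 16; i += programCount)
// after `iteration` completed trips. Each lane starts at its own
// programIndex (modeled here as the lane number) and advances by
// programCount on every trip.
std::array<int, kProgramCount> laneIndices(int iteration) {
    std::array<int, kProgramCount> idx{};
    for (int lane = 0; lane < kProgramCount; ++lane) {
        idx[lane] = lane + iteration * kProgramCount;
    }
    return idx;
}
```

Under these assumptions, isLinearAcrossLanes(laneIndices(k)) holds for every trip k of the loop, which is why a[i] can be read with a vector load, while a strided pattern such as {0, 2, 4, 6} fails the check and would still need a gather.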

595 Commits