aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Matt Pharr	068ea3e4c4	Better SourcePos reporting for gathers/scatters	2011-11-21 10:26:53 -08:00
Matt Pharr	f8eb100c60	Use llvm TargetData to find object sizes, offsets. Previously, to compute the size of objects and the offsets of struct elements within structs, we were using the trick of using getelementptr with a NULL base pointer and then casting the result to an int32/64. However, since we actually know the target we're compiling for at compile time, we can use corresponding methods from TargetData to get these values directly. This mostly cleans up code, but may make some of the gather/scatter lowering to loads/stores optimizations work better in the presence of structures.	2011-11-06 19:31:19 -08:00
Matt Pharr	b0d476fcdc	Stop zero-initializing memory used to store return values. This seems to have a noticeable (small) performance benefit on a few of the example workloads.	2011-11-05 09:49:44 -07:00
Matt Pharr	afcd42028f	Add support for function pointers. Both uniform and varying function pointers are supported; when a function is called through a varying function pointer, each unique function pointer value across the running program instances is called once for the set of active program instances that want to call it.	2011-11-03 16:14:14 -07:00
Matt Pharr	d528533fba	Add FunctionEmitContext::SmearScalar() method (and use it).	2011-11-03 16:14:14 -07:00
Matt Pharr	43a2d510bf	Incorporate per-lane offsets for varying data in the front-end. Previously, it was only in the GatherScatterFlattenOpt optimization pass that we added the per-lane offsets when we were indexing into varying data. (Specifically, the case of float foo[]; int index; foo[index], where foo is an array of varying elements rather than uniform elements.) Now, this is done in the front-end as we're first emitting code. In addition to the basic ugliness of doing this in an optimization pass, it was also error-prone to do it there, since we no longer have access to all of the type information that's around in the front-end. No functionality or performance change.	2011-11-03 13:15:07 -07:00
Matt Pharr	e009c0a61d	Be able to determine if two types can be converted without requiring an Expr *. The Expr::TypeConv() method has been replaced with both a CanConvertTypes() routine that indicates whether one type can be converted to another and a TypeConvertExpr() routine that provides the same functionality as Expr::TypeConv() used to.	2011-10-30 14:12:12 -07:00
Matt Pharr	074cbc2716	Fix #ifdefs to catch LLVM 3.1svn now as well	2011-10-19 14:01:19 -07:00
Matt Pharr	290032f4f5	Be more careful about using the right mask when emitting gathers. Specifically, we had been using the full mask for all gathers, rather than using the internal mask when we were loading from locally-declared arrays. Thus, given code like: uniform float x[programCount] = { ... }; float xx = x[programIndex]; Previously we weren't generating a plain vector load to initialize xx, when this code was in a function where it wasn't known that the mask was all on, even though it should have. Now it does.	2011-10-17 20:25:52 -04:00
Matt Pharr	a89e26d725	Improvements to mask management code; removes a number of unnecessary blends. We now maintain the distinction between the value of the mask passed into a function and the "internal" mask within the function that only accounts for varying control flow within the function. The full mask (the AND of the function mask and the internal mask) must be used for assignments to static and global variables, and reference function parameters. Further, it is the appropriate mask to use for making decisions about varying control flow. However, we can use the internal mask for assignments to variables declared in the current function (including the return value and non-reference parameters to the function). Doing so allows us to catch a few more cases where the internal mask is all on, even if the mask coming into the function wasn't all on, and thence use moves rather than blends for those assignments. (Which in turn can allow additional optimizations to happen.) Fixes issue #23.	2011-10-10 11:47:19 -07:00
Matt Pharr	f9c67ff806	Explicit representation of ASTs for all the functions in a compile unit. Added AST and Function classes. Now, we parse the whole file and build up the AST for all of the functions in the Module before we emit IR for the functions (vs. before, when we generated IR along the way as we parsed the source file.)	2011-10-06 15:35:27 -07:00
Matt Pharr	a6fc657b40	Remove 'externGlobals' member from Module; instead find them when needed via new SymbolTable::GetMatchingVariables method.	2011-10-04 06:36:31 -07:00
Matt Pharr	cb7976bbf6	Added updated task launch implementation that now tracks task groups. Within each function that launches tasks, we now can easily track which tasks that function launched, so that the sync at the end of the function can just sync on the tasks launched by that function (not all tasks launched by all functions.) Implementing this led to a rework of the task system API that ispc generates code to call; the example task systems in examples/tasksys.cpp have been updated to conform to this API. (The updated API is also documented in the ispc user's guide.) As part of this, "launch[n]" syntax was added to launch a number of tasks in a single launch statement, rather than requiring a loop over 'n' to launch n tasks. This commit thus fixes issue #84 (enhancement to launch multiple tasks from a single launch statement) as well as issue #105 (recursive task launches were broken).	2011-09-30 11:20:53 -07:00
Matt Pharr	2405dae8e6	Use malloc() to get space for task arguments when compiling to AVX. This is to work around the LLVM bug/limitation discussed in LLVM bug 10841 (http://llvm.org/bugs/show_bug.cgi?id=10841).	2011-09-17 13:38:51 -07:00
Matt Pharr	3607f3e045	Remove support for building with LLVM 2.8. Fixes issue #66. Both 2.9 and top-of-tree generate substantially better code than LLVM 2.8 did, so it's not worth fixing the 2.8 build.	2011-09-17 13:18:59 -07:00
Matt Pharr	1dedd88132	Improve implementation of 'are both masks equal' check for AVX. Previously, we did a vector equal compare and then a movmsk, the result of which we checked to see if it was on for all lanes. Because masks are vectors of i32s, under AVX, the vector equal compare required two 4-wide SSE compares and some shuffling. Now, we do a movmsk of both masks first and then a scalar equality comparison of those two values, which seems to generate overall better code.	2011-09-15 06:25:02 -07:00
Matt Pharr	922dbdec06	Fixes to build with LLVM top-of-tree	2011-07-26 10:57:49 +01:00
Matt Pharr	bba7211654	Add support for int8/int16 types. Addresses issues #9 and #42.	2011-07-21 06:57:40 +01:00
Matt Pharr	654cfb4b4b	Many fixes for recent LLVM dev tree API changes	2011-07-18 15:54:39 +01:00
Matt Pharr	f0f876c3ec	Add support for enums.	2011-07-17 16:43:05 +02:00
Matt Pharr	6e4c165c7e	Use malloc to allocate storage for task parameters on Windows. Fixes bug #55. A number of tests were crashing on Windows due to the task launch code using alloca to allocate space for the tasks' parameters. On Windows, the stack isn't generally big enough for this to be a good idea. Also added an alignment parameter to ISPCMalloc() to pass the alignment requirement along.	2011-07-06 05:53:25 -07:00
Matt Pharr	c14c3ceba6	Provide both signed and unsigned int variants of bitcode-based builtins. When creating function Symbols for functions that were defined in LLVM bitcode for the standard library, if any of the function parameters are integer types, create two ispc-side Symbols: one where the integer types are all signed and the other where they are all unsigned. This allows us to provide, for example, both store_to_int16(reference int a[], uniform int offset, int val) as well as store_to_int16(reference unsigned int a[], uniform int offset, unsigned int val) functions. Added some additional tests to exercise the new variants of these. Also fixed some cases where the __{load,store}_int{8,16} builtins would read from/write to memory even if the mask was all off (which could cause crashes in some cases.)	2011-07-04 12:10:26 +01:00
Matt Pharr	eb22fa6173	Generalize FunctionEmitContext::PtrToIntInst and IntToPtrInst to do the right thing if given a varying lvalue (i.e. an array of pointers). Fixes issue #34.	2011-06-29 12:38:12 +01:00
Matt Pharr	6b153566f3	Simplify a bunch of code by using CollectionType to collect struct codepaths in with array/vector codepaths. (Issue #37).	2011-06-29 07:59:43 +01:00
Matt Pharr	214fb3197a	Initial plumbing to add CollectionType base-class as common ancestor to StructTypes, ArrayTypes, and VectorTypes. Issue #37.	2011-06-29 07:42:09 +01:00
Matt Pharr	ce7978ae74	Align stack-allocated arrays of uniform types to the target vector alignment (they will often be accessed in programCount-sized chunks and this should make that a bit more efficient in the common case). Fixes issue #15	2011-06-28 20:42:18 -07:00
Matt Pharr	865e430b56	Finished updating alignment issues for vector types; don't assume pointers are aligned to the natural vector width.	2011-06-23 18:51:15 -07:00
Matt Pharr	b84167dddd	Fixed a number of issues related to memory alignment; a number of places were expecting vector-width-aligned pointers where in point of fact, there's no guarantee that they would have been in general. Removed the aligned memory allocation routines from some of the examples; they're no longer needed. No perf. difference on Core2/Core i5 CPUs; older CPUs may see some regressions. Still need to update the documentation for this change and finish reviewing alignment issues in Load/Store instructions generated by .cpp files.	2011-06-23 18:18:33 -07:00
Pete Couperus	af435e52c1	Minor mods to build on Fedora 15, LLVM 2.8	2011-06-21 22:57:36 -07:00
Matt Pharr	18af5226ba	Initial commit.	2011-06-21 12:48:50 -07:00
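
Commit afcd42028f says that a call through a varying function pointer invokes each unique pointer value once, for the set of active program instances that want it. The sketch below is a scalar C++ emulation of that idea, not the code ispc generates. The names `kLanes`, `LaneMask`, `Gang`, `GangFn`, `add_one`, `times_two`, and `call_varying` are all hypothetical, and an 8-wide gang is assumed.

```cpp
#include <array>
#include <cstdint>

constexpr int kLanes = 8;                  // assumed gang width (hypothetical)
using LaneMask = std::uint32_t;            // bit i set => lane i is active
using Gang = std::array<int, kLanes>;      // one int per program instance
using GangFn = void (*)(LaneMask active, Gang& data);

// Two illustrative callees; each only touches the lanes in its mask.
void add_one(LaneMask active, Gang& data) {
    for (int lane = 0; lane < kLanes; ++lane)
        if ((active >> lane) & 1u) data[lane] += 1;
}

void times_two(LaneMask active, Gang& data) {
    for (int lane = 0; lane < kLanes; ++lane)
        if ((active >> lane) & 1u) data[lane] *= 2;
}

// Calls each distinct pointer among the active lanes exactly once, passing
// the sub-mask of lanes that share it. Returns the number of calls made.
int call_varying(const std::array<GangFn, kLanes>& targets,
                 LaneMask active, Gang& data) {
    int calls = 0;
    LaneMask remaining = active & ((1u << kLanes) - 1u);
    while (remaining != 0) {
        // Find the first lane that still needs its call.
        int first = 0;
        while (((remaining >> first) & 1u) == 0) ++first;
        GangFn target = targets[first];

        // Collect every remaining lane that wants the same function.
        LaneMask group = 0;
        for (int lane = first; lane < kLanes; ++lane)
            if (((remaining >> lane) & 1u) && targets[lane] == target)
                group |= (1u << lane);

        target(group, data);
        ++calls;
        remaining &= ~group;               // those lanes are done
    }
    return calls;
}
```

With four lanes pointing at `add_one` and four at `times_two`, the loop makes two calls regardless of how many of those lanes are active, as long as at least one lane from each group is.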

30 Commits
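
Commit 1dedd88132 describes two ways to test whether two masks are equal: compare lanes first and then movmsk, or movmsk both masks and compare the scalars. The sketch below emulates both in portable C++ without intrinsics. It assumes masks are 8 lanes of 32-bit integers, each lane either all ones or all zeros; `kWidth`, `Mask`, `movmsk`, `masks_equal_compare_first`, and `masks_equal_movmsk_first` are hypothetical names, not ispc's.

```cpp
#include <array>
#include <cstdint>

constexpr int kWidth = 8;                          // assumed AVX gang width
using Mask = std::array<std::int32_t, kWidth>;     // lane is -1 (on) or 0 (off)

// Emulates a movmsk: packs the sign bit of each lane into one scalar.
std::uint32_t movmsk(const Mask& m) {
    std::uint32_t bits = 0;
    for (int i = 0; i < kWidth; ++i)
        if (m[i] < 0) bits |= (1u << i);
    return bits;
}

// Older approach: lane-wise equality compare, then movmsk, then check
// that the result is on for every lane.
bool masks_equal_compare_first(const Mask& a, const Mask& b) {
    Mask eq{};
    for (int i = 0; i < kWidth; ++i)
        eq[i] = (a[i] == b[i]) ? -1 : 0;
    return movmsk(eq) == ((1u << kWidth) - 1u);
}

// Newer approach: movmsk each mask, then one scalar comparison.
// Equivalent to the version above only under the all-ones/all-zeros
// lane assumption stated in the lead-in.
bool masks_equal_movmsk_first(const Mask& a, const Mask& b) {
    return movmsk(a) == movmsk(b);
}
```

The second form avoids the vector compare entirely, which the commit message reports as needing two 4-wide SSE compares plus shuffling under AVX.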