Previously, when we had a switch statement with a uniform switch condition
but a 'break' statement that was under varying control flow inside the
switch, we'd promote the switch condition to be varying so that the
break would work correctly.
Now, we leave the condition as uniform and are thus able to use the
more-efficient LLVM switch instruction in this case.
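For illustration, a minimal sketch of the case in question (names are
hypothetical):

    export void f(uniform int mode, uniform float vals[]) {
        switch (mode) {                  // uniform condition: can now use
        case 0:                          // LLVM's switch instruction
            if (vals[programIndex] < 0)  // varying condition, so this
                break;                   // break is under varying control flow
            vals[programIndex] = 0;
            break;
        default:
            vals[programIndex] = 1;
        }
    }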
Issue #156.
Specialize the code for the innermost loop to not do any masking
computations for the innermost dimension for the iterations where
we are certainly working on a full vector's worth of data.
This fix improves performance/code quality of "foreach" such that
it's essentially the same as the equivalent "for" loop.
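A sketch of a loop that benefits (array name hypothetical; 1024 is a
multiple of programCount for all current targets):

    void scale(uniform float a[]) {
        foreach (i = 0 ... 1024) {
            // every iteration works on a full vector's worth of data,
            // so no mask computations are needed in the innermost
            // dimension
            a[i] *= 2;
        }
    }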
Fixes issue #151.
Switches with both uniform and varying "switch" expressions are
supported. Switch statements with varying expressions and very
large numbers of labels may not perform well; some issues to be
filed shortly will track opportunities for improving these.
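For example, a varying "switch" expression is now legal (sketch):

    float f(float x, int code) {
        switch (code) {   // varying expression: program instances may
        case 0:           // take different cases, handled via masking
            x = 0;
            break;
        default:
            x = -x;
            break;
        }
        return x;
    }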
ispc now supports goto, but only under uniform control flow--i.e.
it must be possible for the compiler to statically determine that
all program instances will follow the goto. An error is issued at
compile time if a goto is used when this is not the case.
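A sketch of a legal use (names hypothetical):

    void count(uniform int n) {
        uniform int i = 0;
    again:
        ++i;
        if (i < n)        // uniform condition: the compiler can prove that
            goto again;   // all program instances branch together
    }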
Specifically, we want to be able to late-bind on whether the mask is
i32s or i1s, so if there's any chance of ambiguity, we emit code that
does the "GEP from a NULL base pointer" trick to compute the value later
in compilation.
When used, these targets end up with calls to undefined functions for all
of the various special vector stuff ispc needs to compile ispc programs
(masked store, gather, min/max, sqrt, etc.).
These targets are not yet useful for anything, but are a step toward
having an option to emit C++ code with calls out to intrinsics.
Reorganized the directory structure a bit and put the LLVM bitcode used
to define target-specific stuff (as well as some generic built-ins stuff)
into a builtins/ directory.
Note that for building on Windows, it's now necessary to set an LLVM_VERSION
environment variable (with a value like LLVM_2_9, LLVM_3_0, LLVM_3_1svn, etc.).
Previously, we used an IfStmt to wrap complex functions with the equivalent
of a "cif" to check to see if the mask was all on, all off, or mixed at the
start of executing non-trivial functions. This had the unintended side
effect of suggesting to other parts of the compiler that the entire function
was under varying control flow (which in turn led to some small code
quality issues.)
Now, we emit the equivalent code directly.
The conceptual error was the assumption that not being under varying
control flow implied that the mask was all on; this is not the case
if some of the instances have executed a return earlier in the function's
execution. The error in practice would be that the mask would be
assumed to be all-on for things like memory writes, so there would
be unintended side-effects for the instances that had returned.
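A sketch of the problematic pattern (names hypothetical):

    static uniform float g[programCount];

    float f(float x) {
        if (x < 0)
            return 0;        // some instances may return here...
        g[programIndex] = x; // ...so although this write isn't under
                             // varying control flow, the mask may not be
                             // all on; it must not store for instances
                             // that already returned
        return x;
    }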
Previously, they all went into one big pile that was never cleaned up;
this was the wrong thing to do in a world where one might have a
function declaration inside another function, say.
These make it easier to iterate over arbitrary numbers of data
elements; specifically, they automatically handle the "ragged
extra bits" that come up when the number of elements to be
processed isn't evenly divided by programCount.
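A minimal sketch, assuming these refer to the "foreach" constructs
described elsewhere in these notes:

    void inc(uniform float a[], uniform int count) {
        // count need not be a multiple of programCount; the "ragged"
        // final iteration is automatically masked
        foreach (i = 0 ... count) {
            a[i] += 1;
        }
    }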
TODO: documentation
The launch group handle is now reset to NULL after sync is called;
this ensures that if tasks are launched in the same function after
a sync, the ISPCAlloc() call for the next launch will be
passed a NULL handle (as it should be).
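A sketch of the case this addresses (task names hypothetical):

    task void t0() { }
    task void t1() { }

    void f() {
        launch t0();
        sync;          // launch group handle is reset to NULL here
        launch t1();   // so the ISPCAlloc() call for this launch is
        sync;          // passed a NULL handle, as it should be
    }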
Allow <, <=, >, >= comparisons of pointers
Allow explicit type-casting of pointers to and from integers
Fix a bug in handling expressions of the form "int + ptr" ("ptr + int"
was fine).
Fix a bug in TypeCastExpr where varying -> uniform typecasts
would be allowed (leading to a crash later)
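For example, all of the following are now handled (sketch):

    void f() {
        uniform float a[16];
        uniform float * uniform p1 = a + 4;      // "ptr + int": worked before
        uniform float * uniform p2 = 4 + a;      // "int + ptr": now fixed
        uniform bool ordered = (p1 <= p2);       // ordered comparisons
        uniform int64 bits = (uniform int64)p1;  // explicit ptr <-> int cast
    }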
Pointers can be either uniform or varying, and behave correspondingly.
e.g.: "uniform float * varying" is a varying pointer to uniform float
data in memory, and "float * uniform" is a uniform pointer to varying
data in memory. Like other types, pointers are varying by default.
Pointer-based expressions, & and *, sizeof, ->, pointer arithmetic,
and the array/pointer duality all behave as in C. Array arguments
to functions are converted to pointers, also like C.
There is a built-in NULL for a null pointer value; conversion from
compile-time constant 0 values to NULL still needs to be implemented.
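A sketch of the new pointer types:

    void f() {
        uniform float u[8];
        float v[8];
        uniform float * uniform pu = &u[0];  // uniform pointer to uniform data
        float * uniform pv = &v[0];          // uniform pointer to varying data
        uniform float * varying pl = &u[programIndex];  // varying pointer:
                                             // per-instance addresses
        *pl = 0;                             // & and * behave as in C
    }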
Other changes:
- Syntax for references has been updated to be C++ style; a useful
warning is now issued if the "reference" keyword is used.
- It is now illegal to pass a varying lvalue as a reference parameter
to a function; references are essentially uniform pointers.
This case had previously been handled via special-case call-by-value-return
code. That path has been removed, now that varying pointers
are available to handle this use case (and much more).
- Some stdlib routines have been updated to take pointers as
arguments where appropriate (e.g. prefetch and the atomics).
A number of others still need attention.
- All of the examples have been updated
- Many new tests
TODO: documentation
Previously, to compute the size of objects and the offsets of struct
elements within structs, we were using the trick of using getelementptr
with a NULL base pointer and then casting the result to an int32/64.
However, since we actually know the target we're compiling for at
compile time, we can use corresponding methods from TargetData to
get these values directly.
This mostly cleans up code, but may make some of the gather/scatter
lowering to loads/stores optimizations work better in the presence
of structures.
Both uniform and varying function pointers are supported; when a function
is called through a varying function pointer, each unique function pointer
value across the running program instances is called once for the set of
active program instances that want to call it.
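A sketch (function names hypothetical):

    static float dbl(float x) { return 2 * x; }
    static float sqr(float x) { return x * x; }

    float apply(float x) {
        float (*fp)(float);   // pointers are varying by default
        if ((programIndex & 1) != 0)
            fp = dbl;
        else
            fp = sqr;
        return fp(x);  // one call per unique pointer value, with the
                       // subset of active instances holding that value
    }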
Previously, it was only in the GatherScatterFlattenOpt optimization pass that
we added the per-lane offsets when we were indexing into varying data.
(Specifically, the case of float foo[]; int index; foo[index], where foo
is an array of varying elements rather than uniform elements.) Now, this
is done in the front-end as we're first emitting code.
In addition to the basic ugliness of doing this in an optimization pass,
it was also error-prone to do it there, since we no longer have access
to all of the type information that's around in the front-end.
No functionality or performance change.
The Expr::TypeConv() method has been replaced with both a
CanConvertTypes() routine that indicates whether one type
can be converted to another and a TypeConvertExpr()
routine that provides the same functionality as
Expr::TypeConv() used to.
Specifically, we had been using the full mask for all gathers, rather than
using the internal mask when we were loading from locally-declared arrays.
Thus, given code like:
uniform float x[programCount] = { ... };
float xx = x[programIndex];
Previously, when this code was in a function where the mask wasn't known
to be all on, we weren't generating a plain vector load to initialize xx,
even though we should have been. Now we do.
We now maintain the distinction between the value of the mask passed into a
function and the "internal" mask within the function that only accounts for
varying control flow within the function.
The full mask (the AND of the function mask and the internal mask) must be used
for assignments to static and global variables, and reference function parameters.
Further, it is the appropriate mask to use for making decisions about varying
control flow. However, we can use the internal mask for assignments to variables
declared in the current function (including the return value and non-reference
parameters to the function). Doing so allows us to catch a few more cases where
the internal mask is all on, even if the mask coming into the function wasn't all
on, and thence use moves rather than blends for those assignments. (Which in
turn can allow additional optimizations to happen.)
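A sketch of the distinction (names hypothetical):

    static float g;   // file-scope variable

    float f(float x) {
        float local = 2 * x;  // local: the internal mask suffices, so
                              // this can be a move rather than a blend
                              // even if f() was called with a mixed mask
        g = local;            // global: the full mask (function mask
                              // ANDed with internal mask) must be used
        return local;
    }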
Fixes issue #23.
Added AST and Function classes.
Now, we parse the whole file and build up the AST for all of the
functions in the Module before we emit IR for the functions (vs. before,
when we generated IR along the way as we parsed the source file.)
Within each function that launches tasks, we now can easily track which
tasks that function launched, so that the sync at the end of the function
can just sync on the tasks launched by that function (not all tasks
launched by all functions.)
Implementing this led to a rework of the task system API that ispc generates
code to call; the example task systems in examples/tasksys.cpp have been
updated to conform to this API. (The updated API is also documented in
the ispc user's guide.)
As part of this, "launch[n]" syntax was added to launch a number of tasks
in a single launch statement, rather than requiring a loop over 'n' to
launch n tasks.
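A sketch of the new syntax (assuming taskIndex/taskCount are available
inside the task):

    task void work(uniform float a[]) {
        a[taskIndex] = taskIndex;   // taskIndex is in [0, taskCount)
    }

    void f(uniform float a[], uniform int n) {
        launch[n] work(a);  // launch n tasks with a single statement
        sync;               // waits only on tasks launched by f()
    }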
This commit thus fixes issue #84 (enhancement to launch multiple tasks from
a single launch statement) as well as issue #105 (recursive task launches
were broken).
Previously, we did a vector equal compare and then a movmsk, the
result of which we checked to see if it was on for all lanes.
Because masks are vectors of i32s, under AVX, the vector equal
compare required two 4-wide SSE compares and some shuffling.
Now, we do a movmsk of both masks first and then a scalar
equality comparison of those two values, which seems to generate
overall better code.
Fixes bug #55. A number of tests were crashing on Windows due to the task
launch code using alloca to allocate space for the tasks' parameters. On
Windows, the stack isn't generally big enough for this to be a good idea.
Also added an alignment parameter to ISPCMalloc() to pass the alignment
requirement along.
When creating function Symbols for functions that were defined in LLVM
bitcode for the standard library, if any of the function parameters are
integer types, create two ispc-side Symbols: one where the integer types
are all signed and the other where they are all unsigned. This allows us
to provide, for example, both
store_to_int16(reference int a[], uniform int offset, int val)
as well as
store_to_int16(reference unsigned int a[], uniform int offset, unsigned int val).
Added some additional tests to exercise the new variants of these.
Also fixed some cases where the __{load,store}_int{8,16} builtins would
read from/write to memory even if the mask was all off (which could
cause crashes in some cases.)