We now recognize patterns like (ptr + offset1 + offset2) as cases that
can be handled with the base_offsets variants of the gather/scatter
functions. (This comes up with multidimensional array indexing,
for example.)
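A hypothetical sketch of that case (grid and lookup are illustrative
names, not from the change itself):

    uniform float grid[16][16];

    // row and col are varying, so this indexing computes addresses of
    // the form (grid + row*64 + col*4), which now maps to the
    // base_offsets variants of the gather functions.
    float lookup(int row, int col) {
        return grid[row][col];
    }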
Issue #150.
Really, we only have to be careful in the case where a vector of bools
(i.e. a mask) is involved, since the size of that isn't known at compile
time. (Currently, at least.)
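For instance (a hypothetical sketch, assuming a struct with a varying
bool member):

    struct S {
        bool b;     // a varying S stores a vector of bools here
        float f;
    };

    float get(int i) {
        S s[8];
        // ... (fill s) ...
        // the element stride and the byte offset of .f both depend on
        // how the bool vector is laid out, so they can't be baked in
        // at this point in compilation
        return s[i].f;
    }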
As part of this, function declarations are no longer scoped (as it turns
out, this is permitted by the C standard). So code like:
void foo() { void bar(); }
void bat() { bar(); }
compiles correctly; the declaration of bar() inside foo() is still
available in the definition of bat().
Fixes issue #129.
Now, when a type is declared without an explicit "uniform" or "varying"
qualifier, its variability is unbound; the variability is finalized
later, depending on the context of the declaration.
Currently, in almost all cases, types with unbound variability are
resolved to varying types; the one exception is typecasts like "(int)1".
In this case, the fact that (int) has unbound variability carries
through to the TypeCastExpr, which notices that the expression being
cast has uniform type and resolves (int) to (uniform int).
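A sketch of the distinction (illustrative declarations):

    uniform int x = (int)1;   // "(int)" starts out with unbound
                              // variability; since the literal 1 is
                              // uniform, it resolves to (uniform int)
                              // and the initialization is legal
    int y = 1;                // an unqualified declaration like this
                              // still resolves to varying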
Fixes issue #127.
When flattening chains of insertelement instructions, we didn't
handle the case where the initial insertelement was into a constant
vector (with one value set and the others undef).
Also generalized the "do all of the instances access the same location"
check to handle the case where some of them are accessing undef
locations; these are ignored in this check, as they should correspond to the
mask being off for that lane anyway.
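A contrived sketch of the kind of case involved (illustrative; it relies
on LLVM, not the front end, folding the index to the same constant in
every lane):

    uniform float a[64];

    float f() {
        int j = programIndex * 0;   // LLVM folds this to 0 in all lanes
        // the gather's offset vector ends up constant, with every lane
        // accessing the same location, so it can become a scalar load
        // plus a broadcast
        return a[j];
    }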
Fixes issue #149.
ispc now supports goto, but only under uniform control flow--i.e.
it must be possible for the compiler to statically determine that
all program instances will follow the goto. An error is issued at
compile time if a goto is used when this is not the case.
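For example (illustrative code):

    void count(uniform int n) {
        uniform int i = 0;
    loop:
        if (i >= n)       // the test is uniform, so every program
            goto done;    // instance takes the branch together
        ++i;
        goto loop;
    done:
        return;
    }

    // By contrast, "if (x > 0) goto done;" with a varying x would be
    // rejected at compile time, since instances could diverge.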
Disconcertingly, this seems to fix some gcc-only crashes with the
generic-16 target (specifically, for half.ispc and for goto-[23].ispc--
those tests run fine with other compilers with generic-16.)
Add support for the generic targets (using the headers in
examples/intrinsics if none is otherwise provided).
Provide option to run valgrind on the compiled code.
Print a list of all failing tests at the end.
The examples' makefiles are all based on a common examples/common.mk
file, so the individual makefiles are now quite simple.
The common.mk file also provides targets to build the examples using C++
output with the generic-16.h or sse4.h files. These targets aren't built
by default, but are built by 'make all'.
The compiler now supports an --emit-c++ option, which generates generic
vector C++ code. To actually compile this code, the user must provide
C++ code that implements a variety of types and operations (e.g. adding
two floating-point vector values together, comparing them, etc.).
There are two examples of this required code in examples/intrinsics:
generic-16.h is a "generic" 16-wide implementation that does everything
required with scalar math; it's useful for demonstrating the
requirements of the implementation. Then, sse4.h shows a simple
implementation of an SSE4 target that maps the emitted function calls
to SSE intrinsics.
When using these example implementations with the ispc test suite,
all but one or two tests pass with gcc and clang on Linux and OSX.
There are currently ~10 failures with icc on Linux, and ~50 failures with
MSVC 2010. (To be fixed in coming days.)
Performance varies: when running the examples through the sse4.h
target, some (e.g. options) have the same performance as when compiled
with --target=sse4 from ispc directly, while noise is 12% slower, rt is
26% slower, and aobench is 2.2x slower. The details of this haven't yet
been carefully investigated, but will be in coming days as well.
Issue #92.
Specifically, we want to be able to late-bind on whether the mask is
i32s or i1s, so if there's any chance of ambiguity, we emit code that
uses the "GEP from a NULL base pointer" trick to compute the value later
in compilation.
Stop using the PassManagerBuilder and instead add all of the passes
directly in code here. This currently leads to no change in behavior,
but was useful when experimenting with disabling the SROA pass when
compiling to generic targets.
This pass handles the "all on" and "all off" mask cases appropriately.
Also renamed load_masked in the built-ins to masked_load, for
consistency with masked_store.
Now we require that the struct name match for two struct types to be the same.
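For example (illustrative types):

    struct Foo { float x; };
    struct Bar { float x; };   // same members, different name

    void take(Foo f);

    void g(Bar b) {
        take(b);   // now a type error: Foo and Bar no longer match,
                   // even though their layouts are identical
    }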
Added a test to check this.
(Also removed a stale test, movmsk-opt.ispc)
When used, these targets end up with calls to undefined functions for
all of the various special vector operations ispc needs to compile ispc
programs (masked store, gather, min/max, sqrt, etc.).
These targets are not yet useful for anything, but are a step toward
having an option to emit C++ code with calls out to intrinsics.
Reorganized the directory structure a bit and put the LLVM bitcode used
to define target-specific functionality (as well as some generic
built-ins) into a builtins/ directory.
Note that for building on Windows, it's now necessary to set an
LLVM_VERSION environment variable (with values like LLVM_2_9, LLVM_3_0,
LLVM_3_1svn, etc.).
When loading from an address that's computed by adding two registers
together, x86 can scale one of them by 2, 4, or 8, for free as part
of the addressing calculation. This change makes the code generated
for gather and scatter use this.
For the cases where gather/scatter is based on a base pointer and
an integer offset vector, the GSImprovementsPass looks to see if the
integer offsets are being computed as 2/4/8 times some other value.
If so, it extracts the 2x/4x/8x part and leaves the rest as the
offsets. The {gather,scatter}_base_offsets_* functions take an i32
scale factor, which is passed to them, and then they carefully
generate IR so that it hits LLVM's pattern matching for these scales.
This is a particular win on AVX, since it saves us two 4-wide integer
multiplies.
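As an illustrative sketch, a gather of 32-bit elements indexed by a
varying value hits this path:

    uniform float a[1024];

    float fetch(int index) {
        // the byte offsets are 4 * index; the pass extracts the 4x
        // scale so it folds into the x86 addressing mode instead of
        // being a vector multiply in the offset computation
        return a[index];
    }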
Noise runs 14% faster with this.
Issue #132.
Specifically, stmts and exprs are no longer responsible for first
recursively optimizing their children before doing their own
optimization (this turned out to be error-prone, with children sometimes
being forgotten). They are now responsible just for their own
optimization, when appropriate.
In general, the callback should just return the original node pointer,
but for the type checking and optimization passes, it can return a new
value for the node (which will be assigned where the old one was in the
tree).
Along the way, fixed some bugs in WalkAST() where the postorder callback wouldn't
end up being called for a few expr types (sizeof, dereference, address of,
reference).
For starters, use it for the check to see if code is safe to run with the
mask all off.
This also fixes a bug where we would sometimes incorrectly say that
a whole block of code was unsafe to run with an all off mask because we came
to a NULL AST node during traversal.
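As an illustrative sketch of what that check means (log_value is a
hypothetical external function):

    void log_value(int x);   // hypothetical; might have side effects

    // Pure arithmetic is safe to execute even when every lane's mask
    // bit is off, so no "any lanes active?" guard is needed around it:
    int scale(int x) {
        return x * 2 + 1;
    }

    // A function call is conservatively treated as unsafe, so code
    // containing it still gets the guard:
    void report(int x) {
        log_value(x);
    }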