The ? : operator now short-circuits evaluation of the expressions following
the boolean test when the test is varying. (It already did this for
uniform tests.)
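For illustration, a minimal standalone C++ sketch of the idea (WIDTH,
eval_true, and eval_false are invented for the example; this is only an
emulation of the per-lane behavior, not the generated code): each side of a
varying (test ? a : b) is evaluated only for the lanes that select it, and
skipped entirely when no lane needs it.

    #include <cstdio>

    #define WIDTH 4

    static int eval_true(int lane)  { std::printf("true arm, lane %d\n", lane);  return 1; }
    static int eval_false(int lane) { std::printf("false arm, lane %d\n", lane); return 0; }

    static void varying_select(const bool test[WIDTH], int out[WIDTH]) {
        bool anyTrue = false, anyFalse = false;
        for (int i = 0; i < WIDTH; ++i) {
            if (test[i]) anyTrue = true; else anyFalse = true;
        }
        if (anyTrue)                        // true arm skipped entirely if no lane needs it
            for (int i = 0; i < WIDTH; ++i)
                if (test[i]) out[i] = eval_true(i);
        if (anyFalse)                       // false arm skipped entirely if no lane needs it
            for (int i = 0; i < WIDTH; ++i)
                if (!test[i]) out[i] = eval_false(i);
    }

    int main() {
        bool test[WIDTH] = {true, true, true, true};
        int out[WIDTH];
        varying_select(test, out);          // "false arm" is never printed
        return 0;
    }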
Issue #169.
This gets the deferred example closer to working with the scalar target, but
there are still some issues (partially in gamma correction / final clamping,
it seems).
This fix causes a ~0.5% performance degradation with e.g. the AVX target,
but it's not clear that it's worth maintaining a separate code path just to
avoid losing that small amount of performance.
(Partially addresses issue #167)
Previously, we'd pick one lane and generate a regular store for its value.
This was the wrong thing to do: we should also have been checking
that the mask was on for the lane that was chosen. This bug didn't
become evident until the scalar target was added, since many stores fall
into this case with that target.
Now, we just leave those as regular scatters.
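As a standalone illustration of why the old lowering was wrong (plain C++
with invented names; WIDTH and the bitmask layout are assumptions, not the
actual code generation), a store to the common location has to be predicated
on the per-lane mask rather than issued unconditionally for one lane:

    #include <cstdint>
    #include <cstdio>

    #define WIDTH 4

    // All lanes scatter to the same address; only lanes whose mask bit is set
    // may actually write.
    static void scatter_same_location(int *addr, const int values[WIDTH],
                                      std::uint32_t mask) {
        // Old, buggy idea: *addr = values[0]; -- wrong when lane 0 is masked off.
        for (int lane = 0; lane < WIDTH; ++lane)
            if (mask & (1u << lane))
                *addr = values[lane];
    }

    int main() {
        int x = 7;
        int values[WIDTH] = {1, 2, 3, 4};
        scatter_same_location(&x, values, 0x0);   // no lanes active: x stays 7
        std::printf("%d\n", x);
        return 0;
    }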
Fixes most of the failing tests for the scalar target listed in issue #167.
Also reworked TypeCastExpr::GetConstant() to just forward the request along;
the code that previously lived there to handle uniform->varying smears of
function pointers has moved to FunctionSymbolExpr::GetConstant().
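For illustration only, a self-contained C++ mock of the delegation (Constant,
Expr, and the simplified GetConstant() signature are stand-ins, not the
actual ispc AST interfaces):

    struct Constant { int value; };

    struct Expr {
        virtual ~Expr() {}
        virtual const Constant *GetConstant() const { return nullptr; }
    };

    // Operand-specific handling (e.g. the uniform->varying smear of a function
    // pointer) lives in the operand's own class...
    struct FunctionSymbolExpr : Expr {
        Constant c{42};
        const Constant *GetConstant() const override { return &c; }
    };

    // ...so the cast node simply forwards the request along.
    struct TypeCastExpr : Expr {
        const Expr *expr;
        explicit TypeCastExpr(const Expr *e) : expr(e) {}
        const Constant *GetConstant() const override {
            return expr ? expr->GetConstant() : nullptr;
        }
    };

    int main() {
        FunctionSymbolExpr fn;
        TypeCastExpr cast(&fn);
        return cast.GetConstant()->value == 42 ? 0 : 1;
    }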
Fixes issue #168.
Don't issue warnings about all instances writing to the same location if
there is only one program instance in the gang.
Also, make LLVMVectorValuesAllEqual() report that all values are equal for
one-element vectors.
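A trivial standalone illustration of the one-element case (plain C++ over
std::vector, not the LLVM-value-based routine in the compiler):

    #include <cstddef>
    #include <cstdio>
    #include <vector>

    static bool valuesAllEqual(const std::vector<int> &v) {
        // A one-element vector trivially has all of its values equal.
        if (v.size() == 1)
            return true;
        for (std::size_t i = 1; i < v.size(); ++i)
            if (v[i] != v[0])
                return false;
        return true;
    }

    int main() {
        std::printf("%d\n", valuesAllEqual({3}) ? 1 : 0);   // prints 1
        return 0;
    }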
Issue #166.
We now follow C's approach to evaluating the logical && and || operators:
the second expression isn't evaluated if the value of the first one
determines the overall result. Thus, they can now be used idiomatically,
as in (index < limit && array[index] > 0).
For varying expressions, the mask is set appropriately when evaluating
the second expression.
(For expressions that can be determined to be both simple and safe to
evaluate with the mask all off, we still evaluate both sides and compute
the logical op result directly, which saves a number of branches and tests.
However, the effect of this should never be visible to the programmer.)
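For illustration, a standalone C++ sketch of the mask-based evaluation
(WIDTH and gang_and are invented names emulating a gang of program
instances; this is only an emulation of the per-lane behavior, not the
generated code):

    #include <cstdio>

    #define WIDTH 4

    // Emulates evaluating (index < limit && array[index] > 0) across the gang:
    // the right-hand side runs only for lanes where the left-hand side is true,
    // and is skipped entirely if no lane needs it.
    static void gang_and(const int index[WIDTH], const int *array, int limit,
                         int result[WIDTH]) {
        int lhs[WIDTH];
        bool anyTrue = false;
        for (int i = 0; i < WIDTH; ++i) {
            lhs[i] = (index[i] < limit) ? 1 : 0;
            anyTrue = anyTrue || lhs[i];
            result[i] = 0;
        }
        if (!anyTrue)
            return;                        // short-circuit: rhs never evaluated
        for (int i = 0; i < WIDTH; ++i)
            if (lhs[i])                    // rhs evaluated only under the mask
                result[i] = (array[index[i]] > 0) ? 1 : 0;
    }

    int main() {
        int array[4] = {1, 0, 3, -2};
        int index[WIDTH] = {0, 2, 5, 7};   // lanes 2 and 3 would read out of bounds
        int result[WIDTH];
        gang_and(index, array, 4, result);
        for (int i = 0; i < WIDTH; ++i)
            std::printf("%d ", result[i]); // prints: 1 1 0 0
        std::printf("\n");
        return 0;
    }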
Issue #4.
Don't include declarations of malloc/free in the generated code (get
the standard ones from system headers instead).
Also add a cast to (uint8_t *) on the result of calls to malloc; C++ requires
the explicit cast since malloc returns a void *.
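A minimal example of the pattern in question (the buffer size is arbitrary):

    #include <cstdint>
    #include <cstdlib>

    int main() {
        // C++ won't implicitly convert the void * returned by malloc, so an
        // explicit cast is required.
        std::uint8_t *buf = (std::uint8_t *)std::malloc(1024);
        std::free(buf);
        return 0;
    }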
When we're able to turn a general gather/scatter into the "base + offsets"
form, we now try to extract any constant components of the offsets and pass
them as a separate parameter to the gather/scatter function implementation.
We then emit the addressing calculation carefully so that these constant
offsets match the patterns LLVM uses to detect this case; in many cases the
constant offsets then end up encoded directly in the instruction's addressing
calculation, saving the arithmetic instructions that would otherwise compute
them.
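A standalone C++ sketch of the decomposition (invented names; the compiler
performs this on its IR, not on C++ source): the per-lane offsets are split
into a varying part plus a compile-time constant, and the constant part can
then be folded into each lane's address as a fixed displacement.

    #include <cstdint>
    #include <cstdio>

    #define WIDTH 4

    // General form would add the full per-lane offset at run time:
    //     out[i] = *(float *)((char *)base + offsets[i]);
    // With the constant component split out, the varying part stays as-is and
    // constOffset can be encoded directly in each address calculation.
    static void gather_const_offset(const float *base, const std::int64_t offsets[WIDTH],
                                    std::int64_t constOffset, float out[WIDTH]) {
        for (int i = 0; i < WIDTH; ++i)
            out[i] = *(const float *)((const char *)base + offsets[i] + constOffset);
    }

    int main() {
        float data[16];
        for (int i = 0; i < 16; ++i)
            data[i] = (float)i;
        // Byte offsets {16*i + 4} split into a varying part {16*i} plus the constant 4.
        std::int64_t varyingPart[WIDTH] = {0, 16, 32, 48};
        float out[WIDTH];
        gather_const_offset(data, varyingPart, 4, out);
        for (int i = 0; i < WIDTH; ++i)
            std::printf("%g ", out[i]);    // prints: 1 5 9 13
        std::printf("\n");
        return 0;
    }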
Improves performance of stencil by ~15%. Other workloads unchanged.
We now do a single hardware atomic swap and then effectively exchange values
among the running program instances, so that the result is the same as if
each instance had performed its own hardware swap in some particular order.
Also cleaned up __atomic_swap_uniform_* built-in implementations
to not take the mask, which they weren't using anyway.
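A standalone C++ sketch of the idea (invented names; std::atomic stands in
for the hardware operation and WIDTH for the gang size): one hardware swap
stores the last active lane's value, and results are then handed along the
chain of active lanes so each one sees what it would have under a serial
ordering of per-lane swaps.

    #include <atomic>
    #include <cstdio>

    #define WIDTH 4

    static void gang_atomic_swap(std::atomic<int> *ptr, const int val[WIDTH],
                                 const bool mask[WIDTH], int result[WIDTH]) {
        // The value of the last active lane is what ends up in memory.
        int last = -1;
        for (int i = 0; i < WIDTH; ++i)
            if (mask[i])
                last = i;
        if (last < 0)
            return;                             // no active lanes: nothing to do

        int old = ptr->exchange(val[last]);     // the single hardware swap

        // Hand values down the chain of active lanes: the first active lane
        // gets the original memory contents, each later active lane gets the
        // value its predecessor "wrote".
        int prev = old;
        for (int i = 0; i < WIDTH; ++i) {
            if (!mask[i])
                continue;
            result[i] = prev;
            prev = val[i];
        }
    }

    int main() {
        std::atomic<int> loc(100);
        int vals[WIDTH] = {1, 2, 3, 4};
        bool mask[WIDTH] = {true, false, true, true};
        int result[WIDTH] = {0, 0, 0, 0};
        gang_atomic_swap(&loc, vals, mask, result);
        // loc ends up as 4; results are {100, 0, 1, 3}, as if the active lanes
        // had performed their swaps one after another.
        std::printf("%d: %d %d %d %d\n", loc.load(),
                    result[0], result[1], result[2], result[3]);
        return 0;
    }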
Finishes Issue #56.