Don't issue warnings about all instances writing to the same location if
there is only one program instance in the gang.
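For illustration, a hypothetical snippet (the function name is made up) of the
kind of code that draws this warning on wider targets: every running instance
scatters to the same element, which is harmless when the gang holds a single
instance.

    export void write_first(uniform int a[]) {
        // All program instances write to a[0]; with a one-instance
        // gang there can be no conflict, so no warning is needed.
        a[0] = programIndex;
    }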
Be sure that LLVMVectorValuesAllEqual() reports that all values are equal
for one-element vectors.
Issue #166.
We now follow C's approach to evaluating the logical && and || operators:
we don't evaluate the second expression in the operator if the value of the
first one determines the overall result. Thus, these can now be used
idiomatically, as in (index < limit && array[index] > 0) and such.
For varying expressions, the mask is set appropriately when evaluating
the second expression.
(For expressions that can be determined to be both simple and safe to
evaluate with the mask all off, we still evaluate both sides and compute
the logical op result directly, which saves a number of branches and tests.
However, the effect of this should never be visible to the programmer.)
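A minimal sketch of the idiom this enables (the function and parameter names
are illustrative): the second operand of && is evaluated only for the program
instances where the first operand holds, so the out-of-range access never
happens.

    export void mark_positive(uniform int array[], uniform int n,
                              uniform int out[]) {
        foreach (i = 0 ... n) {
            int index = i * 2;  // varying; may run past n for some instances
            // array[index] is read only under the mask of instances
            // where index < n holds.
            out[i] = (index < n && array[index] > 0) ? 1 : 0;
        }
    }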
Issue #4.
Don't include declarations of malloc/free in the generated code (get
the standard ones from system headers instead).
Add a (uint8_t *) cast to the result of calls to malloc: malloc returns a
void *, and C++ (unlike C) requires an explicit cast to convert that to
another pointer type.
When we're able to turn a general gather/scatter into the "base + offsets"
form, we now try to extract out any constant components of the offsets and
then pass them as a separate parameter to the gather/scatter function
implementation.
We then carefully emit the code for the addressing calculation so that these
constant offsets match the patterns LLVM uses to detect this case; in many
cases this gets the constant offsets encoded directly in the instruction's
addressing calculation, saving the arithmetic instructions that would
otherwise compute them.
Improves performance of stencil by ~15%. Other workloads unchanged.
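As a hedged example of the kind of access this helps (names are
illustrative), the +1 below contributes a constant component to each
instance's gather offset; it can now be folded directly into the
instruction's addressing calculation:

    export void gather_next(uniform float a[], uniform int idx[],
                            uniform float out[], uniform int n) {
        foreach (i = 0 ... n)
            // Gather with base a, varying offsets idx[i], and a
            // constant offset of +1 element.
            out[i] = a[idx[i] + 1];
    }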
We now do a single atomic hardware swap and then effectively perform swaps
among the running program instances, so that the result is the same as if
each instance had executed its own hardware swap in some particular serial
order.
Also cleaned up __atomic_swap_uniform_* built-in implementations
to not take the mask, which they weren't using anyway.
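A sketch of the user-visible semantics via ISPC's atomic_swap_global()
standard library routine (the wrapper name and parameters are illustrative):
each instance receives a return value consistent with some serialization of
per-instance swaps, though only one hardware swap is issued per gang.

    export void swap_all(uniform int * uniform p, uniform int vals[],
                         uniform int old[], uniform int n) {
        foreach (i = 0 ... n)
            // One hardware swap per gang, not one per instance; the
            // per-instance results match some serial ordering.
            old[i] = atomic_swap_global(p, vals[i]);
    }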
Finishes Issue #56.
Effectively, the patterns that detected when the offsets of a gather or
scatter in base+offsets form were actually a multiple of 2/4/8 were no
longer working.
This change not only fixes that, but also expands the set of patterns that
are matched. For example, given offsets of the form 4*v1 + 16*v2, it
identifies a scale of 4 and new offsets of v1 + 4*v2.
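A hypothetical gather that hits this pattern (names are illustrative): with
float data and a row stride of 4 elements, the byte offsets below are
4*col + 16*row, so the matcher factors out a scale of 4 and leaves offsets
of col + 4*row.

    export void gather_2d(uniform float grid[], uniform int row[],
                          uniform int col[], uniform float out[],
                          uniform int n) {
        foreach (i = 0 ... n)
            out[i] = grid[col[i] + 4 * row[i]];
    }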
This fix makes the volume renderer run 1.19x faster, and noise 1.54x
faster.
For shifts by an amount that is uniform across the gang, we now emit calls to
potentially-specialized functions for the left/right shifts that take a
single integer value for the shift amount. These in turn can be matched to
the corresponding intrinsics for the SSE target.
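A minimal sketch of code that benefits (names are illustrative): the shift
amount is the same for all program instances, so the vector shift intrinsics
that take a single count apply.

    export void scale_down(uniform int a[], uniform int n,
                           uniform int shift) {
        foreach (i = 0 ... n)
            a[i] = a[i] >> shift;  // varying values, uniform shift amount
    }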
Issue #145.
Previously, when we had a switch statement with a uniform switch condition
but a 'break' statement that was under varying control flow inside the
switch, we'd promote the switch condition to be varying so that the
break would work correctly.
Now, we leave the condition as uniform and are thus able to use the
more-efficient LLVM switch instruction in this case.
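A sketch of the case in question (names are illustrative): 'mode' is uniform,
while the first 'break' executes under the varying 'if'; the condition now
stays uniform and the switch can still be lowered to an LLVM switch
instruction.

    export void apply(uniform int mode, uniform float v[], uniform int n) {
        foreach (i = 0 ... n) {
            switch (mode) {      // uniform switch condition
            case 0:
                if (v[i] < 0)    // varying condition
                    break;       // break under varying control flow
                v[i] = -v[i];
                break;
            default:
                v[i] = 0;
            }
        }
    }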
Issue #156.
Specifically, don't use a vector select for the masked store blend there,
but instead emit calls to undefined __masked_store_blend_*() functions.
Added implementations of these functions to sse4.h and generic-16.h in
examples/intrinsics. (Calls to these will never be generated with LLVM 3.1.)
More specifically, we do a proper masked store (rather than a load-
blend-store) unless we can determine that we're accessing a stack-allocated
"varying" variable. This fixes a number of nefarious bugs where given
code like:
    uniform float a[21];
    foreach (i = 0 ... 21)
        a[i] = 0;
We'd use a blend and in turn read past the end of a[] in the last
iteration.
Also made slight changes to inlining in aobench; with this change applied,
they keep compiles to ~5s, versus ~45s without them.
Fixes issue #160.