aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Matt Pharr	d6337b3b22	Code cleanups in opt.cpp; no functional change	2012-01-23 14:36:32 -08:00
Matt Pharr	d2f8b0ace5	Add __clock to list of symbols to make internal from builtins.	2012-01-23 06:19:16 -08:00
Matt Pharr	d805e8b183	Add clock() function to standard library. Also corrected the declaration of num_cores() to return a uniform value.	2012-01-22 13:05:27 -08:00
Matt Pharr	1f0f2ec05f	Include AVX2 in supported ISAs	2012-01-22 07:05:47 -08:00
Matt Pharr	91ac3b9d7c	Back out WIP changes to opt.cpp that were inadvertently checked in.	2012-01-21 07:34:53 -08:00
Matt Pharr	d65bf2eb2f	Doxygen number bump and release notes for 1.1.3 (tag: v1.1.3)	2012-01-20 17:04:16 -08:00
Matt Pharr	1bba9d4307	Improve atomic_swap_global() to take advantage of associativity. We now do a single atomic hardware swap and then effectively do swaps between the running program instances such that the result is the same as if they had happened to run a particular ordering of hardware swaps themselves. Also cleaned up __atomic_swap_uniform_* built-in implementations to not take the mask, which they weren't using anyway. Finishes Issue #56.	2012-01-20 10:37:33 -08:00
Matt Pharr	4388338dad	Fix performance regression introduced in `be0c77d556`. The patterns that detect when the offsets of a gather or scatter in base+offsets form are actually a multiple of 2/4/8 were no longer working. This change not only fixes this, but also expands the set of patterns that are matched. For example, given offsets of the form 4*v1 + 16*v2, it identifies a scale of 4 and new offsets of v1 + 4*v2. This fix makes the volume renderer run 1.19x faster, and noise 1.54x faster.	2012-01-19 17:57:59 -08:00
Matt Pharr	2fb59c90cf	Fix C++ backend bug introduced in `d14a2de168`. (This was causing a number of tests to fail with the generic targets.)	2012-01-19 11:35:02 -07:00
Matt Pharr	68f6ea8def	For << and >> with C++, detect when all instances are shifting by the same amount. In this case, we now emit calls to potentially-specialized functions for the left/right shifts that take a single integer value for the shift amount. These in turn can be matched to the corresponding intrinsics for the SSE target. Issue #145.	2012-01-19 10:04:32 -07:00
Matt Pharr	3f89295d10	Update RNG code in stdlib to use -> operator where appropriate.	2012-01-19 10:02:47 -07:00
Matt Pharr	748b292e77	Improve code for uniform switches with a 'break' under varying control flow. Previously, when we had a switch statement with a uniform switch condition but a 'break' statement that was under varying control flow inside the switch, we'd promote the switch condition to be varying so that the break would work correctly. Now, we leave the condition as uniform and are thus able to use the more-efficient LLVM switch instruction in this case. Issue #156.	2012-01-19 08:41:19 -07:00
Matt Pharr	6451c3d99d	Fix bug with code for initializers for static arrays in generated C++ code. (This was preventing aobench from compiling successfully with the generic target.)	2012-01-18 16:55:09 -07:00
Matt Pharr	d14a2de168	Fix generic code emission when building with LLVM 3.0/2.9. Specifically, don't use vector select for masked store blend there, but emit calls to undefined __masked_store_blend_*() functions. Added implementations of these functions to sse4.h and generic-16.h in examples/intrinsics. (Calls to these will never be generated with LLVM 3.1.)	2012-01-17 23:42:22 -07:00
Matt Pharr	642150095d	Include LLVM version used to build in version info printed out.	2012-01-17 23:42:22 -07:00
Matt Pharr	3bf3ac7922	Be more conservative about using blending in place of masked store. More specifically, we do a proper masked store (rather than a load-blend-store) unless we can determine that we're accessing a stack-allocated "varying" variable. This fixes a number of nefarious bugs where, given code like: uniform float a[21]; foreach (i = 0 ... 21) a[i] = 0; we'd use a blend and in turn read past the end of a[] in the last iteration. Also made slight changes to inlining in aobench; this keeps compiles to ~5s, versus ~45s without them (with this change). Fixes issue #160.	2012-01-17 23:42:22 -07:00
Matt Pharr	c6d1cebad4	Update masked_load/store implementations for generic targets to take void *s (Fixes compile errors when we try to actually use these!)	2012-01-17 23:42:22 -07:00
Matt Pharr	08189ce08c	Update "inline" qualifiers in a few examples.	2012-01-17 23:42:22 -07:00
Matt Pharr	7013d7d52f	Small documentation updates and cleanups	2012-01-17 23:42:21 -07:00
Matt Pharr	7045b76f84	Improvements to code generation for "foreach". Specialize the code for the innermost loop to not do any masking computations for the innermost dimension for the iterations where we are certainly working on a full vector's worth of data. This fix improves performance/code quality of "foreach" such that it's essentially the same as the equivalent "for" loop. Fixes issue #151.	2012-01-17 11:34:00 -08:00
Matt Pharr	58a0b4a20d	Add separate set of builtins for AVX2. (i.e., stop just reusing the ones for AVX1). For now the only difference is that the int/uint min/max functions call the new intrinsic for that. Once gather is available from LLVM, that will go here as well.	2012-01-13 14:40:01 -08:00
Matt Pharr	0f8eee9809	Fix cases in optimization code to not inadvertently match calls to func ptrs. If we call a function pointer, CallInst::getCalledFunction() returns NULL; we need to be careful about this case when we're matching various function calls in optimization passes. (Fixes a crash.)	2012-01-12 10:33:06 -08:00
Matt Pharr	0740299860	Fix switch test	2012-01-12 09:45:31 -08:00
Matt Pharr	652215861e	Update dynamic target dispatch code to support AVX2.	2012-01-12 08:37:18 -08:00
Matt Pharr	602209e5a8	Tiny updates to documentation, comment for switch stuff.	2012-01-12 05:55:42 -08:00
Matt Pharr	b60f8b4f70	Fix merge conflicts	2012-01-11 17:13:51 -08:00
Matt Pharr	b67446d998	Add support for "switch" statements. Switches with both uniform and varying "switch" expressions are supported. Switch statements with varying expressions and very large numbers of labels may not perform well; some issues to be filed shortly will track opportunities for improving these.	2012-01-11 09:16:31 -08:00
Matt Pharr	9670ab0887	Add missing cases to watch out for in lCheckAllOffSafety(). Previously, we weren't checking for member expressions that dereferenced a pointer or pointer dereference expressions--only array indexing!	2012-01-11 09:16:31 -08:00
Matt Pharr	0223bb85ee	Fix bug in StmtList::EmitCode(). Previously, we would return immediately if the current basic block was NULL; however, this is the wrong thing to do in that goto labels and case/default labels in switch statements will establish a new current basic block even if the current one is NULL.	2012-01-11 09:14:39 -08:00
Jean-Luc Duprat	fd81255db1	Removed mutex support for OSX 10.5. Allow running from the build directory even if it is not on the path. Properly decode subprocess stdout/stderr as UTF-8. Added newlines that were mistakenly left out of the print->sys.stdout.write() conversion in previous CL. Python 3: fixed error message comparison; explicit list creation. Windows: forward/back slash annoyances; added stdint.h with definitions for int32_t, int64_t; compile_error_files and run_error_files were being appended to improperly.	2012-01-10 16:55:00 -08:00
Matt Pharr	8a8e1a7f73	Fix bug with multiple EmitCode() calls due to missing braces. In short, we were inadvertently trying to emit each function's code a second time if the function had a mask check at the start of it. StmtList::EmitCode() was covering this error up by not emitting code if the current basic block is NULL.	2012-01-10 16:50:13 -08:00
Jean-Luc Duprat	ef05fbf424	run_tests.py more compatible with python 3.x except for the mutex class...	2012-01-10 13:12:38 -08:00
Jean-Luc Duprat	fa01b63fa5	Remove assumption that . is in the PATH in run_tests.py	2012-01-10 11:41:08 -08:00
Jean-Luc Duprat	63d3d25030	Fixed off by one error in array size generated by bitcode2cpp.py	2012-01-10 11:22:13 -08:00
Jean-Luc Duprat	a8db866228	Python build compatible on both python 2 and 3	2012-01-10 10:42:15 -08:00
Jean-Luc Duprat	0519eea951	Makefile does not hardcode link paths on Linux. Link statically for both x86 and x86-64.	2012-01-10 10:34:57 -08:00
Matt Pharr	f4653ecd11	Release notes for 1.1.2 and doxygen version number bump (tag: v1.1.2)	2012-01-09 16:05:40 -08:00
Jean-Luc Duprat	5d67252ed0	Python scripts now compatible with both 2.x and 3.x releases of python	2012-01-09 13:56:05 -08:00
Matt Pharr	5134de71c0	Fix Windows build (inttypes.h not available)	2012-01-09 09:05:20 -08:00
Matt Pharr	2be1251c70	Fix Makefile on OSX (uname -o not supported)	2012-01-09 07:40:47 -08:00
Matt Pharr	c0161aa17f	Merge pull request #154 from palacaze/mingw: Mingw support	2012-01-09 07:37:02 -08:00
Pierre-Antoine Lacaze	b683aa11b1	Fix linking under mingw, libdl is Linux only.	2012-01-09 10:52:46 +01:00
Pierre-Antoine Lacaze	2654bb0112	Handle python installations in non-standard locations.	2012-01-09 10:29:54 +01:00
Pierre-Antoine Lacaze	d8728104b4	Handle the case whereby BUILD_DATE is already defined.	2012-01-09 10:29:16 +01:00
Pierre-Antoine Lacaze	0be1b70fba	Mingw has strtoull, make use of it.	2012-01-09 10:28:52 +01:00
Pierre-Antoine Lacaze	a0e9793de3	Shut up warning wrt CONSOLE_SCREEN_BUFFER_INFO initialization	2012-01-09 10:19:46 +01:00
Pierre-Antoine Lacaze	da9200fcee	Fix alloca use on mingw.	2012-01-09 10:19:09 +01:00
Pierre-Antoine Lacaze	54e8e8022b	Suppress warnings about long long arguments.	2012-01-09 10:18:39 +01:00
Pierre-Antoine Lacaze	d84cf781da	Mingw does not have sysconf, use the msc way of finding processors.	2012-01-09 09:45:40 +01:00
Pierre-Antoine Lacaze	002f27a30f	Implement vasprintf and asprintf for platforms lacking them.	2012-01-09 09:44:58 +01:00

521 Commits