aaron/ispc - ispc - git.frat.tech

aaron/ispc

Author	SHA1	Message	Date
Dmitry Babokin	524939dc5b	Fix for issue #430	2013-02-27 18:03:07 +04:00
Matt Pharr	0bf1320a32	Remove support for building with LLVM 3.0	2013-01-06 12:27:53 -08:00
Matt Pharr	63dd7d9859	Fix build to work with LLVM top-of-tree again	2013-01-06 12:02:08 -08:00
Matt Pharr	8cbfde6092	Small fixes to build with LLVM top-of-tree (now numbered as version 3.3)	2012-12-02 14:29:24 -08:00
Matt Pharr	172a189c6f	Fix build with LLVM top-of-tree	2012-10-17 11:11:50 -07:00
Matt Pharr	be2108260e	Add --opt=force-aligned-memory option. This forces all vector loads/stores to be done assuming that the given pointer is aligned to the vector size, thus allowing the use of sometimes more-efficient instructions. (If it isn't the case that the memory is aligned, the program will fail!).	2012-09-14 13:49:45 -07:00
Matt Pharr	19d8f2e258	Generate FMA instructions with AVX2 (when possible). Issue #320.	2012-08-03 10:43:41 -07:00
Matt Pharr	0bb4d282e2	Add sys/types.h include for linux/osx.	2012-07-23 08:32:41 -07:00
Matt Pharr	e9fe9f5043	Add cpu strings for Ivy Bridge and HSW. Default to avx2 ISA for HSW CPUs.	2012-07-23 08:24:18 -07:00
Matt Pharr	51210a869b	Support core-avx-i and core-avx2 CPU types. (And map them to avx1.1 and avx2 targets, respectively.)	2012-07-19 10:15:59 -07:00
Matt Pharr	6a410fc30e	Emit gather instructions for the AVX2 targets. Issue #308.	2012-07-13 12:29:05 -07:00
Matt Pharr	98b2e0e426	Fixes for intrinsics unsupported in earlier LLVM versions. Specifically, don't use the half/float conversion routines with LLVM 3.0, and don't try to use RDRAND with anything before LLVM 3.2.	2012-07-13 12:14:10 -07:00
Matt Pharr	371d4be8ef	Fix bugs in detection of Ivy Bridge systems. We were incorrectly characterizing them as basic AVX1 without further extensions, due to a bug in the logic to check CPU features.	2012-07-12 14:11:15 -07:00
Matt Pharr	216ac4b1a4	Stop factoring out constant offsets for gather/scatter if instr is available. For KNC (gather/scatter), it's not helpful to factor base+offsets gathers and scatters into base_ptr + {1/2/4/8} * varying_offsets + const_offsets. Now, if a HW instruction is available for gather/scatter, we just factor into base + {1/2/4/8} * offsets (if possible). Not only is this simpler, but it's also what we need to pass a value along to the scale by 2/4/8 available directly in those instructions. Finishes issue #325.	2012-07-11 14:52:29 -07:00
Matt Pharr	10b79fb41b	Add support for non-factored variants of gather/scatter functions. We now have two ways of approaching gather/scatters with a common base pointer and with offset vectors. For targets with native gather/scatter, we just turn those into base + {1/2/4/8}offsets. For targets without, we turn those into base + {1/2/4/8}varying_offsets + const_offsets, where const_offsets is a compile-time constant. Infrastructure for issue #325.	2012-07-11 14:29:42 -07:00
Matt Pharr	aabbdba068	Switch a few remaining fprintf() calls to use Warning()/Error().	2012-07-06 12:56:45 -07:00
Matt Pharr	4186ef204d	Fix build with LLVM top of tree.	2012-07-05 13:35:01 -07:00
Nicolas Trangez	3a007f939a	Build: Include unistd.h where required Some modules require an include of unistd.h (e.g. for getcwd and isatty definitions). These changes were required to build successfully on a Fedora 17 system, using GCC 4.7.0 & glibc-headers 2.15.	2012-07-04 14:49:00 +02:00
Matt Pharr	f38770bf2a	Fix build with LLVM ToT	2012-06-28 07:36:10 -07:00
Matt Pharr	40a295e951	Fix bug where "avx-x2" target would cause AVX1.1 to be used.	2012-06-12 13:37:38 -07:00
Matt Pharr	6c7df4cb6b	Add initial support for "avx1.1" targets for Ivy Bridge. So far, only the use of the float/half conversion instructions distinguishes this from the "avx1" target. Partial work on issue #263.	2012-06-08 15:55:00 -07:00
Matt Pharr	1397dbdabc	Don't generate colorized output escapes when stderr isn't a TTY. When piping to a pile, more/less, etc, this is generally undesirable. This behavior can be overridden with the --colorized-output command-line flag.	2012-06-04 09:20:57 -07:00
Matt Pharr	449d956966	Add support for generic-64 target.	2012-05-25 11:57:28 -07:00
Matt Pharr	72c41f104e	Fix various malformed program crashes.	2012-05-18 10:44:45 -07:00
Matt Pharr	299ae186f1	Expect support for half and transcendentals from all generic targets	2012-05-18 06:13:45 -07:00
Matt Pharr	fbed0ac56b	Remove allOffMaskIsSafe from Target The intent of this was to indicate whether it was safe to run code with an 'all of' mask on the given target (and then sometimes be more flexible about e.g. running both true and false blocks of if statements, etc.) The problem is that even if the architecture has full native mask support, it's still not safe to run 'uniform' memory operations with the mask all off. Even more tricky, we sometimes transform masked varying memory operations to uniform ones during optimization (e.g. gather->load and broadcast). This fixes a number of the tests/switch-* tests that were failing on the generic targets due to this issue.	2012-05-09 14:18:47 -07:00
Matt Pharr	0c1b206185	Pass log/exp/pow transcendentals through to targets that support them. Currently, this is the generic targets.	2012-05-03 13:49:56 -07:00
Matt Pharr	d99bd279e8	Add generic-32 target.	2012-05-03 11:11:06 -07:00
Matt Pharr	ee1fe3aa9f	Update build to handle existence of LLVM 3.2 dev branch. We now compile with LLVM 3.0, 3.1, and 3.2svn.	2012-05-03 08:25:25 -07:00
Matt Pharr	d5cc2ad643	Call Verify() methods of various debugging llvm::DI* types after creation.	2012-04-25 08:43:11 -10:00
Matt Pharr	fefa86e0cf	Remove LLVM_TYPE_CONST #define / usage. Now with LLVM 3.0 and beyond, types aren't const.	2012-04-15 20:11:27 -07:00
Matt Pharr	972043c146	Fix serious bug in handling constant-valued initializers. In InitSymbol(), we try to be smart and emit a memcpy when there are a number of values to store (e.g. for arrays, structs, etc.) Unfortunately, this wasn't working as desired for bools (i.e. i1 types), since the SizeOf() call that tried to figure out how many bytes to copy would return 0 bytes, due to dividing the number of bits to copy by 8. Fixes issue #234.	2012-04-09 14:23:08 -07:00
Matt Pharr	b813452d33	Don't issue a slew of warnings if a bogus cpu type is specified. Issue #221.	2012-04-03 06:13:28 -07:00
Matt Pharr	560bf5ca09	Updated logic for selecting target ISA when not specified. Now, if the user specified a CPU then we base the ISA choice on that--only if no CPU and no target is specified do we use the CPUID-based check to pick a vector ISA. Improvement to fix to #205.	2012-03-30 16:36:12 -07:00
Matt Pharr	b3c5043dcc	Don't enable llvm's UnsafeFPMath option when --opt=fast-math is supplied. This was causing functions like round() to fail on SSE2, since it has code that does: x += 0x1.0p23f; x -= 0x1.0p23f; which was in turn being undesirably optimized away. Fixes issue #211.	2012-03-28 10:26:39 -07:00
Matt Pharr	3270e2bf5a	Call CPUID to more reliably detect level of SSE/AVX that the host supports. Fixes, I hope, issue #205.	2012-03-28 09:20:06 -07:00
Matt Pharr	73bf552cd6	Add support for coalescing memory accesses from gathers. There are two related optimizations that happen now. (These currently only apply for gathers where the mask is known to be all on, and to gathers that are accessing 32-bit sized elements, but both of these may be generalized in the future.) First, for any single gather, we are now more flexible in mapping it to individual memory operations. Previously, we would only either map it to a general gather (one scalar load per SIMD lane), or an unaligned vector load (if the program instances could be determined to be accessing a sequential set of locations in memory.) Now, we are able to break gathers into scalar, 2-wide (i.e. 64-bit), 4-wide, or 8-wide loads. Further, we now generate code that shuffles these loads around. Doing fewer, larger loads in this manner, when possible, can be more efficient. Second, we can coalesce memory accesses across multiple gathers. If we have a series of gathers without any memory writes in the middle, then we try to analyze their reads collectively and choose an efficient set of loads for them. Not only does this help if different gathers reuse values from the same location in memory, but it's specifically helpful when data with AOS layout is being accessed; in this case, we're often able to generate wide vector loads and appropriate shuffles automatically.	2012-02-10 13:10:39 -08:00
Matt Pharr	3efbc71a01	Add fuzz testing of input programs. When the --fuzz-test command-line option is given, the input program will be randomly perturbed by the lexer in an effort to trigger assertions or crashes in the compiler (neither of which should ever happen, even for malformed programs.)	2012-02-06 15:34:47 -08:00
Matt Pharr	724a843bbd	Add --quiet option to supress all diagnostic output	2012-02-06 12:39:09 -08:00
Gabe Weisz	c67a286aa6	Add support for 1-wide scalar target. Issue #40.	2012-01-29 06:36:07 -08:00
Matt Pharr	1867b5b317	Use native float/half conversion instructions with the AVX2 target.	2012-01-24 15:33:38 -08:00
Matt Pharr	2fb59c90cf	Fix C++ backend bug introduced in `d14a2de168`. (This was causing a number of tests to fail with the generic targets.)	2012-01-19 11:35:02 -07:00
Matt Pharr	0ed11a7832	Compute SizeOf() and OffsetOf() at compile time in more cases with the generic target. Really, we only have to be careful about the case where there is a vector of bools (i.e. a mask) involved, since the size of that isn't known at compile-time. (Currently, at least.)	2012-01-08 14:06:44 -08:00
Matt Pharr	4151778f5e	Modify SizeOf() and StructOffset() to not compute value based on target for generic targets. Specifically, we want to be able to late-bind on whether the mask is i32s or i1s, so if there's any chance of ambiguity, we emit code that does the "GEP from a NULL base pointer" trick to compute the value later in compilation.	2012-01-04 12:59:03 -08:00
Matt Pharr	1d9201fe3d	Add "generic" 4, 8, and 16-wide targets. When used, these targets end up with calls to undefined functions for all of the various special vector stuff ispc needs to compile ispc programs (masked store, gather, min/max, sqrt, etc.). These targets are not yet useful for anything, but are a step toward having an option to C++ code with calls out to intrinsics. Reorganized the directory structure a bit and put the LLVM bitcode used to define target-specific stuff (as well as some generic built-ins stuff) into a builtins/ directory. Note that for building on Windows, it's now necessary to set a LLVM_VERSION environment variable (with values like LLVM_2_9, LLVM_3_0, LLVM_3_1svn, etc.)	2011-12-19 13:46:50 -08:00
Matt Pharr	f30a5dea79	Linux build fixes	2011-12-15 12:23:26 -08:00
Matt Pharr	e82a720223	Fix various warnings / build issues on Windows	2011-12-15 12:06:38 -08:00
Matt Pharr	8d1b77b235	Have assertion macro and FATAL() text ask user to file a bug, provide URL to do so. Switch to Assert() from assert() to make it clear it's not the C stdlib one we're using any more.	2011-12-15 11:11:16 -08:00
Matt Pharr	46bfef3fce	Add option to turn off codegen improvements when mask 'all on' is statically known.	2011-12-11 16:16:36 -08:00
Matt Pharr	9475e13d81	Issue a warning if no output file is specified.	2011-12-06 08:21:34 -08:00

1 2

83 Commits