1131 Commits

Author SHA1 Message Date
Matt Pharr
49dde7c6f2 Fix bug in declaration of double-precision sqrt intrinsic for AVX targets.
This was preventing sqrts of uniform double values from being compiled
properly.

Issue #344.
2012-08-03 11:43:31 -07:00
Matt Pharr
765a0d8896 Use puts() rather than printf() for printing assertion failure strings.
This way, we don't lose '%'s in the assertion strings.

Issue #342.
2012-08-03 11:31:38 -07:00
Matt Pharr
19d8f2e258 Generate FMA instructions with AVX2 (when possible).
Issue #320.
2012-08-03 10:43:41 -07:00
Matt Pharr
e6aec96e05 Fix build with LLVM top-of-tree 2012-08-03 09:59:41 -07:00
Jean-Luc Duprat
a2d42c3242 KNC: all masked_load_*() and masked_store_*() functions need to do unaligned accesses 2012-08-01 14:37:25 -07:00
Jean-Luc Duprat
52836aae87 Minor documentation clarification on the impact of ICC -fp-model except option. 2012-08-01 10:24:35 -07:00
Matt Pharr
bda566d6a7 Fix incorrect assertion 2012-08-01 08:11:32 -07:00
Jean-Luc Duprat
63ed90b0fd docs/build.sh runs rst2html rather than rst2html.py
Explicitly documented the fact that ICC needs the -mmic flag to compile for KNC.
Updated ISPC User Guide with details on ICC compiler options that impact FP performance in generated code.
2012-07-30 11:47:25 -07:00
Matt Pharr
0bb4d282e2 Add sys/types.h include for linux/osx. 2012-07-23 08:32:41 -07:00
Matt Pharr
ae89a65dad Fix bug that caused unterminated basic blocks.
Issue #339.
2012-07-23 08:24:18 -07:00
Matt Pharr
e9fe9f5043 Add cpu strings for Ivy Bridge and HSW.
Default to avx2 ISA for HSW CPUs.
2012-07-23 08:24:18 -07:00
Matt Pharr
ce8dc5927c Fix bug in FunctionEmitContext::MatchIntegerTypes
Cause of issue #329.
2012-07-20 10:05:17 -07:00
Matt Pharr
f6989cce38 Disallow native output with generic targets, C++ output with non-generic targets.
Also wrote FAQs about why this is the way it is.
Issue #334.
2012-07-20 09:55:50 -07:00
Jean-Luc Duprat
6dbbf9aa80 Merge branch 'master' of https://github.com/ispc/ispc 2012-07-19 17:33:00 -07:00
Jean-Luc Duprat
fe6282e837 Fixed small issue with name mangling introduced in aecd6e08 2012-07-19 17:32:49 -07:00
Matt Pharr
51210a869b Support core-avx-i and core-avx2 CPU types.
(And map them to avx1.1 and avx2 targets, respectively.)
2012-07-19 10:15:59 -07:00
Matt Pharr
658652a9ff Merge pull request #331 from jduprat/master
New templated API for __setzero() __undef() and __smear()
2012-07-18 16:39:38 -07:00
Jean-Luc Duprat
aecd6e0878 All the smear(), setzero() and undef() APIs are now templated on the return type.
Modified ISPC's internal mangling to pass these through unchanged.
Tried hard to make sure this is not going to introduce an ABI change.
2012-07-17 17:06:36 -07:00
Jean-Luc Duprat
1334a84861 Merge branch 'master' of https://github.com/ispc/ispc 2012-07-17 11:46:30 -07:00
Matt Pharr
6a410fc30e Emit gather instructions for the AVX2 targets.
Issue #308.
2012-07-13 12:29:05 -07:00
Matt Pharr
984a68c3a9 Rename gen_gather() macro to gen_gather_factored() 2012-07-13 12:24:12 -07:00
Matt Pharr
daf5aa8e8b Run inst combine before memory optimizations.
We were previously emitting 64-bit indexing for some gathers where
32-bit was actually fine, due to some adds of constant vectors
that hadn't been simplified to the result.
2012-07-13 12:14:53 -07:00
Matt Pharr
98b2e0e426 Fixes for intrinsics unsupported in earlier LLVM versions.
Specifically, don't use the half/float conversion routines with
LLVM 3.0, and don't try to use RDRAND with anything before LLVM 3.2.
2012-07-13 12:14:10 -07:00
Matt Pharr
9a1932eaf7 Only set gcc's "-msse4.2", etc, option when compiling for generic targets.
We don't need it when ispc is just generating an object file directly, and gcc
on OS X doesn't recognize -mavx.
2012-07-13 12:02:05 -07:00
Matt Pharr
371d4be8ef Fix bugs in detection of Ivy Bridge systems.
We were incorrectly characterizing them as basic AVX1 without further
extensions, due to a bug in the logic to check CPU features.
2012-07-12 14:11:15 -07:00
Matt Pharr
d180031ef0 Add more tests of basic gather functionality. 2012-07-12 14:05:38 -07:00
Jean-Luc Duprat
e09e953bbb Added a few functions: __setzero_i64() __cast_sext(__vec16_i64, __vec16_i32), __cast_zext(__vec16_i32)
__min_varying_int32(), __min_varying_uint32(), __max_varying_int32(), __max_varying_uint32()
Fixed the signature of __smear_i64() to match current codegen
2012-07-12 10:32:38 -07:00
Matt Pharr
2c640f7e52 Add support for RDRAND in IvyBridge.
The standard library now provides a variety of rdrand() functions
that call out to RDRAND, when available.

Issue #263.
2012-07-12 06:07:07 -07:00
Matt Pharr
2bacebb1fb Doc fixes (Crystal Lemire). 2012-07-11 19:51:28 -07:00
Jean-Luc Duprat
df18b2a150 Fixed missing tmp var needed for use with gather intrinsic 2012-07-11 15:43:11 -07:00
Matt Pharr
216ac4b1a4 Stop factoring out constant offsets for gather/scatter if instr is available.
For KNC (gather/scatter), it's not helpful to factor base+offsets gathers
and scatters into base_ptr + {1/2/4/8} * varying_offsets + const_offsets.
Now, if a HW instruction is available for gather/scatter, we just factor
into base + {1/2/4/8} * offsets (if possible).  Not only is this simpler,
but it's also what we need to pass a value along to the scale by
2/4/8 available directly in those instructions.

Finishes issue #325.
2012-07-11 14:52:29 -07:00
Jean-Luc Duprat
898cded646 Merge branch 'master' of https://github.com/ispc/ispc
Conflicts:
	examples/intrinsics/knc.h
2012-07-11 14:45:00 -07:00
Matt Pharr
c09c87873e Whitespace / indentation fixes. 2012-07-11 14:29:46 -07:00
Matt Pharr
10b79fb41b Add support for non-factored variants of gather/scatter functions.
We now have two ways of approaching gather/scatters with a common base
pointer and with offset vectors.  For targets with native gather/scatter,
we just turn those into base + {1/2/4/8}*offsets.  For targets without,
we turn those into base + {1/2/4/8}*varying_offsets + const_offsets,
where const_offsets is a compile-time constant.

Infrastructure for issue #325.
2012-07-11 14:29:42 -07:00
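
The two addressing forms described in 10b79fb41b can be compared with a small sketch. This is only an illustration of the arithmetic, not ispc's code: the function names, the base address, and the index values are all made up for the example, and lanes are modeled as plain Python lists.

```python
def gather_addresses(base, scale, offsets):
    """Non-factored form: base + scale * offsets, one address per lane."""
    assert scale in (1, 2, 4, 8)
    return [base + scale * off for off in offsets]


def gather_addresses_factored(base, scale, varying_offsets, const_offsets):
    """Factored form: base + scale * varying_offsets + const_offsets."""
    assert scale in (1, 2, 4, 8)
    return [base + scale * v + c
            for v, c in zip(varying_offsets, const_offsets)]


# Hypothetical access: read a[index + 1] from an array of 8-byte elements.
base = 1000                     # made-up base address
index = [0, 3, 1, 2]            # per-lane (varying) indices

# Native gather: the "+ 1" stays inside the offsets, scale is 8.
native = gather_addresses(base, 8, [i + 1 for i in index])

# Factored: the "+ 1" is pulled out as a constant byte offset of 8.
factored = gather_addresses_factored(base, 8, index, [8] * len(index))
```

Both forms yield the same per-lane addresses; the difference is only in which pieces are kept separate for the target to consume.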
Matt Pharr
ec0280be11 Rename gather/scatter_base_offsets functions to *factored_base_offsets*.
No functional change; just preparation for having a path that doesn't
factor the offsets into constant and varying parts, which will be better
for AVX2 and KNC.
2012-07-11 14:16:39 -07:00
Matt Pharr
8e19d54e75 Merge pull request #328 from jduprat/explicit_isa_in_tests
Explicit isa in tests
2012-07-10 20:49:37 -07:00
Jean-Luc Duprat
3c070e5e20 run_tests.py will only attempt to use the -mmic flag when the knc.h header is used 2012-07-10 17:07:56 -07:00
Jean-Luc Duprat
dde599f48f run_tests.py now picks the ISA via a -m flag based on the target selected, rather than always picking -msse4.2;
this is needed because -msse4.2 is not supported on KNC.
2012-07-10 16:39:18 -07:00
Jean-Luc Duprat
cc15ecfb3a Merge branch 'master' of https://github.com/ispc/ispc
Conflicts:
	cbackend.cpp
	examples/intrinsics/generic-16.h
	examples/intrinsics/generic-32.h
	examples/intrinsics/generic-64.h
	examples/intrinsics/knc.h
	examples/intrinsics/sse4.h
2012-07-10 16:36:08 -07:00
Jean-Luc Duprat
7a7c54bd59 Minor fixes to knc.h that resulted from integrating bea88ab122 2012-07-10 16:10:48 -07:00
Jean-Luc Duprat
bea88ab122 Integrated changes from mmp/and-fold-opt:
Add peephole optimization to eliminate some mask AND operations.

On KNC, the various vector comparison instructions can optionally
be masked; if a mask is provided, the result is effectively that
the value returned is the AND of the mask with the result of the
comparison.

This change adds an optimization pass to the C++ backend that looks
for vector ANDs where one operand is a comparison and rewrites
them--e.g. "and(equalfloat(a, b), c)" is changed to
"_equal_float_and_mask(a, b, c)", saving an instruction in the end.

Issue #319.

Merge commit '8ef6bc16364d4c08aa5972141748110160613087'

Conflicts:
	examples/intrinsics/knc.h
	examples/intrinsics/sse4.h
2012-07-10 10:33:24 -07:00
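
The rewrite described in bea88ab122 can be sketched on a toy expression tree. This is a simplified model under stated assumptions, not the C++ backend pass itself: expressions are nested tuples of the form (op, arg, ...), leaves are strings, and the `MASKED_FORMS` table and function name are hypothetical. Only the two operation names quoted in the commit message are used.

```python
# Hypothetical table: comparison op -> its masked variant.
MASKED_FORMS = {"equalfloat": "_equal_float_and_mask"}


def fold_and_into_compare(expr):
    """Rewrite and(cmp(a, b), mask) into cmp_and_mask(a, b, mask)."""
    if not isinstance(expr, tuple):
        return expr  # leaf value, nothing to rewrite
    op, *args = expr
    # Rewrite children first so nested patterns are handled bottom-up.
    args = [fold_and_into_compare(a) for a in args]
    if op == "and" and len(args) == 2:
        # The comparison may be either operand of the AND.
        for cmp_expr, mask in ((args[0], args[1]), (args[1], args[0])):
            if isinstance(cmp_expr, tuple) and cmp_expr[0] in MASKED_FORMS:
                _, a, b = cmp_expr
                return (MASKED_FORMS[cmp_expr[0]], a, b, mask)
    return (op, *args)


before = ("and", ("equalfloat", "a", "b"), "c")
after = fold_and_into_compare(before)
```

Here `after` is a single masked comparison node, which is where the saved instruction comes from.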
Matt Pharr
926b3b9ee3 Fix bugs with mask-handling for switch/do/for/while statements.
All of these pass the current mask to FunctionEmitContext::SetBlockEntryMask()
so that when a break/continue/return is encountered, it can test to see if all
lanes have followed that path and then return; this in turn ensures that we never
run statements with an all-off execution mask.

These functions were passing the function internal mask, not the full mask, and
thus could end up executing code with the mask all off if some lanes were
disabled by an outer function.  (The new tests test this case.)
2012-07-09 15:13:30 -07:00
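
The bug fixed in 926b3b9ee3 can be illustrated with a rough bitmask model. This is an assumption-laden sketch, not the logic in FunctionEmitContext: masks are 4-bit integers with one bit per lane, and the helper name and mask values are invented for the example.

```python
def all_active_lanes_left(block_entry_mask, exited_lanes):
    """True when every lane that was on at block entry has since exited."""
    return (block_entry_mask & ~exited_lanes) == 0


function_internal_mask = 0b1111   # all lanes on, as seen inside the function
caller_mask = 0b0011              # an outer function disabled lanes 2 and 3
full_mask = function_internal_mask & caller_mask

exited = 0b0011                   # lanes 0 and 1 hit a break

# Using the internal mask, lanes 2 and 3 look like they are still running,
# so the early-out test fails even though no lane is actually active.
buggy_check = all_active_lanes_left(function_internal_mask, exited)

# Using the full mask, the test sees that every live lane has left.
fixed_check = all_active_lanes_left(full_mask, exited)

# Lanes that would still execute after the break.
remaining = full_mask & ~exited
```

With the internal mask the check reports that lanes remain, so the following statements would run with `remaining` equal to zero, which is the all-off case the commit message describes.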
Matt Pharr
bc7775aef2 Fix __ordered and _unordered floating point functions for C++ target.
Fixes include adding "_float" and "_double" suffixes as appropriate as well
as providing a number of missing implementations.

This fixes a number of failures in the half* tests.
2012-07-09 14:35:51 -07:00
Matt Pharr
107669686c Fix naming of some comparison ops in knc.h 2012-07-09 12:43:15 -07:00
Matt Pharr
bb11b3ab66 Fix build with LLVM 3.0 2012-07-09 10:45:36 -07:00
Jean-Luc Duprat
516ba85abd Merge pull request #322 from mmp/vector-constants
Vector constants
2012-07-09 09:28:26 -07:00
Jean-Luc Duprat
098277b4f0 Merge pull request #321 from mmp/setzero
More varied support for constant vectors from C++ backend.
2012-07-09 08:57:05 -07:00
Matt Pharr
950a989744 Add test that was supposed to go with 080241b7d1 2012-07-09 08:21:15 -07:00
Matt Pharr
fb8b893b10 Fix incorrect LLVM_3_1svn tests.
1. For some time now, we have provided the version without the 'svn'
2. We should be testing "not LLVM 3.0" in these cases, since they
   apply to LLVM 3.2 and beyond as well...
2012-07-09 07:09:25 -07:00
Matt Pharr
9ca80debb8 Remove stale LLVM 2.9 support from builtins/util.m4 2012-07-09 06:54:29 -07:00