Commit Graph

250 Commits

Author SHA1 Message Date
Matt Pharr
f8f25a11b6 Added deferred shading workload 2011-09-30 19:42:14 -07:00
Matt Pharr
cb7976bbf6 Added updated task launch implementation that now tracks task groups.
Within each function that launches tasks, we now can easily track which
tasks that function launched, so that the sync at the end of the function
can just sync on the tasks launched by that function (not all tasks
launched by all functions.)

Implementing this led to a rework of the task system API that ispc generates
code to call; the example task systems in examples/tasksys.cpp have been
updated to conform to this API.  (The updated API is also documented in
the ispc user's guide.)

As part of this, "launch[n]" syntax was added to launch a number of tasks
in a single launch statement, rather than requiring a loop over 'n' to
launch n tasks.

This commit thus fixes issue #84 (enhancement to launch multiple tasks from
a single launch statement) as well as issue #105 (recursive task launches
were broken).
2011-09-30 11:20:53 -07:00
Matt Pharr
5ee4d7fce8 Add comment 2011-09-30 11:11:52 -07:00
Matt Pharr
8f3e46f67e Use InterlockedExchangeAdd on Windows 2011-09-29 16:19:59 -07:00
Matt Pharr
9ed07ff2b5 Fix __num_cores() definition on Windows to not cause unresolved symbols 2011-09-29 13:35:50 -07:00
Matt Pharr
32a0a30cf5 Only allow exact matches for function overload resolution for builtins.
The intent is that the code in stdlib.ispc that is calling out to the built-ins
  should match argument types exactly (using explicit casts as needed), just
  for maximal clarity/safety.
2011-09-28 17:20:31 -07:00
Matt Pharr
6d39d5fc3e Small cleanups.
Add __num_cores() to the list of symbols to remove from the module at the end.
Fix declarations of mask type for 64-bit atomics to silence warnings.
2011-09-28 16:26:35 -07:00
Matt Pharr
c999c8a237 Add num_cores() stdlib routine. Issue #102. 2011-09-28 16:16:58 -07:00
Matt Pharr
aad269fdf4 Added support for 'uniform' global atomics.
Issue #93.
2011-09-28 16:06:07 -07:00
Matt Pharr
d45c536c47 Fix Windows debug build of simple example 2011-09-28 14:11:32 -07:00
Matt Pharr
f1b8e5b1bf Release notes and doxygen bump for 1.0.9 release v1.0.9 2011-09-26 16:21:32 -07:00
Matt Pharr
e7a70b05af Fix statically-linked tests on Linux 2011-09-26 16:11:45 -07:00
Matt Pharr
cf73286938 More small Windows build fixes. Also switch to LLVM 3.0 libs 2011-09-26 16:07:23 -07:00
Matt Pharr
e6f80c0adc Remove stale include of MCJIT.h 2011-09-26 16:04:52 -07:00
Matt Pharr
5e31d7b6d0 Windows build: use LLVM_INSTALL_DIR to find clang.exe 2011-09-26 16:04:50 -07:00
Matt Pharr
649f2ad7b7 Update parser to make 'sync' a statement, not an expr. 2011-09-23 20:33:24 -07:00
Matt Pharr
fade1cdf1d Pretty much all conversions to varying double are slow, so don't bother warning about them. 2011-09-23 16:03:35 -07:00
Matt Pharr
d261105a86 Error/warning reporting improvements.
- Don't suggest matches when given an empty string or a single, non-alpha
  character.
- Also fixed the parser to be a bit less confusing when it encounters an
  unexpected EOF.
2011-09-23 15:51:23 -07:00
Matt Pharr
b3d3e8987b Provide a properly initialized TextDiagnosticPrinter to clang's preprocessor.
Fixes issue #100 (crash when the preprocessor was trying to emit a diagnostic
about a mismatched #if/#endif).
2011-09-23 15:50:18 -07:00
Matt Pharr
4e91f3777a Fix BinaryExpr to handle reference-typed operands.
Fixes issue #101.
2011-09-23 15:19:14 -07:00
Matt Pharr
5584240c7f Fix crash with function declarations with unnamed parameters.
Fixes issue #103.
Previously, we were inadvertently grabbing the function's return type
  for the parameter, rather than the actual parameter type.
2011-09-23 15:05:59 -07:00
Matt Pharr
7126a39092 Disable PIC on Windows 2011-09-19 15:32:43 -07:00
Matt Pharr
8ad28a3f6f update doxygen, release notes for 1.0.8 release v1.0.8 2011-09-19 15:22:25 -07:00
Matt Pharr
9921b8e530 Predicated 'if' statement performance improvements.
Go back to running both sides of 'if' statements with masking and without
branching if we can determine that the code is relatively simple (as per
the simple cost model), and is safe to run even if the mask is 'all off'.
This gives a bit of a performance improvement for some of the examples
(most notably, the ray tracer), and is the code that one wants generated
in this case anyhow.
2011-09-19 09:54:09 -07:00
Matt Pharr
9052d4b10b Linux build fixes 2011-09-17 13:42:46 -07:00
Matt Pharr
2405dae8e6 Use malloc() to get space for task arguments when compiling to AVX.
This is to work around the LLVM bug/limitation discused in LLVM bug
10841 (http://llvm.org/bugs/show_bug.cgi?id=10841).
2011-09-17 13:38:51 -07:00
Matt Pharr
3607f3e045 Remove support for building with LLVM 2.8. Fixes issue #66.
Both 2.9 and top-of-tree generate substantially better code than
LLVM 2.8 did, so it's not worth fixing the 2.8 build.
2011-09-17 13:18:59 -07:00
Matt Pharr
de84acfa5d On OSX with LLVM 2.9, always generate position-independent code.
Fixes Issue #99.
2011-09-17 13:03:51 -07:00
Matt Pharr
a501ab1aa6 Fix parenthesization bugs in cost estimates.
Also added the debugging print that helped find these issues.
Revert inlining some functions in examples
2011-09-16 19:07:07 -07:00
Matt Pharr
cdc850f98c Inline some functions in examples 2011-09-16 17:02:21 -07:00
Matt Pharr
ca87579f23 Add a very simple cost model to estimate runtime cost of running code.
This is currently only used to decide whether it's worth doing an
"are all lanes running" check at the start of functions--for small
functions, it's not worth the overhead.

The cost is estimated relatively early in compilation (e.g. before
we know if an array access is a scatter/gather or not, before
constant folding, etc.), so there are many known shortcomings.
2011-09-16 15:09:17 -07:00
Matt Pharr
38fc13d1ab Remove now unused function. 2011-09-16 14:21:13 -07:00
Matt Pharr
cf9d9f717e Logic simplification to 'mixed true/false' case for coherent ifs.
Use the approach from 173632f446 here as
well.
2011-09-16 14:10:55 -07:00
Matt Pharr
173632f446 Generate more efficient for regular varying 'if' statements.
For the case where we have a regular (i.e. non-'cif') 'if' statement,
the generated code just simply checks to see if any program instance
is running before running the corresponding statements.  This is a
lighter-weight check than IfStmt::emitMaskMixed() was performing.
2011-09-16 12:03:42 -07:00
Matt Pharr
1dedd88132 Improve implementaton of 'are both masks equal' check for AVX.
Previously, we did a vector equal compare and then a movmsk, the
result of which we checked to see if it was on for all lanes.
Because masks are vectors of i32s, under AVX, the vector equal
compare required two 4-wide SSE compares and some shuffling.
Now, we do a movmsk of both masks first and then a scalar
equality comparison of those two values, which seems to generate
overall better code.
2011-09-15 06:25:02 -07:00
Matt Pharr
0848c2cc19 Actually make all 'if' statements check for 'all off' mask.
Contrary to claims in 0c2048385, that checkin didn't include the changes
to not run if/else blocks if none of the program instances wanted to be
running them.  This checkin fixes that and thus actually fixes issue #74.
2011-09-13 19:48:04 -07:00
Matt Pharr
e2a88d491f Mark the internal __fast_masked_vload function as static 2011-09-13 15:43:48 -07:00
Matt Pharr
30f9dcd4f5 Unroll loops by default, add --opt=disable-loop-unroll to disable.
Issue #78.
2011-09-13 15:37:18 -07:00
Matt Pharr
0c344b6755 Fix Linux build of mandelbrot_tasks example 2011-09-13 15:17:30 -07:00
Matt Pharr
6734021520 Issue warning when compile-time constant out-of-bounds array index is used.
Issue #98.
Also fixes two examples that had bugs of this type that this warning
  uncovered!
2011-09-13 14:42:20 -07:00
Matt Pharr
dd153d3c5c Handle more instruction types when flattening offset vectors.
Generalize the lScalarizeVector() utility routine (used in determining
when we can change gathers/scatters into vector loads/stores, respectively)
to handle vector shuffles and vector loads.  This fixes issue #79, which
provided a case where a gather was being performed even though a vector
load was possible.
2011-09-13 09:43:56 -07:00
Matt Pharr
9ca7541d52 Remove check for any program instances running before function calls.
Given the change in 0c20483853, this is no longer necessary, since
we know that one instance will always be running if we're executing a
given block of code.
2011-09-13 06:26:16 -07:00
Matt Pharr
0c20483853 Make all "if" statements "coherent" ifs. Workaround for issue #74.
Using blend to do masked stores is unsafe if all of the lanes are off:
it may read from or write to invalid memory.  For now, this workaround
transforms all 'if' statements into coherent 'if's, ensuring that an
instruction only runs if at least on program instance wants to be running
it.

One nice thing about this change is that a number of implementations of
various builtins can be simplified, since they no longer need to confirm
that at least one program instance is running.

It might be nice to re-enable regular if statements in a future checkin,
but we'd want to make sure they don't have any masked loads or blended
masked stores in their statement lists.  There isn't a performance
impact for any of the examples with this change, so it's unclear if
this is important.

Note that this only impacts 'if' statements with a varying condition.
2011-09-12 16:25:08 -07:00
Matt Pharr
9d4ff1bc06 Fix alignment in usage message 2011-09-12 15:06:41 -07:00
Matt Pharr
83f22f1939 Add experimental --fast-masked-vload flag for SSE. 2011-09-12 12:29:33 -07:00
Matt Pharr
6375ed9224 AVX: Fix bug with misdeclaration of blend intrinsic.
This was preventing the "convert an all-on blend to one of the
  operand values" optimization from kicking on in AVX.
2011-09-12 06:42:38 -07:00
Matt Pharr
cf23cf9ef4 Fix typo in user guide. Issue #96 2011-09-12 05:24:32 -07:00
Matt Pharr
1147b53dcd Add #define with target vector width in emitted headers 2011-09-09 09:33:56 -07:00
Matt Pharr
4cf831a651 When --fast-math is enabled, tell LLVM about it, too. 2011-09-09 09:32:59 -07:00
Matt Pharr
785d8a29d3 Run mem2reg pass even when doing -O0 compiles 2011-09-09 09:24:43 -07:00