Commit Graph

282 Commits

Author SHA1 Message Date
Matt Pharr
a89e26d725 Improvements to mask management code; removes a number of unnecessary blends.
We now maintain the distinction between the value of the mask passed into a
function and the "internal" mask within the function that only accounts for
varying control flow within the function.

The full mask (the AND of the function mask and the internal mask) must be used
for assignments to static and global variables and to reference function parameters.
Further, it is the appropriate mask to use for making decisions about varying
control flow.  However, we can use the internal mask for assignments to variables
declared in the current function (including the return value and non-reference
parameters to the function).  Doing so allows us to catch a few more cases where
the internal mask is all on, even if the mask coming into the function wasn't all
on, and thence use moves rather than blends for those assignments.  (Which in
turn can allow additional optimizations to happen.)

Fixes issue #23.
2011-10-10 11:47:19 -07:00
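The difference between the full mask and the internal mask described in the commit above can be modelled in a few lines. This is only an illustrative sketch, not the compiler's code: lanes are modelled as Python lists, and `blend`, `assign_local`, and `assign_global` are hypothetical names chosen for this example.

```python
# Illustrative model only. A "varying" value is a list with one entry per
# lane; a mask is a list of booleans. None of these names come from ispc.

def all_on(mask):
    """True when every lane of the mask is enabled."""
    return all(mask)

def blend(old, new, mask):
    """Masked assignment: keep the old value in lanes where the mask is off."""
    return [n if m else o for o, n, m in zip(old, new, mask)]

def assign_local(old, new, internal_mask):
    """Assignment to a variable declared in the current function.

    Only the internal mask is consulted, so a plain move is enough whenever
    it is all on, even if the mask passed into the function had lanes off.
    """
    if all_on(internal_mask):
        return list(new), "move"
    return blend(old, new, internal_mask), "blend"

def assign_global(old, new, function_mask, internal_mask):
    """Assignment to a static or global variable: the full mask is required."""
    full_mask = [f and i for f, i in zip(function_mask, internal_mask)]
    if all_on(full_mask):
        return list(new), "move"
    return blend(old, new, full_mask), "blend"

function_mask = [True, False, True, False]  # caller entered with lanes 1, 3 off
internal_mask = [True, True, True, True]    # no varying control flow so far

local_value, local_op = assign_local([0, 0, 0, 0], [1, 2, 3, 4], internal_mask)
global_value, global_op = assign_global([0, 0, 0, 0], [1, 2, 3, 4],
                                        function_mask, internal_mask)
```

In this model the local assignment becomes a move, while the global assignment still needs a blend because lanes 1 and 3 were off on entry.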
Matt Pharr
3cb0115dce Add routines to standard library to do efficient AOS/SOA conversions.
Currently, we just support 3 and 4-wide variants (i.e. xyzxyz.. and xyzwxyzw..),
for int32 and float types.
2011-10-10 10:56:06 -07:00
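For readers unfamiliar with the layouts named above, the following sketch shows what a 3-wide AOS/SOA conversion does to the data. It says nothing about how the standard library routines are implemented or what they are called; `aos_to_soa3` and `soa_to_aos3` are hypothetical names for this example.

```python
def aos_to_soa3(aos):
    """Split an xyzxyz... sequence into separate x, y, and z sequences."""
    if len(aos) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return aos[0::3], aos[1::3], aos[2::3]

def soa_to_aos3(xs, ys, zs):
    """Interleave separate x, y, and z sequences back into xyzxyz... order."""
    if not (len(xs) == len(ys) == len(zs)):
        raise ValueError("component sequences must have equal length")
    aos = []
    for x, y, z in zip(xs, ys, zs):
        aos.extend((x, y, z))
    return aos

points = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # two points: (1,2,3) and (4,5,6)
xs, ys, zs = aos_to_soa3(points)
round_trip = soa_to_aos3(xs, ys, zs)
```

The 4-wide (xyzwxyzw...) case follows the same pattern with a stride of 4.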
Matt Pharr
f5391747b9 Remove suggestions from parser that we support "char" 2011-10-10 10:54:06 -07:00
Matt Pharr
b8768ffdfa Merge branch 'master' of github.com:ispc/ispc 2011-10-07 20:39:44 -07:00
Matt Pharr
6009608bc6 Mark inlined functions as having static linkage 2011-10-07 16:06:14 -07:00
Matt Pharr
ce7355f9ed Windows: fix examples build to look for ispc.exe in ../.. as well 2011-10-09 07:40:18 -07:00
Matt Pharr
6b4459d402 Windows: fix some compiler warnings during build 2011-10-09 07:40:17 -07:00
Matt Pharr
790dba2558 Doxygen bump and release notes for v1.0.11 v1.0.11 2011-10-07 09:57:55 -07:00
Matt Pharr
4a2cbf2c4e Fix regression from AST checkin that caused perf. warnings to be issued for stdlib code. 2011-10-07 09:20:48 -07:00
Matt Pharr
53dd65fa2e Add ispc_test to buildall.bat script 2011-10-08 17:17:05 -07:00
Matt Pharr
f5afa52fd9 Add missing header 2011-10-06 17:10:30 -07:00
Matt Pharr
f9c67ff806 Explicit representation of ASTs for all the functions in a compile unit.
Added AST and Function classes.
Now, we parse the whole file and build up the AST for all of the
  functions in the Module before we emit IR for the functions (vs. before,
  when we generated IR along the way as we parsed the source file.)
2011-10-06 15:35:27 -07:00
Matt Pharr
ec5e627e56 Mark internal stdlib functions as "internal" linkage, not "private".
This fixes print() statements on OSX.
(http://llvm.org/bugs/show_bug.cgi?id=11080)
2011-10-06 13:32:20 -07:00
Matt Pharr
ff2a43ac19 Run the CFG simplification pass even when optimization is disabled.
This fixes an issue with undefined SVML symbols with code that called
transcendental functions in the standard library, even when the SVML
math library hadn't been selected.
2011-10-06 09:20:50 -07:00
Matt Pharr
9feea32471 Fix errors in documentation for some of the reduce_* stdlib functions 2011-10-06 07:52:10 -07:00
Matt Pharr
bedaec2295 Update examples for multi-target compilation.
Makefile and vcxproj file updates.
Also modified vcxproj files so that the various files ispc generates go into $(TargetDir),
  not the current directory.
Modified the ray tracer example to not have uniform short-vector types in its app-visible
  datatypes (these are laid out differently on SSE vs AVX); there was an existing lurking
  bug in the way this was done before.
2011-10-04 16:01:56 -07:00
Matt Pharr
a68d137df6 Documentation update for multi-target compilation. 2011-10-04 16:01:56 -07:00
Matt Pharr
59caa3d4e1 Various small Windows fixes.
Also fixed some tabs/spaces and compiler warning issues.
2011-10-04 16:01:56 -07:00
Matt Pharr
06975bc7ab Add support for compiling to multiple targets.
If a flag along the lines of "--target=sse4,avx-x2" is provided on the command-line,
then the program will be compiled for each of the given targets, with a separate
output file generated for each one.  Further, an output file with dispatch functions
that check the current system's CPU and then choose the best available variant
is also created.

Issue #11.
2011-10-04 16:01:55 -07:00
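The dispatch idea in the commit above can be sketched roughly as follows. This is an assumption-laden model, not the generated dispatch code: the preference order, the `required_isa` table, and `make_dispatcher` are all hypothetical, and only the target names `sse4` and `avx-x2` come from the commit message.

```python
def make_dispatcher(variants, preference, required_isa):
    """Build a function that runs the best compiled variant the CPU supports.

    variants:     target name -> callable compiled for that target
    preference:   target names, most preferred first (assumed ordering)
    required_isa: target name -> instruction set the target needs (assumed)
    """
    def dispatch(cpu_isas, *args):
        for target in preference:
            if target in variants and required_isa[target] in cpu_isas:
                return variants[target](*args)
        raise RuntimeError("no compiled variant runs on this CPU")
    return dispatch

# Stand-ins for the per-target output files; each reports which one ran.
variants = {
    "sse4": lambda x: ("sse4", x * 2),
    "avx-x2": lambda x: ("avx-x2", x * 2),
}
dispatch = make_dispatcher(
    variants,
    preference=["avx-x2", "sse4"],
    required_isa={"sse4": "sse4", "avx-x2": "avx"},
)

on_avx_machine = dispatch({"sse4", "avx"}, 21)
on_sse_machine = dispatch({"sse4"}, 21)
```

Both calls compute the same result; only the variant chosen differs with the capabilities reported for the machine.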
Matt Pharr
880cbb18cc Remove checks to see if system's processor matches the target the code was compiled for.
(Preparation for multi-target output.)
2011-10-04 16:01:55 -07:00
Matt Pharr
686d9975b6 Add Symbol::exportedFunction member to hold llvm::Function * for app-callable version of function. 2011-10-04 15:56:54 -07:00
Matt Pharr
9b7f55a28e Add buildall.bat script for Windows. Also various example build fixes for Windows 2011-10-04 11:42:04 -07:00
Matt Pharr
e4d224a0f1 Use __cilk to detect Cilk support 2011-10-04 11:16:42 -07:00
Matt Pharr
0933a77c1b Improve task decomposition in ray tracing example.
Specifically, launch all of the tasks in one statement, rather than
still looping over spans in y and launching a collection of tasks
across x for each span.  This seems to give a few percent better
performance.
2011-10-04 09:33:59 -07:00
Matt Pharr
5f78edf07a Fix bug with screen decomposition in volume rendering example 2011-10-04 09:30:02 -07:00
Matt Pharr
a6fc657b40 Remove 'externGlobals' member from Module; instead find them when needed via new SymbolTable::GetMatchingVariables method. 2011-10-04 06:36:31 -07:00
Matt Pharr
fa5050d5c7 Error reporting improvements.
Don't print more than 3 lines of source file context with errors.
  (Any more than that is almost certainly not the Right Thing to do.)
Make some parsing error messages clearer.
2011-10-03 21:09:04 -07:00
Matt Pharr
d5a48d9a1e Fix incorrect LLVM_3_0svn #ifdefs 2011-10-03 08:29:19 -07:00
Matt Pharr
2df9da2524 Be careful to not inadvertently match NULL functions in optimization passes. 2011-10-01 08:34:11 -07:00
Matt Pharr
0b02f94988 Task system performance tweaks.
Switch back to GCD on OSX.
Increase TaskInfo allocation count.
This fixes the regression with the deferred shading workload on AVX
  (from 17x back to 25x with 4 cores.)
2011-10-01 08:04:09 -07:00
Matt Pharr
65c50b60fc Cleanups to deferred shading workload 2011-09-30 20:35:42 -07:00
Matt Pharr
9de34eb22c Release notes and doxygen bump for v1.0.10 2011-09-30 19:42:14 -07:00
Matt Pharr
f8f25a11b6 Added deferred shading workload 2011-09-30 19:42:14 -07:00
Matt Pharr
cb7976bbf6 Added updated task launch implementation that now tracks task groups.
Within each function that launches tasks, we now can easily track which
tasks that function launched, so that the sync at the end of the function
can just sync on the tasks launched by that function (not all tasks
launched by all functions.)

Implementing this led to a rework of the task system API that ispc generates
code to call; the example task systems in examples/tasksys.cpp have been
updated to conform to this API.  (The updated API is also documented in
the ispc user's guide.)

As part of this, "launch[n]" syntax was added to launch a number of tasks
in a single launch statement, rather than requiring a loop over 'n' to
launch n tasks.

This commit thus fixes issue #84 (enhancement to launch multiple tasks from
a single launch statement) as well as issue #105 (recursive task launches
were broken).
2011-09-30 11:20:53 -07:00
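A rough model of the task-group idea described above: each launching function owns a group, a single call starts n tasks, and the sync waits only on that group's tasks. This sketch uses Python's `concurrent.futures` and is not the task system API that ispc generates calls to; `TaskGroup`, `launch`, `sync`, and `scale_rows` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, wait

class TaskGroup:
    """Tracks only the tasks launched through this group."""

    def __init__(self, executor):
        self._executor = executor
        self._futures = []

    def launch(self, count, fn, *args):
        """Rough analogue of launch[count]: start count tasks in one call.

        Each task receives its index and the total task count.
        """
        for index in range(count):
            self._futures.append(
                self._executor.submit(fn, index, count, *args))

    def sync(self):
        """Wait for this group's tasks only, then forget them."""
        wait(self._futures)
        results = [future.result() for future in self._futures]
        self._futures = []
        return results

def scale_rows(executor, rows, factor):
    """Launch one task per row and sync before returning."""
    out = [None] * len(rows)

    def task(index, count, destination):
        destination[index] = [value * factor for value in rows[index]]

    group = TaskGroup(executor)
    group.launch(len(rows), task, out)
    group.sync()  # waits on the tasks launched here, not on all tasks
    return out

with ThreadPoolExecutor(max_workers=4) as pool:
    scaled = scale_rows(pool, [[1, 2], [3, 4], [5, 6]], 10)
```

Because the group is local to `scale_rows`, other functions sharing the same executor would not be waited on by its `sync`.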
Matt Pharr
5ee4d7fce8 Add comment 2011-09-30 11:11:52 -07:00
Matt Pharr
8f3e46f67e Use InterlockedExchangeAdd on Windows 2011-09-29 16:19:59 -07:00
Matt Pharr
9ed07ff2b5 Fix __num_cores() definition on Windows to not cause unresolved symbols 2011-09-29 13:35:50 -07:00
Matt Pharr
32a0a30cf5 Only allow exact matches for function overload resolution for builtins.
The intent is that the code in stdlib.ispc that is calling out to the built-ins
  should match argument types exactly (using explicit casts as needed), just
  for maximal clarity/safety.
2011-09-28 17:20:31 -07:00
Matt Pharr
6d39d5fc3e Small cleanups.
Add __num_cores() to the list of symbols to remove from the module at the end.
Fix declarations of mask type for 64-bit atomics to silence warnings.
2011-09-28 16:26:35 -07:00
Matt Pharr
c999c8a237 Add num_cores() stdlib routine. Issue #102. 2011-09-28 16:16:58 -07:00
Matt Pharr
aad269fdf4 Added support for 'uniform' global atomics.
Issue #93.
2011-09-28 16:06:07 -07:00
Matt Pharr
d45c536c47 Fix Windows debug build of simple example 2011-09-28 14:11:32 -07:00
Matt Pharr
f1b8e5b1bf Release notes and doxygen bump for 1.0.9 release v1.0.9 2011-09-26 16:21:32 -07:00
Matt Pharr
e7a70b05af Fix statically-linked tests on Linux 2011-09-26 16:11:45 -07:00
Matt Pharr
cf73286938 More small Windows build fixes. Also switch to LLVM 3.0 libs 2011-09-26 16:07:23 -07:00
Matt Pharr
e6f80c0adc Remove stale include of MCJIT.h 2011-09-26 16:04:52 -07:00
Matt Pharr
5e31d7b6d0 Windows build: use LLVM_INSTALL_DIR to find clang.exe 2011-09-26 16:04:50 -07:00
Matt Pharr
649f2ad7b7 Update parser to make 'sync' a statement, not an expr. 2011-09-23 20:33:24 -07:00
Matt Pharr
fade1cdf1d Pretty much all conversions to varying double are slow, so don't bother warning about them. 2011-09-23 16:03:35 -07:00
Matt Pharr
d261105a86 Error/warning reporting improvements.
- Don't suggest matches when given an empty string or a single, non-alpha
  character.
- Also fixed the parser to be a bit less confusing when it encounters an
  unexpected EOF.
2011-09-23 15:51:23 -07:00