Added AST and Function classes.
Now, we parse the whole file and build up the AST for all of the
functions in the Module before we emit IR for the functions (vs. before,
when we generated IR along the way as we parsed the source file.)
This fixes an issue with undefined SVML symbols with code that called
transcendental functions in the stdandard library, even when the SVML
math library hadn't been selected.
Makefile and vcxproj file updates.
Also modified vcxproj files so that the various files ispc generates go into $(TargetDir),
not the current directory.
Modified the ray tracer example to not have uniform short-vector types in its app-visible
datatypes (these are laid out differently on SSE vs AVX); there was an existing lurking
bug in the way this was done before.
If a flag along the lines of "--target=sse4,avx-x2" is provided on the command-line,
then the program will be compiled for each of the given targets, with a separate
output file generated for each one. Further, an output file with dispatch functions
that check the current system's CPU and then chooses the best available variant
is also created.
Issue #11.
Specifically, launch all of the tasks in one statement, rather than
still looping over spans in y and launching a collection of tasks
across x for each span. This seems to give a few percent better
performance.
Don't print more than 3 lines of source file context with errors.
(Any more than that is almost certainly not the Right Thing to do.)
Make some parsing error messages more clear.
Within each function that launches tasks, we now can easily track which
tasks that function launched, so that the sync at the end of the function
can just sync on the tasks launched by that function (not all tasks
launched by all functions.)
Implementing this led to a rework of the task system API that ispc generates
code to call; the example task systems in examples/tasksys.cpp have been
updated to conform to this API. (The updated API is also documented in
the ispc user's guide.)
As part of this, "launch[n]" syntax was added to launch a number of tasks
in a single launch statement, rather than requiring a loop over 'n' to
launch n tasks.
This commit thus fixes issue #84 (enhancement to launch multiple tasks from
a single launch statement) as well as issue #105 (recursive task launches
were broken).
The intent is that the code in stdlib.ispc that is calling out to the built-ins
should match argument types exactly (using explicit casts as needed), just
for maximal clarity/safety.
- Don't suggest matches when given an empty string or a single, non-alpha
character.
- Also fixed the parser to be a bit less confusing when it encounters an
unexpected EOF.
Go back to running both sides of 'if' statements with masking and without
branching if we can determine that the code is relatively simple (as per
the simple cost model), and is safe to run even if the mask is 'all off'.
This gives a bit of a performance improvement for some of the examples
(most notably, the ray tracer), and is the code that one wants generated
in this case anyhow.