Both uniform and varying function pointers are supported; when a function
is called through a varying function pointer, each unique function pointer
value across the running program instances is called once for the set of
active program instances that want to call it.
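As a rough sketch of how this can be used (the function names here are just illustrative):

    float square(float x) { return x * x; }
    float twice(float x)  { return 2 * x; }

    float apply(float x) {
        // varying function pointer: each running program instance may end up
        // holding a different function pointer value here
        float (*fptr)(float);
        if ((programIndex & 1) == 0)
            fptr = square;
        else
            fptr = twice;
        // each unique pointer value across the gang is called once, with the
        // instances that hold that value active
        return fptr(x);
    }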
Be better about tracking the full extent of expressions in the parser;
this leads to more intelligible error messages, since we can indicate
exactly where the error happened.
Previously, it was only in the GatherScatterFlattenOpt optimization pass that
we added the per-lane offsets when we were indexing into varying data.
(Specifically, the case of float foo[]; int index; foo[index], where foo
is an array of varying elements rather than uniform elements.) Now, this
is done in the front-end as we're first emitting code.
In addition to the basic ugliness of doing this in an optimization pass,
it was also error-prone to do it there, since by that point we no longer
have access to all of the type information that's available in the front-end.
No functionality or performance change.
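For reference, a small sketch of the case in question (names are illustrative):

    float lookup(float x) {
        float foo[4];
        for (uniform int i = 0; i < 4; ++i)
            foo[i] = x + i;            // foo's elements are varying floats
        int index = programIndex & 3;  // varying index; differs across instances
        return foo[index];             // each lane needs its own offset into foo
    }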
The Expr::TypeConv() method has been replaced with both a
CanConvertTypes() routine that indicates whether one type
can be converted to another and a TypeConvertExpr()
routine that provides the same functionality as
Expr::TypeConv() used to.
This code previously lived in FunctionCallExpr but is now part
of FunctionSymbolExpr. This change doesn't alter any current
functionality, but lays groundwork for function pointers in
the language, where we'll want to do function call overload
resolution at other times besides when a function call is
actually being made.
Specifically, we had been using the full mask for all gathers, rather than
using the internal mask when we were loading from locally-declared arrays.
Thus, given code like:
uniform float x[programCount] = { ... };
float xx = x[programIndex];
Previously, when this code was in a function where the mask wasn't known to be
all on, we weren't generating a plain vector load to initialize xx, even though
we should have been. Now we do.
It's not clear that these are actually all that helpful.
This also works around issue #89, wherein code like "int8 = 0" would
give a warning about conversion from int32 to int8.
Generalize the overload resolution code to be based on estimating a
cost for various overload options and picking the one with the
minimal cost.
Add a step that considers type conversions that are guaranteed not to
lose information in function overload resolution.
Print better diagnostics when we can't find an unambiguous match.
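A plausible example of the kind of call this affects (hypothetical overloads):

    void foo(int x)    { }
    void foo(double x) { }

    void bar(float f) {
        // float -> double doesn't lose information, while float -> int does,
        // so the double overload is the lower-cost (and preferred) match
        foo(f);
    }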
We now maintain the distinction between the value of the mask passed into a
function and the "internal" mask within the function that only accounts for
varying control flow within the function.
The full mask (the AND of the function mask and the internal mask) must be used
for assignments to static and global variables, and to reference function parameters.
Further, it is the appropriate mask to use for making decisions about varying
control flow. However, we can use the internal mask for assignments to variables
declared in the current function (including the return value and non-reference
parameters to the function). Doing so allows us to catch a few more cases where
the internal mask is all on, even if the mask coming into the function wasn't all
on, and thence use moves rather than blends for those assignments. (Which in
turn can allow additional optimizations to happen.)
Fixes issue #23.
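A sketch of where the internal mask helps (illustrative code):

    float scaleAndBias(float x, float scale, float bias) {
        // 'result' is declared in this function and there is no varying control
        // flow here, so the internal mask is all on; this assignment can be a
        // plain move rather than a blend, even if the mask coming into the
        // function wasn't all on.
        float result = x * scale + bias;
        return result;
    }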
Within each function that launches tasks, we can now easily track which
tasks that function launched, so that the sync at the end of the function
can just sync on the tasks launched by that function (not on all tasks
launched by all functions).
Implementing this led to a rework of the task system API that ispc generates
code to call; the example task systems in examples/tasksys.cpp have been
updated to conform to this API. (The updated API is also documented in
the ispc user's guide.)
As part of this, "launch[n]" syntax was added to launch a number of tasks
in a single launch statement, rather than requiring a loop over 'n' to
launch n tasks.
This commit thus fixes issue #84 (enhancement to launch multiple tasks from
a single launch statement) as well as issue #105 (recursive task launches
were broken).
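A sketch of the new syntax (the function and variable names here are illustrative,
not from the examples):

    task void doubleRow(uniform float image[], uniform int width) {
        // taskIndex identifies which of the launched tasks this one is
        for (uniform int i = 0; i < width; ++i)
            image[taskIndex * width + i] *= 2;
    }

    export void doubleImage(uniform float image[], uniform int width,
                            uniform int height) {
        // launch 'height' tasks with a single launch statement
        launch[height] doubleRow(image, width);
        // the sync at the end of this function only waits for the tasks
        // launched by this function
    }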
The intent is that the code in stdlib.ispc that is calling out to the built-ins
should match argument types exactly (using explicit casts as needed), just
for maximal clarity/safety.
Go back to running both sides of 'if' statements with masking and without
branching if we can determine that the code is relatively simple (as per
the simple cost model), and is safe to run even if the mask is 'all off'.
This gives a bit of a performance improvement for some of the examples
(most notably, the ray tracer), and is the code that one wants generated
in this case anyhow.
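For instance, in a simple case like this (illustrative), both branches can be
run under the mask without any branching:

    float absValue(float x) {
        float r;
        // both of these simple assignments are safe to execute with masking
        // even if no lanes are active, so they can run without branches
        if (x < 0)
            r = -x;
        else
            r = x;
        return r;
    }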
This is currently only used to decide whether it's worth doing an
"are all lanes running" check at the start of functions--for small
functions, it's not worth the overhead.
The cost is estimated relatively early in compilation (e.g. before
we know if an array access is a scatter/gather or not, before
constant folding, etc.), so there are many known shortcomings.
Given the change in 0c20483853, this is no longer necessary, since
we know that one instance will always be running if we're executing a
given block of code.
Fixes issue #73. Previously, if we had e.g. an int16 value that was being shifted
left by 1, the constant integer 1 would come in as an int32, so we'd convert
the int16 to an int32 and then do the shift. Now, for shifts, the type
of the expression is always the same as the type of the value being shifted.
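A small illustration of the affected case:

    int16 doubleIt(int16 x) {
        // the constant 1 comes in as an int32, but the shift's type now follows
        // the value being shifted, so this stays an int16 shift
        return x << 1;
    }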
This commit adds support for swizzles like "foo.zy" (if "foo" is,
for example, a float<3> type) as rvalues. (Still need support for
swizzles as lvalues.)
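For example (illustrative):

    float<2> lastTwo(float<3> p) {
        // rvalue swizzle: result is { p.z, p.y }
        return p.zy;
    }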
This way, we match C/C++ in that casting a bool to an int gives either the value
zero or the value one. There is a new stdlib function int sign_extend(bool)
that does sign extension for cases where that's desired.
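A sketch of the intended usage (illustrative):

    int boolToInt(bool b) {
        return (int)b;         // gives 0 or 1, matching C/C++
    }

    int boolToMask(bool b) {
        return sign_extend(b); // gives 0 or -1 when all-bits-set is wanted
    }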