On a target with a 16-bit mask, for example, we now choose int16 as the type
of an integer literal like "1024". Previously, we used int32, which is a worse
fit and leads to less efficient code than int16 on a 16-bit mask target.
(However, an integer literal like 1000000 still gets the type int32, even on a
16-bit target, since it doesn't fit in 16 bits.)
Updated the tests so that they still pass on 8- and 16-bit targets, given
this change.
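A sketch of the intended effect (hypothetical code; exact codegen depends on
the target):

    void scale(uniform int16 vals[], uniform int count) {
        foreach (i = 0 ... count) {
            // The literal 1024 is now typed int16 on a 16-bit-mask
            // target, so the arithmetic can stay in 16-bit vector math
            // instead of being widened to int32 and narrowed back.
            vals[i] = vals[i] + 1024;
        }
    }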
For varying int8/16/32 types, division by small constants can be implemented
efficiently through multiplies and shifts using integer types of twice the
bit width; this commit adds that optimization. (The implementation is based
on Halide's.)
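For illustration, here is a sketch of the kind of rewrite this enables,
written as source code for an unsigned 8-bit divide by 3 (the optimization
itself operates on the IR; the magic constant is ceil(2^9/3) = 171):

    unsigned int8 div3(unsigned int8 x) {
        // 16-bit multiply plus shift instead of a hardware divide:
        unsigned int16 wide = (unsigned int16)x * 171;  // 171 = ceil(512/3)
        return (unsigned int8)(wide >> 9);  // floor(x*171/512) == x/3
                                            // for all x in [0, 255]
    }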
All of these now pass the full current mask to FunctionEmitContext::SetBlockEntryMask(),
so that when a break/continue/return is encountered, the generated code can test
whether all lanes have followed that path and, if so, return early; this in turn
ensures that we never run statements with an all-off execution mask.
These functions were previously passing the function-internal mask, not the full
mask, and thus could end up executing code with the mask all off if some lanes
had been disabled by an outer function. (The new tests exercise this case.)
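A hypothetical example of the case the new tests cover:

    int f(int x) {
        if (x > 0)
            return 1;  // if every *active* lane returns here, nothing
                       // below may run with an all-off execution mask
        return 0;
    }

    void g(uniform int n) {
        int r = 0;
        if (programIndex < n)    // some lanes disabled at the call site
            r = f(programIndex); // f() must consider the full mask,
                                 // not just its internal mask
    }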
In particular, this gives us the desired behavior for NaNs: comparisons are
unordered, so any compare involving a NaN evaluates to true. This in turn
allows writing the canonical isnan() function as "v != v".
Added isnan() to the standard library as well.
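With unordered compares, the stdlib implementation can be as simple as this
sketch (the actual library also covers uniform and double variants):

    static inline bool isnan(float v) {
        return v != v;  // true exactly when v is a NaN
    }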
Previously, we were trying to take a uniform seed and then shuffle that
around to initialize the state for each of the program instances. This
was becoming increasingly untenable and brittle.
Now a varying seed is expected and used.
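Typical usage now looks something like the following (a sketch; baseSeed is a
hypothetical parameter):

    float noise(uniform unsigned int baseSeed) {
        RNGState state;
        // Each program instance supplies its own (varying) seed:
        seed_rng(&state, baseSeed + programIndex);
        return frandom(&state);
    }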
These tests all fail with generic-16/c++ output currently; however, the output
indicates that the failures are just small floating-point differences. (Though
the question remains why those differences appear at all.)
Now a declaration like 'struct Foo;' can be used to establish the
name of a struct type, without providing a definition. One can
pass pointers to such types around the system, but can't do much
else with them (as in C/C++).
Issue #125.
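For example (sketch):

    struct Foo;                       // declares the name, no definition
    uniform Foo * uniform makeFoo();  // ok: pointers to Foo can be
    void useFoo(uniform Foo * uniform f);  // declared and passed around
    // uniform Foo f;                 // error: incomplete type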
The decl.* code no longer interacts with Symbols, but just returns names,
types, initializer expressions, etc., as needed. This makes the code a bit
more understandable.
Fixes issues #171 and #130.
Once we're down to something that's not another nested expression list, use
TypeConvertExpr() to convert the expression to the type we need. This should
allow simplifying a number of the GetConstant() implementations, removing the
partial reimplementations of type conversion there.
For now, this change finishes off issue #220.
Previously, the compiler would crash if, for example, the program passed a
temporary value to a function taking a const reference. This change fixes
ReferenceExpr::GetValue() to handle this case, allocating storage for the
temporary so that a pointer to that storage can be used as the reference
value.
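For example, code along these lines no longer crashes the compiler (sketch):

    uniform float twice(const uniform float &v) { return 2 * v; }

    uniform float test(uniform float x) {
        // "x + 1" is a temporary; the compiler now allocates storage
        // for it so a pointer to that storage can serve as the
        // reference value.
        return twice(x + 1);
    }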
Closer compatibility with C++: given an argument of non-reference type, treat
a match to a non-const reference of that type as better than a match to a
const reference of that type (rather than treating both as equal cost).
Issue #224.
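For example (sketch):

    void set(uniform int &v);        // (a) non-const reference
    void set(const uniform int &v);  // (b) const reference

    void caller() {
        uniform int x = 0;
        set(x);  // x is a non-reference lvalue, so (a) is now a better
                 // match than (b), as in C++
    }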
Implicit conversion of function types (to function pointer types) is now a
standard part of the type-conversion infrastructure, rather than being handled
by special cases like FunctionSymbolExpr immediately returning a pointer type,
etc.
Improved AddressOfExpr::TypeCheck() to actually issue errors in cases
where it's illegal to take the address of an expression.
Added AddressOfExpr::GetConstant() implementation that handles taking
the address of functions.
Issue #223.
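For example, taking the address of a function in a global initializer now
works (sketch):

    uniform float square(uniform float x) { return x * x; }

    // The initializer below needs a compile-time constant, which the
    // new AddressOfExpr::GetConstant() provides:
    uniform float (* uniform fptr)(uniform float) = &square;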
We were incorrectly trying to type convert the varying offset to a
uniform value, which in turn led to an incorrect compile-time error.
Fixes issue #201.
In particular, we (1) weren't setting the function mask to 'all on', so any
mixed function mask would incorrectly continue to apply inside the foreach
loop, and (2) weren't always setting the internal mask to 'all on' before
applying the additional masking based on the iteration variables.
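A sketch of the situation being fixed:

    void fill(uniform float a[], uniform int count) {
        // Even if fill() is called with a mixed execution mask, the
        // foreach must run with the function mask forced to 'all on'
        // (fix 1), and each iteration must reset the internal mask to
        // 'all on' before applying the masking derived from i in any
        // partial final iteration (fix 2).
        foreach (i = 0 ... count) {
            a[i] = i;
        }
    }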
In particular, LLVMVectorIsLinear() and LLVMVectorValuesAllEqual() are able
to reason a bit about the effects of the shifts and ANDs that are generated
from SOA indexing calculations, so that they can detect more cases where a
linear sequence of locations is in fact being accessed in the presence of
SOA data.
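For example, indexing into SOA data as below generates shifts and ANDs on the
index vector that the improved routines can now see through (a sketch,
assuming soa<4> layout):

    struct Point { float x, y, z; };

    float getX(soa<4> Point pts[], int index) {
        // pts[index].x lowers to address math built from (index >> 2)
        // and (index & 3); the analysis can now prove that consecutive
        // index values access a linear sequence of locations.
        return pts[index].x;
    }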
Now, the pointed-to type is always uniform by default (if an explicit
rate qualifier isn't provided). This rule is easier to remember and
seems to work well in more cases than the previous rule from 6d7ff7eba2.
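For example:

    float * p;          // pointer to *uniform* float (the new default)
    uniform float * q;  // equivalent, with the pointee spelled explicitly
    varying float * r;  // explicit qualifier overrides the default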
Now, if a struct member has an explicit 'uniform' or 'varying' qualifier,
then that member has that variability, regardless of the struct's own
variability. Members without 'uniform' or 'varying' have unbound variability
and in turn inherit the variability of the struct.
As a result of this, now structs can properly be 'varying' by default,
just like all the other types, while still having sensible semantics.
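For example (sketch):

    struct S {
        uniform int a;  // always uniform, regardless of S's variability
        varying int b;  // always varying
        int c;          // unbound: inherits the variability of S
    };

    uniform S us;  // us.c is uniform
    varying S vs;  // vs.c is varying; vs.a is still uniform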