On a target with a 16-bit mask (for example), we would choose the type
of an integer literal "1024" to be an int16. Previously, we used an int32,
which is a worse fit and leads to less efficient code than an int16
on a 16-bit mask target. (However, we'd still give an integer literal
1000000 the type int32, even in a 16-bit target.)
Updated the tests to still pass with 8 and 16-bit targets, given this
change.
Add much more suppport for doubles and in64 types in the standard library, basically supporting everything for them that are supported for floats and int32s. (The notable exceptions being the approximate rcp() and rsqrt() functions, which don't really have sensible analogs for doubles (or at least not built-in instructions).)