Add non-short-circuiting and(), or(), select() to stdlib.
200
docs/ispc.rst
@@ -121,10 +121,14 @@ Contents:
|
|||||||
|
|
||||||
* `The ISPC Standard Library`_
|
* `The ISPC Standard Library`_
|
||||||
|
|
||||||
|
+ `Basic Operations On Data`_
|
||||||
|
|
||||||
|
* `Logical and Selection Operations`_
|
||||||
|
* `Bit Operations`_
|
||||||
|
|
||||||
+ `Math Functions`_
|
+ `Math Functions`_
|
||||||
|
|
||||||
* `Basic Math Functions`_
|
* `Basic Math Functions`_
|
||||||
* `Bit-Level Operations`_
|
|
||||||
* `Transcendental Functions`_
|
* `Transcendental Functions`_
|
||||||
* `Pseudo-Random Numbers`_
|
* `Pseudo-Random Numbers`_
|
||||||
|
|
||||||
@@ -2150,6 +2154,12 @@ greater than or equal to ``NUM_ITEMS``.
|
|||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
|
|
||||||
|
Short-circuiting may impose some overhead in the generated code; for cases
|
||||||
|
where short-circuiting is undesirable due to performance impact, see
|
||||||
|
the section `Logical and Selection Operations`_, which introduces helper
|
||||||
|
functions in the standard library that provide these operations without
|
||||||
|
short-circuiting.
|
||||||
|
|
||||||
|
|
||||||
Dynamic Memory Allocation
|
Dynamic Memory Allocation
|
||||||
-------------------------
|
-------------------------
|
||||||
@@ -2827,6 +2837,123 @@ The ISPC Standard Library
|
|||||||
compiling ``ispc`` programs. (To disable the standard library, pass the
|
compiling ``ispc`` programs. (To disable the standard library, pass the
|
||||||
``--nostdlib`` command-line flag to the compiler.)
|
``--nostdlib`` command-line flag to the compiler.)
|
||||||
|
|
||||||
|
Basic Operations On Data
|
||||||
|
------------------------
|
||||||
|
|
||||||
|
Logical and Selection Operations
|
||||||
|
--------------------------------
|
||||||
|
|
||||||
|
Recall from `Expressions`_ that ``ispc`` short-circuits the evaluation of
|
||||||
|
logical and selection operators: given an expression like ``(index < count
|
||||||
|
&& array[index] == 0)``, then ``array[index] == 0`` is only evaluated if
|
||||||
|
``index < count`` is true. This property is useful for writing expressions
|
||||||
|
like the preceding one, where the second expression may not be safe to
|
||||||
|
evaluate in some cases.
|
||||||
|
|
||||||
|
This short-circuiting can impose overhead in the generated code; additional
|
||||||
|
operations are required to test the first value and to conditionally jump
|
||||||
|
over the code that evaluates the second value. The ``ispc`` compiler does
|
||||||
|
try to mitigate this cost by detecting cases where it is both safe and
|
||||||
|
inexpensive to evaluate both expressions, and skips short-circuiting in the
|
||||||
|
generated code in this case (without there being any programmer-visible
|
||||||
|
change in program behavior).
|
||||||
|
|
||||||
|
For cases where the compiler can't detect this but the programmer
|
||||||
|
wants to avoid short-circuiting behavior, the standard library provides a
|
||||||
|
few helper functions. First, ``and()`` and ``or()`` provide
|
||||||
|
non-short-circuiting logical AND and OR operations.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
bool and(bool a, bool b)
|
||||||
|
bool or(bool a, bool b)
|
||||||
|
uniform bool and(uniform bool a, uniform bool b)
|
||||||
|
uniform bool or(uniform bool a, uniform bool b)
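The difference between the built-in operators and these helpers can be sketched in plain C (not ``ispc``). The names ``rhs``, ``rhs_evaluations``, ``and_eager`` and ``or_eager`` are hypothetical and exist only for this illustration; the point is that a function call evaluates both arguments before the body runs, so no conditional jump around the second operand is needed.

```c
#include <stdbool.h>

/* Hypothetical counter: records how often the right-hand operand is evaluated. */
static int rhs_evaluations = 0;

/* Hypothetical operand with a visible side effect. */
static bool rhs(bool value) {
    rhs_evaluations++;
    return value;
}

/* Eager AND: both arguments are already evaluated when the body runs. */
static bool and_eager(bool a, bool b) {
    return a & b;
}

/* Eager OR: same idea, using bitwise OR on the two boolean values. */
static bool or_eager(bool a, bool b) {
    return a | b;
}
```

With these definitions, ``false && rhs(true)`` never calls ``rhs()``, while ``and_eager(false, rhs(true))`` always does. That is only acceptable when the second operand is safe to evaluate unconditionally.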
|
||||||
|
|
||||||
|
And there are three variants of ``select()`` that select between two values
|
||||||
|
based on a boolean condition. These are the variants of ``select()`` for
|
||||||
|
the ``int8`` type:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
int8 select(bool v, int8 a, int8 b)
|
||||||
|
int8 select(uniform bool v, int8 a, int8 b)
|
||||||
|
uniform int8 select(uniform bool v, uniform int8 a, uniform int8 b)
|
||||||
|
|
||||||
|
There are also variants for ``int16``, ``int32``, ``int64``, ``float``, and
|
||||||
|
``double`` types.
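As a rough illustration of what a non-short-circuiting select amounts to, here is a plain C sketch (not ``ispc``) that picks between two already-evaluated values without a branch. The name ``select_i32`` is hypothetical, and the mask-based body is just one possible lowering; the standard library version in this commit is written with the ``?:`` operator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Branch-free select: returns a when c is true, b otherwise.
 * Both a and b have been evaluated by the caller before this runs. */
static int32_t select_i32(bool c, int32_t a, int32_t b) {
    /* All ones when c is true, all zeros when c is false. */
    uint32_t mask = (uint32_t)0 - (uint32_t)c;
    /* Converting back to int32_t assumes the usual two's complement wraparound. */
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}
```

For example, ``select_i32(x > 0, x, -x)`` computes both ``x`` and ``-x`` and then keeps one of them, which is fine here because neither operand has side effects.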
|
||||||
|
|
||||||
|
Bit Operations
|
||||||
|
--------------
|
||||||
|
|
||||||
|
The various variants of ``popcnt()`` return the population count--the
|
||||||
|
number of bits set in the given value.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
uniform int popcnt(uniform int v)
|
||||||
|
int popcnt(int v)
|
||||||
|
uniform int popcnt(bool v)
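For readers unfamiliar with population count, a minimal plain C sketch (not ``ispc``) of the integer case follows. The name ``popcnt_u32`` is hypothetical, and the loop is only a reference formulation; a real target would normally map this to a single hardware instruction.

```c
#include <stdint.h>

/* Counts the bits set in v by clearing the lowest set bit until none remain. */
static int popcnt_u32(uint32_t v) {
    int count = 0;
    while (v != 0) {
        v &= v - 1; /* clears the lowest set bit */
        count++;
    }
    return count;
}
```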
|
||||||
|
|
||||||
|
|
||||||
|
A few functions determine how many leading bits in the given value are zero
|
||||||
|
and how many of the trailing bits are zero; there are also ``unsigned``
|
||||||
|
variants of these functions and variants that take ``int64`` and ``unsigned
|
||||||
|
int64`` types.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
int32 count_leading_zeros(int32 v)
|
||||||
|
uniform int32 count_leading_zeros(uniform int32 v)
|
||||||
|
int32 count_trailing_zeros(int32 v)
|
||||||
|
uniform int32 count_trailing_zeros(uniform int32 v)
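The intended results can be sketched with simple loops in plain C (not ``ispc``). The function names below are hypothetical, and returning 32 for a zero input is an assumption of this sketch; the text above does not say what the ``ispc`` functions return in that case.

```c
#include <stdint.h>

/* Number of zero bits above the highest set bit; 32 when v is zero (assumed). */
static int32_t count_leading_zeros_u32(uint32_t v) {
    int32_t n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Number of zero bits below the lowest set bit; 32 when v is zero (assumed). */
static int32_t count_trailing_zeros_u32(uint32_t v) {
    int32_t n = 0;
    for (uint32_t bit = 1u; bit != 0 && (v & bit) == 0; bit <<= 1)
        n++;
    return n;
}
```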
|
||||||
|
|
||||||
|
Sometimes it's useful to convert a ``bool`` value to an integer using sign
|
||||||
|
extension so that the integer's bits are all on if the ``bool`` has the
|
||||||
|
value ``true`` (rather than just having the value one). The
|
||||||
|
``sign_extend()`` functions provide this functionality:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
int sign_extend(bool value)
|
||||||
|
uniform int sign_extend(uniform bool value)
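A plain C sketch (not ``ispc``) of the same idea: negating the 0-or-1 value of a boolean gives either zero or an integer with every bit set. The name ``sign_extend_bool`` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Maps false to 0 and true to -1, i.e. all 32 bits set. */
static int32_t sign_extend_bool(bool value) {
    return -(int32_t)value;
}
```

The all-ones result is convenient as a mask: ``x & sign_extend_bool(keep)`` is ``x`` when ``keep`` is true and zero otherwise.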
|
||||||
|
|
||||||
|
The ``intbits()`` and ``floatbits()`` functions can be used to implement
|
||||||
|
low-level floating-point bit twiddling. For example, ``intbits()`` returns
|
||||||
|
an ``unsigned int`` that is a bit-for-bit copy of the given ``float``
|
||||||
|
value. (Note: it is **not** the same as ``(int)a``, but corresponds to
|
||||||
|
something like ``*((int *)&a)`` in C.)
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
float floatbits(unsigned int a);
|
||||||
|
uniform float floatbits(uniform unsigned int a);
|
||||||
|
unsigned int intbits(float a);
|
||||||
|
uniform unsigned int intbits(uniform float a);
|
||||||
|
|
||||||
|
|
||||||
|
The ``intbits()`` and ``floatbits()`` functions have no cost at runtime;
|
||||||
|
they just let the compiler know how to interpret the bits of the given
|
||||||
|
value. They make it possible to efficiently write functions that take
|
||||||
|
advantage of the low-level bit representation of floating-point values.
|
||||||
|
|
||||||
|
For example, the ``abs()`` function in the standard library is implemented
|
||||||
|
as follows:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
float abs(float a) {
|
||||||
|
unsigned int i = intbits(a);
|
||||||
|
i &= 0x7fffffff;
|
||||||
|
return floatbits(i);
|
||||||
|
}
|
||||||
|
|
||||||
|
This code directly clears the high order bit to ensure that the given
|
||||||
|
floating-point value is positive. This compiles down to a single ``andps``
|
||||||
|
instruction when used with an Intel® SSE target, for example.
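For comparison, the same bit-level trick can be sketched in plain C (not ``ispc``). The names ``intbits_c``, ``floatbits_c`` and ``abs_c`` are hypothetical, and the sketch assumes that ``float`` is a 32-bit IEEE 754 value. It uses ``memcpy`` rather than the pointer cast mentioned above, since that is the well-defined way to reinterpret bits in C.

```c
#include <stdint.h>
#include <string.h>

/* Bit-for-bit copy of a float into an unsigned 32-bit integer. */
static uint32_t intbits_c(float a) {
    uint32_t i;
    memcpy(&i, &a, sizeof i);
    return i;
}

/* Bit-for-bit copy of an unsigned 32-bit integer into a float. */
static float floatbits_c(uint32_t i) {
    float a;
    memcpy(&a, &i, sizeof a);
    return a;
}

/* Absolute value by clearing the sign bit (bit 31). */
static float abs_c(float a) {
    return floatbits_c(intbits_c(a) & 0x7fffffffu);
}
```

Note that ``intbits_c(1.0f)`` is ``0x3f800000``, whereas the value conversion ``(int)1.0f`` is simply ``1``.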
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Math Functions
|
Math Functions
|
||||||
--------------
|
--------------
|
||||||
|
|
||||||
@@ -2919,77 +3046,6 @@ quite efficient.)
|
|||||||
uniform unsigned int low,
|
uniform unsigned int low,
|
||||||
uniform unsigned int high)
|
uniform unsigned int high)
|
||||||
|
|
||||||
Bit-Level Operations
|
|
||||||
--------------------
|
|
||||||
|
|
||||||
|
|
||||||
The various variants of ``popcnt()`` return the population count--the
|
|
||||||
number of bits set in the given value.
|
|
||||||
|
|
||||||
::
|
|
||||||
|
|
||||||
uniform int popcnt(uniform int v)
|
|
||||||
int popcnt(int v)
|
|
||||||
uniform int popcnt(bool v)
|
|
||||||
|
|
||||||
|
|
||||||
A few functions determine how many leading bits in the given value are zero
|
|
||||||
and how many of the trailing bits are zero; there are also ``unsigned``
|
|
||||||
variants of these functions and variants that take ``int64`` and ``unsigned
|
|
||||||
int64`` types.
|
|
||||||
|
|
||||||
::
|
|
||||||
|
|
||||||
int32 count_leading_zeros(int32 v)
|
|
||||||
uniform int32 count_leading_zeros(uniform int32 v)
|
|
||||||
int32 count_trailing_zeros(int32 v)
|
|
||||||
uniform int32 count_trailing_zeros(uniform int32 v)
|
|
||||||
|
|
||||||
Sometimes it's useful to convert a ``bool`` value to an integer using sign
|
|
||||||
extension so that the integer's bits are all on if the ``bool`` has the
|
|
||||||
value ``true`` (rather than just having the value one). The
|
|
||||||
``sign_extend()`` functions provide this functionality:
|
|
||||||
|
|
||||||
::
|
|
||||||
|
|
||||||
int sign_extend(bool value)
|
|
||||||
uniform int sign_extend(uniform bool value)
|
|
||||||
|
|
||||||
The ``intbits()`` and ``floatbits()`` functions can be used to implement
|
|
||||||
low-level floating-point bit twiddling. For example, ``intbits()`` returns
|
|
||||||
an ``unsigned int`` that is a bit-for-bit copy of the given ``float``
|
|
||||||
value. (Note: it is **not** the same as ``(int)a``, but corresponds to
|
|
||||||
something like ``*((int *)&a)`` in C.)
|
|
||||||
|
|
||||||
::
|
|
||||||
|
|
||||||
float floatbits(unsigned int a);
|
|
||||||
uniform float floatbits(uniform unsigned int a);
|
|
||||||
unsigned int intbits(float a);
|
|
||||||
uniform unsigned int intbits(uniform float a);
|
|
||||||
|
|
||||||
|
|
||||||
The ``intbits()`` and ``floatbits()`` functions have no cost at runtime;
|
|
||||||
they just let the compiler know how to interpret the bits of the given
|
|
||||||
value. They make it possible to efficiently write functions that take
|
|
||||||
advantage of the low-level bit representation of floating-point values.
|
|
||||||
|
|
||||||
For example, the ``abs()`` function in the standard library is implemented
|
|
||||||
as follows:
|
|
||||||
|
|
||||||
::
|
|
||||||
|
|
||||||
float abs(float a) {
|
|
||||||
unsigned int i = intbits(a);
|
|
||||||
i &= 0x7fffffff;
|
|
||||||
return floatbits(i);
|
|
||||||
}
|
|
||||||
|
|
||||||
This code directly clears the high order bit to ensure that the given
|
|
||||||
floating-point value is positive. This compiles down to a single ``andps``
|
|
||||||
instruction when used with an Intel® SSE target, for example.
|
|
||||||
|
|
||||||
|
|
||||||
Transcendental Functions
|
Transcendental Functions
|
||||||
------------------------
|
------------------------
|
||||||
|
|
||||||
|
|||||||
119
stdlib.ispc
@@ -746,6 +746,125 @@ static inline void prefetch_nt(const void * varying ptr) {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
///////////////////////////////////////////////////////////////////////////
|
||||||
|
// non-short-circuiting alternatives
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline bool and(bool a, bool b) {
|
||||||
|
return a && b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline uniform bool and(uniform bool a, uniform bool b) {
|
||||||
|
return a && b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline bool or(bool a, bool b) {
|
||||||
|
return a || b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline uniform bool or(uniform bool a, uniform bool b) {
|
||||||
|
return a || b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline int8 select(bool c, int8 a, int8 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline int8 select(uniform bool c, int8 a, int8 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline uniform int8 select(uniform bool c, uniform int8 a,
|
||||||
|
uniform int8 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline int16 select(bool c, int16 a, int16 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline int16 select(uniform bool c, int16 a, int16 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline uniform int16 select(uniform bool c, uniform int16 a,
|
||||||
|
uniform int16 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline int32 select(bool c, int32 a, int32 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline int32 select(uniform bool c, int32 a, int32 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline uniform int32 select(uniform bool c, uniform int32 a,
|
||||||
|
uniform int32 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline int64 select(bool c, int64 a, int64 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline int64 select(uniform bool c, int64 a, int64 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline uniform int64 select(uniform bool c, uniform int64 a,
|
||||||
|
uniform int64 b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline float select(bool c, float a, float b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline float select(uniform bool c, float a, float b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline uniform float select(uniform bool c, uniform float a,
|
||||||
|
uniform float b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline double select(bool c, double a, double b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline double select(uniform bool c, double a, double b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
__declspec(safe,cost1)
|
||||||
|
static inline uniform double select(uniform bool c, uniform double a,
|
||||||
|
uniform double b) {
|
||||||
|
return c ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
///////////////////////////////////////////////////////////////////////////
|
///////////////////////////////////////////////////////////////////////////
|
||||||
// Horizontal ops / reductions
|
// Horizontal ops / reductions
|
||||||
|
|
||||||
|
|||||||