Commit Graph

78 Commits

Author SHA1 Message Date
Ilia Filippov
ead5cc741d support LLVM trunk after 203559 203213 and 203381 revisions 2014-03-12 12:58:50 +04:00
Dmitry Babokin
f280b32fa4 Merge pull request #736 from egaburov/native_trigonometry
Native trigonometry
2014-02-20 19:18:35 +03:00
Dmitry Babokin
ea0a514e03 Fix for generic-1 2014-02-11 15:33:23 +04:00
Vsevolod Livinskij
65d947e449 Else branch with error report was added 2014-02-10 15:18:48 +04:00
Vsevolod Livinskij
cef5b2eb04 Some changes in saturation arithmetic 2014-02-10 12:40:53 +04:00
Vsevolod Livinskij
1c1614d207 Some errors in comments and code were fixed 2014-02-09 21:39:42 +04:00
Evghenii
70a9b286e5 added support for native and double precision trigonometry/transcendentals 2014-02-07 15:28:39 +01:00
evghenii
09e8381ec7 change {rsqrt,rcp}_double to {rsqrt,rcp}d_decl 2014-02-05 13:05:04 +01:00
Evghenii
d3a6693eef adding __have_native_{rsqrtd,rcpd} to select between native support for double precision reciprocals and using slower but safe version in stdlib 2014-02-04 16:29:23 +01:00
Evghenii
fe98fe8cdc added fast approximate rcp(double) accurate to 15 digits 2014-02-04 15:23:34 +01:00
Evghenii
eb1a495a7a added support for fast approximate rsqrt(double). Provides 16-digit accuracy but is over 3x faster than 1/sqrt(double) 2014-02-04 14:44:54 +01:00
evghenii
3a72e05c3e +1 2014-02-02 18:16:48 +01:00
Vsevolod Livinskij
da02236b3a Scalar realization of no-vec functions was moved from builtins to stdlib.ispc. 2014-01-20 16:06:34 +04:00
Vsevolod Livinskij
323587f10f Scalar implementation and implementation for targets which don't have h/w instructions 2014-01-02 16:48:56 +04:00
Vsevolod Livinskij
07c6f1714a Some fixes in function names and more tests were added. 2013-12-22 19:28:26 +04:00
Dmitry Babokin
d666fc3f8f Merge pull request #686 from ifilippov/ttt
packed_store_active2() - tuned version of packed_store_active()
2013-12-17 09:23:39 -08:00
Ilia Filippov
473f1cb4d2 packed_store_active2 2013-12-17 21:14:29 +04:00
Dmitry Babokin
6d51987e67 Merge pull request #642 from egaburov/launch3d
concept of 3d tasking
2013-12-17 08:40:07 -08:00
evghenii
c06ec92d0d added commas, added multi-dimensional tasking to mandelbrot_tasks & removed mandelbrot_task3d. Also adjusted documentation a bit 2013-12-13 11:49:11 +01:00
Vsevolod Livinskij
65768c20ae Added tests for saturation and some fixes for generic and avx target 2013-12-05 00:34:14 +04:00
Vsevolod Livinskij
4faff1a63c structural change 2013-11-30 10:48:18 +04:00
Vsevolod Livinskij
4c330bc38b Add code generation of saturation 2013-11-29 18:40:04 +04:00
Dmitry Babokin
6585a925be Merge pull request #641 from jbrodman/stdlibshift
Add a "shift" operator to the stdlib.
2013-10-28 14:18:31 -07:00
james.brodman
4d289b16c2 Redesign after being hit with the KISS bat. 2013-10-23 14:25:43 -04:00
egaburov
f89bad1e94 launch now passes the right info into tasking 2013-10-23 12:51:06 +02:00
james.brodman
f97a2d68c8 Bugfix for non-const shift amt and unit tests. 2013-10-22 18:29:20 -04:00
james.brodman
899f85ce9c Initial Support for new stdlib shift operator 2013-10-22 18:06:54 -04:00
Ilia Filippov
92773ada6d fix for ISPC for compfails at sse4-i8 and sse4-i16 2013-10-11 15:23:40 +04:00
egaburov
7364e06387 added mask64 2013-09-12 12:02:42 +02:00
egaburov
320c41ffcf added svml support. experimental. for some reason all symbols are visible.. 2013-09-11 15:16:50 +02:00
Matt Pharr
5b20b06bd9 Add avg_{up,down}_int{8,16} routines to stdlib
These compute the average of two given values, rounding up and down,
respectively, if the result isn't exact.  When possible, these are
mapped to target-specific intrinsics (PADD[BW] on IA and VH[R]ADD[US]
on NEON.)

A subsequent commit will add pattern-matching to generate calls to
these intrinsics when the corresponding patterns are detected in the
IR.
2013-08-06 08:41:12 -07:00
Matt Pharr
48ff03112f Remove __pause from stdlib_core() in utils.m4.
It wasn't ever being used, and was breaking compilation on ARM.
2013-07-30 08:44:22 -07:00
Matt Pharr
ab3b633733 Add 8-bit and 16-bit specialized NEON targets.
Like SSE4-8 and SSE4-16, these use 8-bit and 16-bit values for mask
elements, respectively, and thus should generate the best code when used
for computation with datatypes of those sizes.
2013-07-30 08:44:16 -07:00
Matt Pharr
53414f12e6 Add SSE4 target optimized for computation with 8-bit datatypes.
This change adds a new 'sse4-8' target, where programCount is 16 and
the mask element size is 8-bits.  (i.e. the most appropriate sizing of
the mask for SIMD computation with 8-bit datatypes.)
2013-07-23 17:30:32 -07:00
Matt Pharr
15a3ef370a Use @llvm.readcyclecounter to implement stdlib clock() function.
Also added a test for the clock builtin.
2013-07-23 17:24:57 -07:00
Matt Pharr
e7abf3f2ea Add support for mask vectors of 8 and 16-bit element types.
There were a number of places throughout the system that assumed that the
execution mask would only have either 32-bit or 1-bit elements.  This
commit makes it possible to have a target with an 8- or 16-bit mask.
2013-07-23 16:50:11 -07:00
Dmitry Babokin
7bedb4a081 Add memory alignment dependent on the platform (16/32/64/etc) 2013-05-24 10:29:01 +04:00
Dmitry Babokin
630215f56f Defining memory routines completely separately for Windows/Unix 32/64 bit. 2013-05-24 10:29:01 +04:00
Dmitry Babokin
5362dade37 Fixing util.m4 to declare nothing unless some macro is instantiated 2013-05-24 10:29:00 +04:00
Dmitry Babokin
a47460b4c3 Efficient library implementation of broadcast 2013-05-02 00:12:16 +02:00
Dmitry Babokin
26bec62daf Removing duplicate free definition on Linux 2013-04-27 00:29:51 +04:00
Dmitry Babokin
7497e86902 Adding support for aligned memory allocation on Windows 2013-04-26 22:07:30 +02:00
Dmitry Babokin
95950885cf Use posix_memalign to allocate 16-byte aligned memory on Linux/MacOS. 2013-04-26 20:33:24 +04:00
Dmitry Babokin
d36ab4cc3c Adding noalias attribute to malloc return 2013-04-25 20:39:01 +04:00
james.brodman
3aaf2ef2d4 ToT Fixes / M4 macro fix 2013-01-14 14:55:10 -05:00
Matt Pharr
765a0d8896 Use puts() rather than printf() for printing assertion failure strings.
This way, we don't lose '%'s in the assertion strings.

Issue #342.
2012-08-03 11:31:38 -07:00
Matt Pharr
6a410fc30e Emit gather instructions for the AVX2 targets.
Issue #308.
2012-07-13 12:29:05 -07:00
Matt Pharr
984a68c3a9 Rename gen_gather() macro to gen_gather_factored() 2012-07-13 12:24:12 -07:00
Matt Pharr
2c640f7e52 Add support for RDRAND in IvyBridge.
The standard library now provides a variety of rdrand() functions
that call out to RDRAND, when available.

Issue #263.
2012-07-12 06:07:07 -07:00
Matt Pharr
c09c87873e Whitespace / indentation fixes. 2012-07-11 14:29:46 -07:00