aaron/ispc - ispc - git.frat.tech

Author	SHA1	Message	Date
Vsevolod Livinskij	cef5b2eb04	Some changes in saturation arithmetic	2014-02-10 12:40:53 +04:00
Vsevolod Livinskij	1c1614d207	Some errors in comments and code were fixed	2014-02-09 21:39:42 +04:00
evghenii	09e8381ec7	change {rsqrt,rcp}_double to {rsqrt,rcp}d_decl	2014-02-05 13:05:04 +01:00
Evghenii	d3a6693eef	adding __have_native_{rsqrtd,rcpd} to select between native support for double precision reciprocals and using slower but safe version in stdlib	2014-02-04 16:29:23 +01:00
Evghenii	fe98fe8cdc	added fast approximate rcp(double) accurate to 15 digits	2014-02-04 15:23:34 +01:00
Evghenii	eb1a495a7a	added support for fast approximate rsqrt(double). Provides 16-digit accuracy but is over 3x faster than 1/sqrt(double)	2014-02-04 14:44:54 +01:00
evghenii	3a72e05c3e	+1	2014-02-02 18:16:48 +01:00
Vsevolod Livinskij	da02236b3a	Scalar implementation of no-vec functions was moved from builtins to stdlib.ispc.	2014-01-20 16:06:34 +04:00
Vsevolod Livinskij	323587f10f	Scalar implementation and implementation for targets which don't have h/w instructions	2014-01-02 16:48:56 +04:00
Vsevolod Livinskij	07c6f1714a	Some fixes in function names, and more tests were added.	2013-12-22 19:28:26 +04:00
Dmitry Babokin	d666fc3f8f	Merge pull request #686 from ifilippov/ttt packed_store_active2() - tuned version of packed_store_active()	2013-12-17 09:23:39 -08:00
Ilia Filippov	473f1cb4d2	packed_store_active2	2013-12-17 21:14:29 +04:00
Dmitry Babokin	6d51987e67	Merge pull request #642 from egaburov/launch3d concept of 3d tasking	2013-12-17 08:40:07 -08:00
evghenii	c06ec92d0d	added commas, added multi-dimensional tasking to mandelbrot_tasks & removed mandelbrot_task3d. Also adjusted documentation a bit	2013-12-13 11:49:11 +01:00
Vsevolod Livinskij	65768c20ae	Added tests for saturation and some fixes for generic and avx target	2013-12-05 00:34:14 +04:00
Vsevolod Livinskij	4faff1a63c	structural change	2013-11-30 10:48:18 +04:00
Vsevolod Livinskij	4c330bc38b	Add code generation of saturation	2013-11-29 18:40:04 +04:00
Dmitry Babokin	6585a925be	Merge pull request #641 from jbrodman/stdlibshift Add a "shift" operator to the stdlib.	2013-10-28 14:18:31 -07:00
james.brodman	4d289b16c2	Redesign after being hit with the KISS bat.	2013-10-23 14:25:43 -04:00
egaburov	f89bad1e94	launch now passes the right info into tasking	2013-10-23 12:51:06 +02:00
james.brodman	f97a2d68c8	Bugfix for non-const shift amt and unit tests.	2013-10-22 18:29:20 -04:00
james.brodman	899f85ce9c	Initial Support for new stdlib shift operator	2013-10-22 18:06:54 -04:00
Ilia Filippov	92773ada6d	fix for ISPC for compfails at sse4-i8 and sse4-i16	2013-10-11 15:23:40 +04:00
egaburov	7364e06387	added mask64	2013-09-12 12:02:42 +02:00
egaburov	320c41ffcf	added svml support. experimental. for some reason all symbols are visible..	2013-09-11 15:16:50 +02:00
Matt Pharr	5b20b06bd9	Add avg_{up,down}_int{8,16} routines to stdlib These compute the average of two given values, rounding up and down, respectively, if the result isn't exact. When possible, these are mapped to target-specific intrinsics (PADD[BW] on IA and VH[R]ADD[US] on NEON). A subsequent commit will add pattern-matching to generate calls to these intrinsics when the corresponding patterns are detected in the IR.	2013-08-06 08:41:12 -07:00
Matt Pharr	48ff03112f	Remove __pause from stdlib_core() in utils.m4. It wasn't ever being used, and was breaking compilation on ARM.	2013-07-30 08:44:22 -07:00
Matt Pharr	ab3b633733	Add 8-bit and 16-bit specialized NEON targets. Like SSE4-8 and SSE4-16, these use 8-bit and 16-bit values for mask elements, respectively, and thus should generate the best code when used for computation with datatypes of those sizes.	2013-07-30 08:44:16 -07:00
Matt Pharr	53414f12e6	Add SSE4 target optimized for computation with 8-bit datatypes. This change adds a new 'sse4-8' target, where programCount is 16 and the mask element size is 8 bits (i.e. the most appropriate sizing of the mask for SIMD computation with 8-bit datatypes).	2013-07-23 17:30:32 -07:00
Matt Pharr	15a3ef370a	Use @llvm.readcyclecounter to implement stdlib clock() function. Also added a test for the clock builtin.	2013-07-23 17:24:57 -07:00
Matt Pharr	e7abf3f2ea	Add support for mask vectors of 8 and 16-bit element types. There were a number of places throughout the system that assumed that the execution mask would only have either 32-bit or 1-bit elements. This commit makes it possible to have a target with an 8- or 16-bit mask.	2013-07-23 16:50:11 -07:00
Dmitry Babokin	7bedb4a081	Add memory alignment dependent on the platform (16/32/64/etc)	2013-05-24 10:29:01 +04:00
Dmitry Babokin	630215f56f	Defining memory routines completely separately for Windows/Unix 32/64 bit.	2013-05-24 10:29:01 +04:00
Dmitry Babokin	5362dade37	Fixing util.m4 to declare nothing unless some macro is instantiated	2013-05-24 10:29:00 +04:00
Dmitry Babokin	a47460b4c3	Efficient library implementation of broadcast	2013-05-02 00:12:16 +02:00
Dmitry Babokin	26bec62daf	Removing duplicate free definition on Linux	2013-04-27 00:29:51 +04:00
Dmitry Babokin	7497e86902	Adding support for aligned memory allocation on Windows	2013-04-26 22:07:30 +02:00
Dmitry Babokin	95950885cf	Use posix_memalign to allocate 16-byte aligned memory on Linux/MacOS.	2013-04-26 20:33:24 +04:00
Dmitry Babokin	d36ab4cc3c	Adding noalias attribute to malloc return	2013-04-25 20:39:01 +04:00
james.brodman	3aaf2ef2d4	ToT Fixes / M4 macro fix	2013-01-14 14:55:10 -05:00
Matt Pharr	765a0d8896	Use puts() rather than printf() for printing assertion failure strings. This way, we don't lose '%'s in the assertion strings. Issue #342.	2012-08-03 11:31:38 -07:00
Matt Pharr	6a410fc30e	Emit gather instructions for the AVX2 targets. Issue #308.	2012-07-13 12:29:05 -07:00
Matt Pharr	984a68c3a9	Rename gen_gather() macro to gen_gather_factored()	2012-07-13 12:24:12 -07:00
Matt Pharr	2c640f7e52	Add support for RDRAND in IvyBridge. The standard library now provides a variety of rdrand() functions that call out to RDRAND, when available. Issue #263.	2012-07-12 06:07:07 -07:00
Matt Pharr	c09c87873e	Whitespace / indentation fixes.	2012-07-11 14:29:46 -07:00
Matt Pharr	10b79fb41b	Add support for non-factored variants of gather/scatter functions. We now have two ways of approaching gather/scatters with a common base pointer and with offset vectors. For targets with native gather/scatter, we just turn those into base + {1/2/4/8}offsets. For targets without, we turn those into base + {1/2/4/8}varying_offsets + const_offsets, where const_offsets is a compile-time constant. Infrastructure for issue #325.	2012-07-11 14:29:42 -07:00
Matt Pharr	ec0280be11	Rename gather/scatter_base_offsets functions to factored_based_offsets. No functional change; just preparation for having a path that doesn't factor the offsets into constant and varying parts, which will be better for AVX2 and KNC.	2012-07-11 14:16:39 -07:00
Matt Pharr	fb8b893b10	Fix incorrect LLVM_3_1svn tests. 1. For some time now, we provide the version without the 'svn' 2. We should be testing "not LLVM 3.0" in these cases, since they apply to LLVM 3.2 and beyond as well...	2012-07-09 07:09:25 -07:00
Matt Pharr	9ca80debb8	Remove stale LLVM 2.9 support from builtins/util.m4	2012-07-09 06:54:29 -07:00
Matt Pharr	d34a87404d	Provide (undocumented for now) __pause() call to emit PAUSE inst.	2012-06-28 09:28:25 -07:00

73 Commits