Remove memory_barrier() calls from atomics.

This was unnecessary overhead to impose on all callers; the user
should handle these as needed on their own.

Also added some explanatory text to the documentation that highlights
that memory_barrier() is only needed across HW threads/cores, not
across program instances in a gang.
Matt Pharr
2012-04-10 19:37:03 -07:00
parent acfbe77ffc
commit 2aa61007c6
2 changed files with 7 additions and 27 deletions

@@ -3880,6 +3880,11 @@ code.
 void memory_barrier();
+Note that this barrier is *not* needed for coordinating reads and writes
+among the program instances in a gang; it's only needed for coordinating
+between multiple hardware threads running on different cores. See the
+section `Data Races Within a Gang`_ for the guarantees provided about
+memory read/write ordering across a gang.
 Prefetches
 ----------