
Answer by Ken Birman for Are one-sided RDMA reads atomic for single cache lines?

Ok, meanwhile I seem to have found the correct answer, and I believe that Roland's response is not quite right -- partly right but not entirely.

In http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf, which is the Intel architecture manual (I'll need to check again for AMD...), I found this: "Atomic memory operation in Intel 64 and IA-32 architecture is guaranteed only for a subset of memory operand sizes and alignment scenarios. The list of guaranteed atomic operations are described in Section 8.1.1 of IA-32 Intel® Architecture Software Developer’s Manual, Volumes 3A."

Then in that section, which falls under the chapter entitled MULTIPLE-PROCESSOR MANAGEMENT, one finds a lot of information about guaranteed atomic operations (page 2210). In particular, Intel guarantees that its memory subsystem performs loads and stores of native types (bit, byte, integers of various sizes, float) atomically. The object must be aligned so that it fits within a single cache line (64 bytes on current Intel platforms) and does not cross a cache line boundary. When that holds, Intel guarantees that stores and fetches will be atomic no matter what device is using the memory bus.
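To make the alignment condition concrete, here is a minimal C++ sketch of the check. The helper names and the `kCacheLine` constant are my own for illustration; the 64-byte line size is the assumption stated above, not something the code detects.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, as on the current Intel platforms discussed above.
constexpr std::size_t kCacheLine = 64;

// True if the byte range [addr, addr + size) lies entirely inside one cache line.
constexpr bool fits_in_one_cache_line(std::uintptr_t addr, std::size_t size) {
    return size > 0 && size <= kCacheLine &&
           (addr / kCacheLine) == ((addr + size - 1) / kCacheLine);
}

// True if an object of `size` bytes at `addr` is naturally aligned,
// i.e. its address is a multiple of its size.
constexpr bool naturally_aligned(std::uintptr_t addr, std::size_t size) {
    return size > 0 && addr % size == 0;
}

// An 8-byte value at offset 56 occupies bytes 56..63: a single line.
static_assert(fits_in_one_cache_line(56, 8), "expected to fit in one line");
// The same value at offset 60 occupies bytes 60..67: it straddles two lines.
static_assert(!fits_in_one_cache_line(60, 8), "expected to straddle two lines");
```

A naturally aligned 2-, 4- or 8-byte value can never straddle a 64-byte line, so `alignas(8)` on a 64-bit field is the easy way to stay inside the guarantee.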

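For more complex objects, locking is required if you want to be sure you will get a safe execution. Further, if you are doing multicore operations you have to use the locked (atomic) variants of the Intel instructions to be sure of coherency for concurrent writes. Note that volatile alone does not give you this in C++; there you need std::atomic. In C# and Java, volatile does give ordered, untorn reads and writes of a single variable, but a read-modify-write such as an increment still needs Interlocked (C#) or the java.util.concurrent.atomic classes (Java).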
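As a hedged C++ illustration of asking for the atomic variants: std::atomic is the standard-library way to request such operations, and on x86-64 compilers typically implement the read-modify-write calls below with LOCK-prefixed instructions. The counter and function names are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical shared counter that several cores may update concurrently.
std::atomic<std::uint64_t> hits{0};

// Atomic read-modify-write: adds 1 and returns the new value.
std::uint64_t record_hit() {
    return hits.fetch_add(1, std::memory_order_relaxed) + 1;
}

// Atomic compare-and-swap: installs `desired` only if the counter still
// holds `expected`; returns whether the swap happened.
bool replace_if_unchanged(std::uint64_t expected, std::uint64_t desired) {
    return hits.compare_exchange_strong(expected, desired);
}

// Plain atomic load of the current value.
std::uint64_t current_hits() {
    return hits.load(std::memory_order_acquire);
}
```

These functions are safe to call from any number of threads. A plain `uint64_t` with `++` would not be, because the increment is a separate load, add and store.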

What this adds up to is that local writes to native types can be paired with remotely initiated RDMA reads safely.

But notice that strings, byte arrays -- those would not be atomic because they could easily cross a cache line. Also, operations on complex objects with more than one data field might not be atomic -- for such things you would need a more complex approach, such as the one in the FaRM paper (Fast Remote Memory) by MSR. My own need is simpler and won't require the elaborate version numbering scheme FaRM implements...
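For readers who do need multi-field consistency, here is a rough C++ sketch of the general version-number idea, written as a single-writer seqlock. This is not FaRM's actual scheme, and the type and function names are hypothetical. It only shows the CPU-side logic; whether a NIC performing an RDMA read observes the fields in a suitable order is an assumption you would have to verify for your hardware.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical two-field record guarded by a version counter.
// The version is odd while a write is in progress, even otherwise.
struct VersionedPair {
    std::atomic<std::uint64_t> version{0};
    std::atomic<std::uint64_t> a{0};
    std::atomic<std::uint64_t> b{0};
};

// Writer side. Assumes exactly one writer at a time.
void write_pair(VersionedPair& p, std::uint64_t a, std::uint64_t b) {
    std::uint64_t v = p.version.load(std::memory_order_relaxed);
    p.version.store(v + 1, std::memory_order_relaxed);  // mark "write in progress"
    std::atomic_thread_fence(std::memory_order_release);
    p.a.store(a, std::memory_order_relaxed);
    p.b.store(b, std::memory_order_relaxed);
    p.version.store(v + 2, std::memory_order_release);  // publish the new version
}

// Reader side. Returns true and fills a/b with a consistent snapshot,
// or returns false if a write overlapped the read and the caller should retry.
bool try_read_pair(const VersionedPair& p, std::uint64_t& a, std::uint64_t& b) {
    std::uint64_t v1 = p.version.load(std::memory_order_acquire);
    if (v1 & 1) return false;  // writer is mid-update
    a = p.a.load(std::memory_order_relaxed);
    b = p.b.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    std::uint64_t v2 = p.version.load(std::memory_order_relaxed);
    return v1 == v2;  // unchanged version means no write intervened
}
```

A reader would normally call `try_read_pair` in a loop until it succeeds. In the remote case the equivalent would be to fetch the whole record with one RDMA read and compare the version values locally.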

