The JSR-133 Cookbook for Compiler Writers (Chinese-English parallel translation)
Translator's note: my abilities are limited; if you find errors, please point them out.
Original: gee.cs.oswego.edu/dl/jmm/cookbook.html
Translation: yellowstar5/direct/The%20JSR-133%20Cookbook-chinese.html
Preface: Over the 10+ years since this was initially written, many processor and language memory model specifications and issues have become clearer and better understood. And many have not. While this guide is maintained to remain accurate, it is incomplete about some of these evolving details. For more extensive coverage, see especially the work of Peter Sewell and the Cambridge Relaxed Memory Concurrency Group.
This is an unofficial guide to implementing the new Java Memory Model (JMM) specified by JSR-133. It provides at most brief backgrounds about why various rules exist, instead concentrating on their consequences for compilers and JVMs with respect to instruction reorderings, multiprocessor barrier instructions, and atomic operations. It includes a set of recommended recipes for complying to JSR-133. This guide is "unofficial" because it includes interpretations of particular processor properties and specifications. We cannot guarantee that the interpretations are correct. Also, processor specifications and implementations may change over time.
Reorderings
For a compiler writer, the JMM mainly consists of rules disallowing reorderings of certain instructions that access fields (where "fields" include array elements) as well as monitors (locks).
Volatiles and Monitors
The main JMM rules for volatiles and monitors can be viewed as a matrix with cells indicating that you cannot reorder instructions associated with particular sequences of bytecodes. This table is not itself the JMM specification; it is just a useful way of viewing its main consequences for compilers and runtime systems.

Can Reorder                    | 2nd operation
1st operation                  | Normal Load, Normal Store | Volatile Load, MonitorEnter | Volatile Store, MonitorExit
Normal Load, Normal Store      |                           |                             | No
Volatile Load, MonitorEnter    | No                        | No                          | No
Volatile Store, MonitorExit    |                           | No                          | No
Where:
Normal Loads are getfield, getstatic, array load of non-volatile fields.
Normal Stores are putfield, putstatic, array store of non-volatile fields.
Volatile Loads are getfield, getstatic of volatile fields that are accessible by multiple threads.
Volatile Stores are putfield, putstatic of volatile fields that are accessible by multiple threads.
MonitorEnters (including entry to synchronized methods) are for lock objects accessible by multiple threads.
MonitorExits (including exit from synchronized methods) are for lock objects accessible by multiple threads.
The cells for Normal Loads are the same as for Normal Stores, those for Volatile Loads are the same as MonitorEnter, and those for Volatile Stores are the same as MonitorExit, so they are collapsed together here (but are expanded out as needed in subsequent tables). We consider here only variables that are readable and writable as an atomic unit – that is, no bit fields, unaligned accesses, or accesses larger than word sizes available on a platform.
Any number of other operations might be present between the indicated 1st and 2nd operations in the table. So, for example, the "No" in cell [Normal Store, Volatile Store] says that a non-volatile store cannot be reordered with ANY subsequent volatile store; at least any that can make a difference in multithreaded program semantics.
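The [Normal Store, Volatile Store] "No" cell is what makes the common flag-publication idiom work. A minimal Java sketch (the class and field names are illustrative, not from the cookbook):

```java
// The normal store to `payload` may not be reordered past the volatile
// store to `ready`, so a reader that observes ready == true is also
// guaranteed to observe payload == 42.
class Publisher {
    int payload;                 // normal field
    volatile boolean ready;      // volatile flag

    void publish() {
        payload = 42;            // normal store: cannot move below...
        ready = true;            // ...this volatile store
    }

    int consume() {
        if (ready) {             // volatile load
            return payload;      // sees 42 once ready is true
        }
        return -1;               // not yet published
    }
}
```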
The JSR-133 specification is worded such that the rules for both volatiles and monitors apply only to those that may be accessed by multiple threads. If a compiler can somehow (usually only with great effort) prove that a lock is only accessible from a single thread, it may be eliminated. Similarly, a volatile field provably accessible from only a single thread acts as a normal field. More fine-grained analyses and optimizations are also possible, for example, those relying on provable inaccessibility from multiple threads only during certain intervals.
Blank cells in the table mean that the reordering is allowed if the accesses aren't otherwise dependent with respect to basic Java semantics (as specified in the JLS). For example even though the table doesn't say so, you can't reorder a load with a subsequent store to the same location. But you can reorder a load and store to two distinct locations, and may wish to do so in the course of various compiler transformations and optimizations. This includes cases that aren't usually thought of as reorderings; for example reusing a computed value based on a loaded field rather than reloading and recomputing the value acts as a reordering. However, the JMM spec permits transformations that eliminate avoidable dependencies, and in turn allow reorderings.
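One consequence of value reuse acting as a reordering: hoisting a non-volatile flag load out of a loop can keep the loop from ever observing another thread's store, which is why shutdown flags must be volatile. A minimal sketch (names are illustrative; the loop is bounded here only so the example terminates):

```java
// With a volatile flag, the JIT may not reuse a previously loaded
// value of `stop` across iterations; a plain field could legally be
// loaded once and cached, making the loop spin forever.
class Worker {
    volatile boolean stop;       // volatile forbids hoisting the load

    int spinCount() {
        int n = 0;
        while (!stop && n < 3) { // reloads `stop` on every iteration
            n++;
        }
        return n;
    }
}
```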
In all cases, permitted reorderings must maintain minimal Java safety properties even when accesses are incorrectly synchronized by programmers: All observed field values must be either the default zero/null "pre-construction" values, or those written by some thread. This usually entails zeroing all heap memory holding objects before it is used in constructors and never reordering other loads with the zeroing stores. A good way to do this is to zero out reclaimed memory within the garbage collector. See the JSR-133 spec for rules dealing with other corner cases surrounding safety guarantees.
The rules and properties described here are for accesses to Java-level fields. In practice, these will additionally interact with accesses to internal bookkeeping fields and data, for example object headers, GC tables, and dynamically generated code.
Final Fields
Loads and Stores of final fields act as "normal" accesses with respect to locks and volatiles, but impose two additional reordering rules:
1. A store of a final field (inside a constructor) and, if the field is a reference, any store that this final can reference, cannot be reordered with a subsequent store (outside that constructor) of the reference to the object holding that field into a variable accessible to other threads. For example, you cannot reorder
x.finalField = v; ... ; sharedRef = x;
This comes into play for example when inlining constructors, where "..." spans the logical end of the constructor. You cannot move stores of finals within constructors down below a store outside of the constructor that might make the object visible to other threads. (As seen below, this may also require issuing a barrier). Similarly, you cannot reorder either of the first two with the third assignment in:
v.afield = 1; x.finalField = v; ... ; sharedRef = x;
2. The initial load (i.e., the very first encounter by a thread) of a final field cannot be reordered with the initial load of the reference to the object containing the final field. This comes into play in:
x = sharedRef; ... ; i = x.finalField;
A compiler would never reorder these since they are dependent, but there can be consequences of this rule on some processors.
These rules imply that reliable use of final fields by Java programmers requires that the load of a shared reference to an object with a final field itself be synchronized, volatile, or final, or derived from such a load, thus ultimately ordering the initializing stores in constructors with subsequent uses outside constructors.
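The safe-publication consequence of these rules can be sketched as follows (class names are illustrative): a reader that obtains the published reference sees the final field's constructed value even though the shared reference itself is not synchronized or volatile.

```java
// Rule 1 forbids the constructor's store to `x` from being reordered
// past the publication `instance = new Holder()`; rule 2 orders the
// reader's two loads. Together they make this racy publication safe
// for the final field.
class Holder {
    final int x;
    Holder() { x = 17; }         // store of a final inside the constructor
}

class Publication {
    static Holder instance;      // deliberately plain (non-volatile)

    static void publisher() {
        instance = new Holder(); // sharedRef = x; may not float above x = 17
    }

    static Integer reader() {
        Holder h = instance;     // initial load of the reference
        return (h == null) ? null : h.x;  // initial load of the final field
    }
}
```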
Memory Barriers
Compilers and processors must both obey reordering rules. No particular effort is required to ensure that uniprocessors maintain proper ordering, since they all guarantee "as-if-sequential" consistency. But on multiprocessors, guaranteeing conformance often requires emitting barrier instructions. Even if a compiler optimizes away a field access (for example because a loaded value is not used), barriers must still be generated as if the access were still present. (Although see below about independently optimizing away barriers.)
Memory barriers are only indirectly related to higher-level notions described in memory models such as "acquire" and "release". And memory barriers are not themselves "synchronization barriers". And memory barriers are unrelated to the kinds of "write barriers" used in some garbage collectors. Memory barrier instructions directly control only the interaction of a CPU with its cache, with its write-buffer that holds stores waiting to be flushed to memory, and/or its buffer of waiting loads or speculatively executed instructions. These effects may lead to further interaction among caches, main memory and other processors. But there is nothing in the JMM that mandates any particular form of communication across processors so long as stores eventually become globally performed; i.e., visible across all processors, and that loads retrieve them when they are visible.

Categories
Nearly all processors support at least a coarse-grained barrier instruction, often just called a Fence, that guarantees that all loads and stores initiated before the fence will be strictly ordered before any load or store initiated after the fence. This is usually among the most time-consuming instructions on any given processor (often nearly as, or even more expensive than atomic instructions). Most processors additionally support more fine-grained barriers.
A property of memory barriers that takes some getting used to is that they apply BETWEEN memory accesses. Despite the names given for barrier instructions on some processors, the right/best barrier to use depends on the kinds of accesses it separates. Here's a common categorization of barrier types that maps pretty well to specific instructions (sometimes no-ops) on existing processors:
LoadLoad Barriers
The sequence: Load1; LoadLoad; Load2
ensures that Load1's data are loaded before data accessed by Load2 and all subsequent load instructions are loaded. In general, explicit LoadLoad barriers are needed on processors that perform speculative loads and/or out-of-order processing in which waiting load instructions can bypass waiting stores. On processors that guarantee to always preserve load ordering, the barriers amount to no-ops.
StoreStore Barriers
The sequence: Store1; StoreStore; Store2
ensures that Store1's data are visible to other processors (i.e., flushed to memory) before the data associated with Store2 and all subsequent store instructions. In general, StoreStore barriers are needed on processors that do not otherwise guarantee strict ordering of flushes from write buffers and/or caches to other processors or main memory.
LoadStore Barriers
The sequence: Load1; LoadStore; Store2
ensures that Load1's data are loaded before all data associated with Store2 and subsequent store instructions are flushed. LoadStore barriers are needed only on those out-of-order processors in which waiting store instructions can bypass loads.
StoreLoad Barriers
The sequence: Store1; StoreLoad; Load2
ensures that Store1's data are made visible to other processors (i.e., flushed to main memory) before data accessed by Load2 and all subsequent load instructions are loaded. StoreLoad barriers protect against a subsequent load incorrectly using Store1's data value rather than that from a more recent store to the same location performed by a different processor. Because of this, on the processors discussed below, a StoreLoad is strictly necessary only for separating stores from subsequent loads of the same location(s) as were stored before the barrier. StoreLoad barriers are needed on nearly all recent multiprocessors, and are usually the most expensive kind. Part of the reason they are expensive is that they must disable mechanisms that ordinarily bypass cache to satisfy loads from write-buffers. This might be implemented by letting the buffer fully flush, among other possible stalls.
On all processors discussed below, it turns out that instructions that perform StoreLoad also obtain the other three barrier effects, so StoreLoad can serve as a general-purpose (but usually expensive) Fence. (This is an empirical fact, not a necessity.) The opposite doesn't hold though. It is NOT usually the case that issuing any combination of other barriers gives the equivalent of a StoreLoad.
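Since Java 9, these categories surface almost directly in the JDK as VarHandle fence methods; the mapping sketched in the comments below is my own correspondence, not part of JSR-133, and the class and field names are illustrative:

```java
import java.lang.invoke.VarHandle;

// Roughly: loadLoadFence ~ LoadLoad, storeStoreFence ~ StoreStore,
// acquireFence ~ LoadLoad + LoadStore, releaseFence ~ LoadStore +
// StoreStore, and fullFence ~ the general-purpose Fence (the only one
// that also provides StoreLoad).
class Fences {
    static int data;
    static int flag;

    static void writer() {
        data = 1;
        VarHandle.storeStoreFence();  // order the two plain stores
        flag = 1;
        VarHandle.fullFence();        // StoreLoad: publish before later loads
    }

    static int reader() {
        int f = flag;
        VarHandle.loadLoadFence();    // order the two plain loads
        return (f == 1) ? data : -1;
    }
}
```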
The following table shows how these barriers correspond to JSR-133 ordering rules.

Required barriers              | 2nd operation
1st operation                  | Normal Load | Normal Store | Volatile Load, MonitorEnter | Volatile Store, MonitorExit
Normal Load                    |             |              |                             | LoadStore
Normal Store                   |             |              |                             | StoreStore
Volatile Load, MonitorEnter    | LoadLoad    | LoadStore    | LoadLoad                    | LoadStore
Volatile Store, MonitorExit    |             |              | StoreLoad                   | StoreStore
Plus the special final-field rule requiring a StoreStore barrier in
x.finalField = v; StoreStore; sharedRef = x;
Here's an example showing placements.
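As a sketch of such placements (the class and field names are illustrative, and the comments assume a conservative processor that needs all four barrier types), the table above puts barriers around volatile accesses like this:

```java
// Comments mark where a compiler following the table would emit
// barriers; on many processors several of these are no-ops.
class Placement {
    int a;                // normal field
    volatile int v;       // volatile field

    void run() {
        int i = v;        // volatile load
                          //   [LoadLoad]  before the normal load below
                          //   [LoadStore] before the normal store below
        int j = a;        // normal load
        a = j + 1;        // normal store
                          //   [StoreStore] before the volatile store
        v = i + 1;        // volatile store
                          //   [StoreLoad] before any subsequent volatile load
    }
}
```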
Data Dependency and Barriers
The need for LoadLoad and LoadStore barriers on some processors interacts with their ordering guarantees for dependent instructions. On some (most) processors, a load or store that is dependent on the value of a previous load is ordered by the processor without need for an explicit barrier. This commonly arises in two kinds of cases, indirection:
Load x; Load x.field
and control:
Load x; if (predicate(x)) Load or Store y;
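These two cases can be written out in Java (the class and field names are illustrative):

```java
// Indirection: the load of n.field cannot begin until the load of
// `node` supplies the address, so most processors order the pair
// without an explicit LoadLoad barrier. Control dependency: the load
// of n.field is guarded by the predicate on the earlier load.
class Node {
    int field = 5;
}

class Dependency {
    static Node node = new Node();

    static int indirection() {
        Node n = node;           // Load x
        return n.field;          // Load x.field, address-dependent on Load x
    }

    static int control(boolean predicate) {
        Node n = node;           // Load x
        if (predicate) {         // control dependency on the loaded value's use
            return n.field;      // Load y only on this branch
        }
        return 0;
    }
}
```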