An in-depth analysis of the implementation of volatile memory semantics in Java and its application scenarios
Implementation of volatile memory semantics
Next, let's take a look at how JMM implements the memory semantics of volatile write / read.
We mentioned earlier that reordering is divided into compiler reordering and processor reordering. To implement volatile memory semantics, JMM restricts both kinds of reordering. The following is the volatile reordering rule table that JMM defines for the compiler:
For example, the last cell in the third row means that, in program order, when the first operation is a read or write of an ordinary variable and the second operation is a volatile write, the compiler cannot reorder the two operations.
As can be seen from the above table:
When the second operation is a volatile write, the two operations cannot be reordered, no matter what the first operation is. This rule ensures that operations before a volatile write are not moved by the compiler to after the volatile write.
When the first operation is a volatile read, the two operations cannot be reordered, no matter what the second operation is. This rule ensures that operations after a volatile read are not moved by the compiler to before the volatile read.
When the first operation is a volatile write and the second operation is a volatile read, the two operations cannot be reordered.
To implement the memory semantics of volatile, the compiler inserts memory barriers into the instruction sequence when generating bytecode, in order to prohibit specific types of processor reordering. For the compiler, it is almost impossible to find an optimal arrangement that minimizes the total number of inserted barriers. Therefore, JMM adopts a conservative strategy. The following is the JMM memory barrier insertion strategy based on that conservative approach:
Insert a StoreStore barrier before each volatile write.
Insert a StoreLoad barrier after each volatile write.
Insert a LoadLoad barrier after each volatile read.
Insert a LoadStore barrier after each volatile read.
The above memory barrier insertion strategy is very conservative, but it guarantees correct volatile memory semantics on any processor platform and in any program.
The following is the schematic diagram of the instruction sequence generated when memory barriers are inserted around a volatile write under the conservative strategy:
The StoreStore barrier in the figure above ensures that all normal writes before the volatile write are visible to any processor before the volatile write takes place. This is because the StoreStore barrier guarantees that all the normal writes above it are flushed to main memory before the volatile write.
The interesting part here is the StoreLoad barrier after the volatile write. Its purpose is to prevent the volatile write from being reordered with a subsequent volatile read/write. The compiler often cannot determine accurately whether a StoreLoad barrier is needed after a volatile write (for example, when a method returns immediately after a volatile write). To ensure that the memory semantics of volatile are implemented correctly, JMM adopts a conservative strategy here: insert a StoreLoad barrier either after each volatile write or before each volatile read. From the perspective of overall execution efficiency, JMM chooses to insert the StoreLoad barrier after each volatile write. The reason is that the common usage pattern of volatile write-read memory semantics is one writer thread writing a volatile variable and multiple reader threads reading the same volatile variable. When reader threads greatly outnumber writer threads, inserting the StoreLoad barrier after the volatile write brings a considerable improvement in execution efficiency. Here we can see a characteristic of the JMM implementation: first ensure correctness, then pursue execution efficiency.
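The reordering that the StoreLoad barrier rules out can be observed with a small litmus test. The following is a minimal sketch, not part of the original article: the class `StoreLoadLitmus` and its fields are hypothetical names chosen for illustration. Each thread performs a volatile write followed by a volatile read of the other variable. Because a volatile write may not be reordered with a later volatile read, at least one thread must see the other's write, so both reads returning 0 should never happen.

```java
// Hypothetical litmus test; names are illustrative, not from the original article.
class StoreLoadLitmus {
    volatile int x;
    volatile int y;
    int r1; // result of thread 1, read by the caller only after join()
    int r2; // result of thread 2, read by the caller only after join()

    // Runs one round and returns true if both threads read 0.
    boolean bothSawZero() {
        x = 0;
        y = 0;
        Thread t1 = new Thread(() -> { x = 1; r1 = y; }); // volatile write, then volatile read
        Thread t2 = new Thread(() -> { y = 1; r2 = x; }); // volatile write, then volatile read
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r1 == 0 && r2 == 0;
    }

    // Counts how many of the given rounds ended with both reads returning 0.
    static int countBothZero(int rounds) {
        StoreLoadLitmus litmus = new StoreLoadLitmus();
        int count = 0;
        for (int i = 0; i < rounds; i++) {
            if (litmus.bothSawZero()) {
                count++;
            }
        }
        return count;
    }
}
```

If `x` and `y` were ordinary fields instead, the outcome with both reads returning 0 would be permitted, and on hardware that reorders a write with a later read it may actually show up.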
The following is a schematic diagram of the instruction sequence generated when memory barriers are inserted after a volatile read under the conservative strategy:
The LoadLoad barrier in the figure above prevents the processor from reordering the volatile read above with the normal reads below. The LoadStore barrier prevents the processor from reordering the volatile read above with the normal writes below.
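The barrier positions described for volatile writes and volatile reads can be summarized in code. This is only a sketch: memory barriers are not visible in Java source, so the comments mark where the conservative strategy would conceptually place them. The class `ConservativeBarriers` and its members are hypothetical names used for illustration.

```java
// Hypothetical class; the bracketed comments mark conceptual barrier positions only.
class ConservativeBarriers {
    int data;            // ordinary field
    volatile int ready;  // volatile field

    void write(int value) {
        data = value;    // normal write
        // [StoreStore] normal writes above are not reordered below the volatile write
        ready = 1;       // volatile write
        // [StoreLoad]  volatile write is not reordered with a later volatile read/write
    }

    int read() {
        int r = ready;   // volatile read
        // [LoadLoad]   normal reads below are not reordered above the volatile read
        // [LoadStore]  normal writes below are not reordered above the volatile read
        return r == 1 ? data : -1; // normal read of data; -1 means "not written yet"
    }
}
```

A reader thread that sees `ready == 1` in `read()` is therefore also guaranteed to see the value stored in `data` by `write(int)`.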
The memory barrier insertion strategy described above for volatile writes and volatile reads is very conservative. In actual execution, the compiler can omit unnecessary barriers case by case, as long as the volatile write-read memory semantics are preserved. Let's illustrate this with specific example code:
For the readAndWrite() method, the compiler can optimize as follows when generating bytecode:
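A sketch of what such a method and its trimmed barrier sequence might look like is given below. It is an assumption, not the article's own listing: the class `VolatileBarrierExample`, its fields `a`, `v1` and `v2`, and the method body are hypothetical, chosen so that the method performs two volatile reads, one normal write, and two volatile writes before returning. The comments show one plausible way the conservative barriers could be reduced.

```java
// Hypothetical example; field names and the barrier annotations are illustrative assumptions.
class VolatileBarrierExample {
    int a;                 // ordinary field
    volatile int v1 = 1;
    volatile int v2 = 2;

    void readAndWrite() {
        int i = v1;        // first volatile read
        // LoadLoad kept. LoadStore omitted: the next operation is a volatile read, not a normal write.
        int j = v2;        // second volatile read
        // LoadLoad omitted: no normal read follows. LoadStore kept: a normal write follows.
        a = i + j;         // normal write
        // StoreStore kept.
        v1 = i + 1;        // first volatile write
        // StoreLoad omitted: the next operation is a volatile write. StoreStore kept.
        v2 = j * 2;        // second volatile write
        // StoreLoad kept: the method returns here, so what follows is unknown.
    }
}
```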
Note that the last StoreLoad barrier cannot be omitted, because the method returns immediately after the second volatile write. At that point the compiler may be unable to determine whether a volatile read or write will follow, so to be safe it usually inserts a StoreLoad barrier here.
The above optimization applies to any processor platform. Because different processors have memory models of differing strictness, barrier insertion can be optimized further according to the specific processor memory model. Take the x86 processor as an example: in the figure above, every barrier except the last StoreLoad barrier is omitted.
The volatile read and write under the conservative strategy above can be optimized as follows on the x86 processor platform:
As mentioned earlier, x86 processors only reorder write-read operations. x86 does not reorder read-read, read-write, or write-write operations, so the memory barriers corresponding to these three operation types are omitted on x86. On x86, JMM only needs to insert a StoreLoad barrier after a volatile write to correctly implement volatile write-read memory semantics. This means that on x86 processors a volatile write costs much more than a volatile read (because executing a StoreLoad barrier is relatively expensive).
Why JSR-133 enhanced the memory semantics of volatile
In the old Java memory model before JSR-133, reordering between volatile variables was not allowed, but reordering between volatile variables and ordinary variables was. In the old memory model, the VolatileExample sample program could be reordered to execute in the following timing:
In the old memory model, when there is no data dependency between 1 and 2, 1 and 2 may be reordered (and similarly 3 and 4). As a result, when reader thread B executes 4, it may not see the change to the shared variable that writer thread A made when executing 1.
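The sample program is not reproduced in this section. A minimal sketch consistent with the four numbered operations could look like the following; the field names, the return value of `reader()`, and the use of -1 as a "flag not set" marker are assumptions made for illustration.

```java
// Hypothetical reconstruction; only the numbering 1 to 4 comes from the text.
class VolatileExample {
    int a = 0;                      // ordinary shared variable
    volatile boolean flag = false;  // volatile variable

    void writer() {                 // executed by writer thread A
        a = 1;                      // 1: normal write
        flag = true;                // 2: volatile write
    }

    int reader() {                  // executed by reader thread B
        if (flag) {                 // 3: volatile read
            return a;               // 4: normal read
        }
        return -1;                  // flag not set yet
    }
}
```

If 1 and 2 are reordered, thread B can observe `flag == true` and still read `a == 0`. With the semantics introduced by JSR-133, a call to `reader()` that sees the flag set returns 1.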
Therefore, in the old memory model, a volatile write-read did not have the memory semantics of a monitor release-acquire. To provide a mechanism for inter-thread communication that is more lightweight than a monitor lock, the JSR-133 expert group decided to strengthen volatile memory semantics: strictly restrict the compiler and processor from reordering volatile variables with ordinary variables, so that a volatile write-read has the same memory semantics as a monitor release-acquire. In terms of the compiler reordering rules and the processor memory barrier insertion strategy, any reordering between a volatile variable and an ordinary variable that could break the memory semantics of volatile is prohibited by those rules and that strategy.
volatile only guarantees atomicity for the read/write of a single volatile variable, whereas the mutual exclusion of a monitor lock ensures atomicity for the execution of the whole critical section. In terms of functionality, a monitor lock is more powerful than volatile; in terms of scalability and execution performance, volatile has the advantage. Readers who want to replace a monitor lock with volatile in their programs should do so with care.
Scenarios for using the volatile keyword
The synchronized keyword prevents multiple threads from executing a piece of code at the same time, which can noticeably reduce program execution efficiency. In some cases the volatile keyword performs better than the synchronized keyword, but note that volatile cannot replace synchronized, because volatile cannot guarantee the atomicity of an operation. Generally speaking, using volatile requires the following two conditions:
1) the write operation to the variable does not depend on the current value
2) the variable is not included in an invariant with other variables
In fact, these conditions indicate that the valid values that can be written to a volatile variable are independent of any program state, including the variable's own current state.
My understanding is that these two conditions are what ensure the operation is atomic, so that a program using the volatile keyword executes correctly under concurrency.
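The first condition can be illustrated with a counter. The following is a minimal sketch with hypothetical names (`CounterDemo`, `volatileCount`, `incrementLocked`): `volatileCount++` reads the current value and then writes a new one, so the write depends on the current value and updates can be lost, while the synchronized increment is atomic.

```java
// Hypothetical demo; names are illustrative only.
class CounterDemo {
    volatile int volatileCount = 0; // visible to all threads, but ++ is not atomic
    private int lockedCount = 0;    // guarded by the monitor of this object

    synchronized void incrementLocked() {
        lockedCount++;
    }

    synchronized int getLockedCount() {
        return lockedCount;
    }

    // Starts the given number of threads, each incrementing both counters, and waits for them.
    void runWorkers(int threads, int incrementsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    volatileCount++;   // read-modify-write: may lose updates
                    incrementLocked(); // atomic under the monitor lock
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}
```

After `runWorkers(4, 10000)`, `getLockedCount()` is exactly 40000, whereas `volatileCount` can end up smaller because concurrent increments may overwrite each other.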
Here are some scenarios for using volatile in Java.
1. Status flag
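A minimal sketch of the status flag pattern follows; the class `StatusFlagDemo` and its members are hypothetical names used for illustration. One thread loops until another thread sets the flag. The write `stopped = true` does not depend on the flag's current value and the flag takes part in no invariant with other variables, so both conditions above are met.

```java
// Hypothetical demo; names are illustrative only.
class StatusFlagDemo {
    private volatile boolean stopped = false; // status flag shared between threads
    private long iterations = 0;              // touched only by the worker thread

    void work() {
        while (!stopped) {   // volatile read on every pass
            iterations++;    // placeholder for real work
        }
    }

    void stop() {
        stopped = true;      // the new value does not depend on the old one
    }

    // Starts a worker, requests a stop, and reports whether the worker exited in time.
    static boolean stopsWithin(long timeoutMillis) {
        StatusFlagDemo demo = new StatusFlagDemo();
        Thread worker = new Thread(demo::work);
        worker.setDaemon(true); // do not keep the JVM alive if the worker never stops
        worker.start();
        demo.stop();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

Without `volatile` on `stopped`, the worker would not be guaranteed to ever observe the update and could loop indefinitely.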
2. Double-checked locking
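A minimal sketch of double-checked locking for lazy initialization follows; the class name `LazySingleton` is hypothetical. The `instance` field must be volatile: otherwise the write that publishes the reference could be reordered with the writes that initialize the object, and another thread could see a non-null reference to a partially constructed instance.

```java
// Hypothetical demo; the class name is illustrative only.
class LazySingleton {
    private static volatile LazySingleton instance;

    private LazySingleton() {
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance;          // first check, without locking
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;                // second check, under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;            // volatile write publishes the fully built object
                }
            }
        }
        return local;
    }
}
```

Copying `instance` into a local variable keeps the common path down to a single volatile read; the pattern is also correct without that refinement.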