The volatile keyword that Java interviewers love to ask about
This article covers a knowledge point that comes up in almost every Java interview: the volatile keyword. It examines volatile from every angle, and I hope that after reading it you will be able to handle any volatile-related interview question with confidence.
In Java-related job interviews, many interviewers like to probe a candidate's understanding of Java concurrency. Taking the volatile keyword as a small entry point, they can keep digging: the Java Memory Model (JMM) and the core properties of Java concurrent programming come up first, and going deeper they can also examine the JVM's underlying implementation and related operating-system knowledge. Let's walk through a hypothetical interview to build an in-depth understanding of the volatile keyword!
Interviewer: How is your Java concurrency? Tell me about your understanding of the volatile keyword.
As I understand it, a shared variable modified by volatile has the following two properties:
1. It guarantees memory visibility of operations on the variable across threads; 2. It forbids instruction reordering.
Interviewer: Can you elaborate on what memory visibility is, and what reordering is?
There's a lot to cover, so let me start with the Java Memory Model. The Java Virtual Machine specification defines the Java Memory Model (JMM) to shield programs from the memory-access differences between various hardware and operating systems, so that Java programs achieve consistent memory-access behavior on every platform. In short, a CPU executes instructions far faster than main memory can serve them, and the gap is more than an order of magnitude, so processor designers inserted several levels of cache between the CPU and main memory. The JMM abstracts this arrangement: it stipulates that all variables are stored in main memory, which is analogous to ordinary RAM, while each thread also has its own working memory, which for ease of understanding you can think of as CPU registers or cache. A thread therefore operates mainly on its working memory: it can only access its own working memory directly, and values must be synchronized back to main memory before and after it works on them. That sounds confusing, so let me grab a piece of paper and draw it:
When a thread executes, it first reads a variable's value from main memory, loads it into a copy in its working memory, and hands it to the processor for execution. After execution, the processor assigns the result to the working-memory copy, and the working memory then transfers the value back to main memory, updating the value there. Splitting storage into working memory and main memory speeds things up, but it also brings problems. Take the following example:
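The original snippet is missing from this copy of the article; based on the surrounding discussion (two threads each incrementing a shared variable once), a minimal reconstruction, with class and field names of my own choosing, might be:

```java
public class SharedInc {
    static int i = 0;   // shared variable, initial value 0

    public static void main(String[] args) throws InterruptedException {
        Runnable inc = () -> i++;   // three steps: read i, add 1, write back
        Thread t1 = new Thread(inc);
        Thread t2 = new Thread(inc);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(i);   // usually 2, but can be 1 if the steps interleave
    }
}
```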
Suppose the initial value of i is 0. When only one thread executes the code, the result is certainly 1. When two threads execute it, will the result be 2? Not necessarily. This situation can occur:
If the two threads interleave as shown above, the final value of i is 1. And if the last write-back takes effect slowly, a read of i at that moment may even return 0. This is the cache-inconsistency problem. Now for the question you just asked: the JMM is built around how to handle three properties of concurrency, namely atomicity, visibility, and ordering. Solving these three problems solves cache inconsistency. volatile is concerned with visibility and ordering.
Interviewer: What about these three properties?
1. Atomicity: in Java, reads and assignments of basic data types are atomic operations. An atomic operation is one that cannot be interrupted: it either completes entirely or does not execute at all. For example:
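The example code did not survive in this copy; judging from the discussion that follows, the four statements were most likely these (wrapped here in a hypothetical helper so they can run):

```java
public class AtomicityDemo {
    static int[] demo() {
        int i = 2;      // 1. assigning a constant: a single write, atomic
        int j = i;      // 2. read i, then write j: two steps, not atomic as a whole
        i++;            // 3. read i, add 1, write back: three steps, not atomic
        i = i + 1;      // 4. equivalent to i++: three steps, not atomic
        return new int[] { i, j };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("i=" + r[0] + ", j=" + r[1]);
    }
}
```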
Of the four operations above, i = 2 is the assignment of a constant and is a single atomic write. j = i looks atomic, but it actually takes two steps: read the value of i, then assign it to j, so it is not an atomic operation. i++ and i = i + 1 are equivalent: read the value of i, add 1, then write the result back to main memory, which is three steps. That is why, in the earlier example, the final value can vary: the operation does not satisfy atomicity. In short, only a simple read, or an assignment of a literal value, is atomic; assigning from another variable adds the extra step of reading that variable. One exception: the virtual machine specification allows 64-bit data types (long and double) to be handled as two 32-bit operations, although current JDK implementations still treat them atomically. The JMM guarantees only this basic atomicity; for compound operations like i++ you must use synchronized or Lock to make the whole block atomic, and before releasing the lock the thread must flush the value of i back to main memory.

2. Visibility: Java provides visibility through volatile. When a variable is modified by volatile, any write to it is immediately flushed to main memory, and other threads that need to read the variable will fetch the new value from main memory. Ordinary variables give no such guarantee. Visibility can also be achieved with synchronized and Lock, since a thread flushes shared variables back to main memory before releasing a lock, but synchronized and Lock are more expensive.

3. Ordering: the JMM allows the compiler and processor to reorder instructions, but it mandates as-if-serial semantics: no matter how instructions are reordered, the result of a single-threaded program must not change. For example, consider the following program segment:
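The snippet is missing here; a minimal reconstruction consistent with the stated result of 3.14 (the exact literals are my guess) could be:

```java
public class AsIfSerial {
    public static double compute() {
        double a = 3.0;     // A: independent of B, may be reordered with it
        double b = 0.14;    // B: independent of A
        double c = a + b;   // C: depends on both A and B, cannot move before them
        return c;
    }

    public static void main(String[] args) {
        System.out.println(compute());   // 3.14 in either execution order
    }
}
```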
The statements above can execute in the order A -> B -> C, producing 3.14, but they could equally execute as B -> A -> C: because A and B are two independent statements they may be reordered, while C depends on both A and B and cannot be moved in front of them. The JMM guarantees that reordering will not affect single-threaded execution, but problems easily arise with multiple threads. For example, this code:
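The code block was lost from this copy; a plausible reconstruction from the numbered statements discussed below (1 through 4), with names of my own choosing, is:

```java
public class ReorderExample {
    int a = 0;
    boolean flag = false;   // a plain field: statements 1 and 2 may be reordered
    int ret = 0;

    public void write() {
        a = 2;              // 1
        flag = true;        // 2
    }

    public void multiply() {
        if (flag) {         // 3
            ret = a * a;    // 4: may compute 0 if 1 and 2 were reordered
        }
    }
}
```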
Suppose two threads execute the code above: thread 1 calls write() first, then thread 2 calls multiply(). Must ret end up as 4? Not necessarily:
As shown in the figure, statements 1 and 2 in the write() method are reordered. Thread 1 first assigns flag = true; execution then switches to thread 2, which computes ret directly; only afterwards does thread 1 assign a = 2, one step too late. Adding the volatile keyword to flag forbids that reordering and guarantees the program's ordering; the heavyweight synchronized or Lock would also work, since they ensure that the locked region executes as a unit. In addition, the JMM has some inherent ordering that holds without any synchronization, usually called the happens-before principle. "JSR-133: Java Memory Model and Thread Specification" defines the following happens-before rules:

1. Program order rule: each operation in a thread happens-before every subsequent operation in that thread.
2. Monitor lock rule: unlocking a monitor happens-before every subsequent lock of that monitor.
3. Volatile variable rule: a write to a volatile field happens-before every subsequent read of that field.
4. Transitivity: if A happens-before B and B happens-before C, then A happens-before C.
5. Thread start rule: if thread A executes threadB.start() (starting thread B), then that start() call happens-before any operation in thread B.
6. Thread join rule: if thread A executes threadB.join() and it returns successfully, then any operation in thread B happens-before thread A's successful return from join().
7. Interrupt rule: a call to a thread's interrupt() method happens-before the interrupted thread's code detects the interrupt; the detection can be done with Thread.interrupted().
8. Finalizer rule: the completion of an object's initialization happens-before the start of its finalize() method.
Rule 1, the program order rule, says that within one thread all operations appear ordered; the JMM still allows reordering as long as the execution result is the same, so happens-before here guarantees only the correctness of a single thread's result, not of multithreaded execution. Rule 2, the monitor rule, is easy to understand: before a lock can be acquired, the previous holder must have released it. Rule 3 is the volatile rule we are discussing: if one thread writes a volatile variable and another thread then reads it, the write must precede the read. Rule 4 is the transitivity of happens-before. The remaining rules will not be repeated one by one here.
Interviewer: How does the volatile keyword relate to the three properties of concurrent programming?
Recall the volatile variable rule: a write to a volatile field happens-before every subsequent read of that field. In practice, if a variable is declared volatile, then whenever I read it I always see its latest value. "Latest" means that whichever thread writes to the variable, the write is immediately flushed to main memory, and my read then fetches that newly written value from main memory. In other words, the volatile keyword guarantees visibility and ordering. Let's continue with the code above as an example:
This code is troubled by more than reordering: even if statements 1 and 2 are not reordered, statement 3 will not go smoothly. Suppose thread 1 executes write() first and thread 2 then executes multiply(). Thread 1 assigns flag = true in its working memory but does not necessarily write it back to main memory immediately, so when thread 2 runs multiply() and reads flag from main memory, the value may still be false, and the statement inside the braces will not execute. If the code is changed to the following:
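The changed code is missing here; the natural change implied by the text is simply marking flag as volatile. A sketch, with names matching my earlier reconstruction:

```java
public class VolatileExample {
    int a = 0;
    volatile boolean flag = false;  // volatile: 1 cannot be reordered after 2,
                                    // and the write to flag is flushed to main memory
    int ret = 0;

    public void write() {
        a = 2;              // 1
        flag = true;        // 2: volatile write
    }

    public void multiply() {
        if (flag) {         // 3: volatile read, sees the write immediately
            ret = a * a;    // 4: guaranteed to see a == 2
        }
    }
}
```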
Now, if thread 1 executes write() first and thread 2 then executes multiply(), the happens-before principle gives us three applicable rules: the program order rule, 1 happens-before 2 and 3 happens-before 4 (volatile also forbids instruction reordering, so 1 really executes before 2); the volatile rule, 2 happens-before 3; and the transitivity rule, 1 happens-before 4. When a volatile variable is written, the JMM flushes the shared variables in the thread's local (working) memory to main memory; when a volatile variable is read, the JMM invalidates the thread's local memory, forcing the thread to re-read the shared variable from main memory.
Interviewer: Those two memory semantics of volatile guarantee visibility and ordering, but can volatile guarantee atomicity?
My answer is that it cannot guarantee atomicity. Or rather, it is atomic only for reads and writes of a single volatile variable; it can do nothing for compound operations such as volatile++. Take the following example:
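The example code is missing from this copy; from the discussion (an increase() method, a volatile field inc, two or more threads, and an expected total of 10000) it was very likely the classic 10-threads-times-1000-increments test. A reconstruction, with a helper method added so the result can be inspected:

```java
public class VolatileCounter {
    public volatile int inc = 0;

    public void increase() {
        inc++;   // compound operation: read inc, add 1, write back -- not atomic
    }

    // 10 threads, each incrementing 1000 times; returns the final count
    public static int run() throws InterruptedException {
        VolatileCounter c = new VolatileCounter();
        Thread[] threads = new Thread[10];
        for (int t = 0; t < 10; t++) {
            threads[t] = new Thread(() -> {
                for (int k = 0; k < 1000; k++) {
                    c.increase();
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            th.join();
        }
        return c.inc;   // often less than 10000: increments get lost
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```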
In principle the result should be 10000, but when run it is very likely a value less than 10000. Some might object: doesn't volatile guarantee visibility? One thread should see another thread's modification of inc immediately! But inc++ is a compound operation: read the value of inc, increment it, then write it back to main memory. Suppose thread A reads inc as 10 and is then blocked; since it has not yet modified the variable, the volatile rule is not triggered. Thread B now also reads inc; the value in main memory is still 10. B increments it and immediately writes 11 back to main memory. Then it is thread A's turn to execute: its working memory still holds 10, so it increments that and writes back 11 again. So although the two threads each executed increase() once, the result increased only once. Some say: doesn't volatile invalidate the cache line? But before thread B acted, thread A had not yet modified inc, so thread B still correctly read 10. Others say: when thread B writes 11 back to main memory, won't it invalidate thread A's cache line? Yes, but thread A had already completed its read; a thread re-reads from main memory only if it finds its cache line invalid at the moment of the read, so thread A can only continue with its own increment. To sum up, in this compound-operation scenario, volatile cannot preserve atomicity. In the earlier flag example, however, the reads and writes of flag are single-step operations, so atomicity still holds there.
To guarantee atomicity you must use synchronized, Lock, or the atomic classes under the java.util.concurrent package, which wrap increment (add 1), decrement (subtract 1), addition (add a number), and subtraction (subtract a number) of basic data types so that those operations are atomic.
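As a sketch of the atomic-class approach, the broken counter above can be rewritten with AtomicInteger (class and method names here are my own; incrementAndGet is the standard atomic read-modify-write call):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private final AtomicInteger inc = new AtomicInteger(0);

    public void increase() {
        inc.incrementAndGet();   // atomic read-modify-write, implemented with CAS
    }

    // 10 threads, each incrementing 1000 times; returns the final count
    public static int run() throws InterruptedException {
        AtomicCounter c = new AtomicCounter();
        Thread[] threads = new Thread[10];
        for (int t = 0; t < 10; t++) {
            threads[t] = new Thread(() -> {
                for (int k = 0; k < 1000; k++) {
                    c.increase();
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            th.join();
        }
        return c.inc.get();   // always 10000: no increment is lost
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

Note: for this to compile, inc must be accessible in run(); here it is read through the instance created inside run(), so the field access is fine within the same class.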
Interviewer: Good. Do you know the underlying implementation mechanism of volatile?
If you compare the assembly generated for code with and without the volatile keyword, you will find that the volatile version contains an extra lock-prefixed instruction. The lock prefix effectively acts as a memory barrier, which provides the following guarantees: 1. During reordering, subsequent instructions cannot be reordered to a position before the memory barrier. 2. It forces this CPU's cache to be written back to memory. 3. That write-back also causes other CPUs (or other cores) to invalidate their caches, which makes the newly written value visible to other threads.
Interviewer: Where would you use volatile? Give some examples.
1. As a status flag, just like the flag above; let me show it once more:
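The code block is missing here; a minimal sketch of the status-flag idiom (class and method names are my own invention) might be:

```java
public class StatusFlagTask implements Runnable {
    private volatile boolean shutdown = false;   // status flag

    public void shutdown() {
        shutdown = true;   // the write is immediately visible to the worker thread
    }

    @Override
    public void run() {
        while (!shutdown) {
            // keep working until another thread requests shutdown;
            // without volatile, this loop might never see the update
        }
    }
}
```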
Marking such a read/write variable volatile guarantees that modifications are immediately visible to other threads, with some efficiency gain over synchronized and Lock. 2. In the singleton pattern, the classic double-checked locking (DCL) implementation:
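The DCL code itself did not survive in this copy; the standard form, which the following line describes, is sketched here:

```java
public class Singleton {
    // volatile forbids reordering inside "instance = new Singleton()"
    // (allocate -> initialize -> publish), so no thread can observe
    // a partially constructed object
    private static volatile Singleton instance;

    private Singleton() { }

    public static Singleton getInstance() {
        if (instance == null) {                 // first check, without the lock
            synchronized (Singleton.class) {
                if (instance == null) {         // second check, with the lock
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}
```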
This is a lazy singleton: the object is created only when it is first used. volatile is added to instance to prevent the instructions of the initialization from being reordered.
Summary
That is everything about the volatile keyword that Java interviewers like to ask. I hope it helps. Interested readers can continue with other related topics on this site. If anything is lacking, please leave a comment to point it out. Thank you for your support!