Detailed explanation of synchronized implementation principle in Java
I remember that when I first started learning Java, synchronized was the first thing I reached for whenever multithreading came up. To us at that time, synchronized seemed magical and powerful. We called it "synchronization", and it became our cure-all for multithreading problems. As we learned more, however, we found out that synchronized is a heavyweight lock. Compared with Lock, it looked so bulky that we came to consider it inefficient and gradually abandoned it. In fact, with the various optimizations made to synchronized in Java SE 1.6, it is no longer that heavy. Follow along as I explore how synchronized is implemented and how Java optimizes it: the lock optimization mechanisms, the lock storage structure, and the lock upgrade process.
Implementation principle
synchronized ensures that, at any moment, only one thread can enter the critical section of a synchronized method or code block. It also guarantees the memory visibility of shared variables.
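As a small illustration of both guarantees, the sketch below lets several threads increment a shared counter. The class name SafeCounter and the helper run are hypothetical names chosen for this example; the point is only that, with synchronized on both methods, no increment is lost and the final read sees every update.

```java
// Several threads increment one shared counter. synchronized makes each
// read-modify-write atomic and makes the result visible to other threads.
final class SafeCounter {
    private int count = 0;

    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts `threads` workers that each call increment() `perThread` times,
    // waits for all of them, and returns the final count.
    static int run(int threads, int perThread) {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(SafeCounter.run(4, 10_000)); // prints 40000
    }
}
```

Removing synchronized from increment() would typically make the printed total fall short, because count++ is a read, an add, and a write that can interleave between threads.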
Every object in Java can be used as a lock, which is the basis on which synchronized works:
Ordinary synchronized method: the lock is the current instance object.
Static synchronized method: the lock is the Class object of the current class.
Synchronized block: the lock is the object in the parentheses.
When a thread reaches a synchronized code block, it must first acquire the lock before it can execute the code. When it exits normally or throws an exception, it must release the lock. How is this mechanism implemented? Let's start with a simple piece of code:
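A minimal sketch of the three cases can be checked with Thread.holdsLock, which reports whether the current thread holds the monitor of a given object. The class LockTargets and its method names are hypothetical and exist only for this illustration.

```java
// Shows which object is locked in each form of synchronized,
// and that the lock is released when the block exits by throwing.
final class LockTargets {
    private final Object guard = new Object();

    // Ordinary synchronized method: the lock is `this`.
    synchronized boolean instanceMethodHoldsThis() {
        return Thread.holdsLock(this);
    }

    // Static synchronized method: the lock is LockTargets.class.
    static synchronized boolean staticMethodHoldsClass() {
        return Thread.holdsLock(LockTargets.class);
    }

    // Synchronized block: the lock is the object in the parentheses.
    boolean blockHoldsGuard() {
        synchronized (guard) {
            return Thread.holdsLock(guard);
        }
    }

    // Leaving the block through an exception still releases the lock.
    boolean releasedAfterException() {
        try {
            synchronized (guard) {
                throw new IllegalStateException("leave the block abruptly");
            }
        } catch (IllegalStateException e) {
            return !Thread.holdsLock(guard);
        }
    }

    public static void main(String[] args) {
        LockTargets t = new LockTargets();
        System.out.println(t.instanceMethodHoldsThis());
        System.out.println(LockTargets.staticMethodHoldsClass());
        System.out.println(t.blockHoldsGuard());
        System.out.println(t.releasedAfterException());
    }
}
```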
Use the javap tool to inspect the generated class file and analyze how synchronized is implemented.
As can be seen from the above, a synchronized code block is implemented with the monitorenter and monitorexit instructions. A synchronized method (this cannot be seen among the instructions here; you need to look at the underlying implementation of the JVM) relies on the ACC_SYNCHRONIZED flag among the method's modifiers.
Synchronized code block: the monitorenter instruction is inserted at the beginning of the synchronized block, and the monitorexit instruction is inserted at its end. The JVM must ensure that every monitorenter has a matching monitorexit. Every object has a monitor associated with it, and when the monitor is held, the object is locked. When a thread executes the monitorenter instruction, it tries to take ownership of the monitor associated with the object, that is, it tries to acquire the object's lock.
Synchronized method: a synchronized method is compiled into ordinary method call and return instructions, such as invokevirtual and areturn. There is no special bytecode instruction for a method marked synchronized. Instead, the synchronized flag bit in the access_flags field of the method's entry in the class file's method table is set to 1, which marks the method as synchronized. The lock object is the object on which the method is invoked or, for a static method, the Klass that represents the method's class inside the JVM. (Reference: https://www.oudahe.com/p/42946/ )
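The difference between the two forms can also be observed from Java code. The sketch below uses reflection: Modifier.isSynchronized tests the synchronized bit of a method's modifiers, which corresponds to the ACC_SYNCHRONIZED access flag. The class SyncFlagDemo and its methods are hypothetical names for this example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// A synchronized method carries the flag in its modifiers; a method that
// only contains a synchronized block does not, because the block is
// compiled to monitorenter/monitorexit instructions instead.
final class SyncFlagDemo {
    synchronized void syncMethod() {
    }

    void syncBlock() {
        synchronized (this) {
        }
    }

    // Returns true if the named no-argument method has the synchronized flag.
    static boolean hasSyncFlag(String methodName) {
        try {
            Method m = SyncFlagDemo.class.getDeclaredMethod(methodName);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasSyncFlag("syncMethod")); // prints true
        System.out.println(hasSyncFlag("syncBlock"));  // prints false
    }
}
```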
Let's continue the analysis, but before going deeper, we need to understand two important concepts: Java object header and monitor.
Java object header, monitor
The Java object header and the monitor are the foundation on which synchronized is built. Both concepts are introduced in detail below.
Java object header
The lock used by synchronized is stored in the Java object header, so what is the Java object header? In the HotSpot virtual machine, the object header consists mainly of two parts: the mark word and the Klass pointer (type pointer). The Klass pointer points from the object to its class metadata, and the virtual machine uses it to determine which class the object is an instance of. The mark word stores the runtime data of the object itself. It is the key to implementing lightweight locks and biased locks, so it is described in detail below.
Mark Word
The mark word stores the runtime data of the object itself, such as the hash code, GC generational age, lock status flags, locks held by the thread, biased thread ID, and biased timestamp. A Java object header generally occupies two machine words (in a 32-bit virtual machine, one machine word is 4 bytes, that is, 32 bits). If the object is an array, however, three machine words are required: the JVM can determine the size of an ordinary Java object from its metadata, but it cannot determine the size of an array from the array's metadata, so it uses one extra word to record the array length. The following figure shows the storage structure of the Java object header (32-bit virtual machine):
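The word counts in this paragraph translate into a simple calculation. The sketch below covers only the 32-bit case described here; HeaderSize32 and headerBytes are hypothetical names, and 64-bit virtual machines (with or without compressed pointers) use different sizes that this sketch does not model.

```java
// Header size on a 32-bit VM, following the text: mark word + Klass pointer,
// plus one more word for the length when the object is an array.
final class HeaderSize32 {
    static final int WORD_BYTES = 4; // one machine word on a 32-bit VM

    static int headerBytes(boolean isArray) {
        int words = isArray ? 3 : 2;
        return words * WORD_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(headerBytes(false)); // prints 8
        System.out.println(headerBytes(true));  // prints 12
    }
}
```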
Object header information is an extra storage cost unrelated to the data defined by the object itself. For the sake of space efficiency, the mark word is designed as a non-fixed data structure so that it can store as much data as possible in a very small space. It reuses its own storage space according to the state of the object; in other words, the mark word changes as the program runs. The possible states are as follows (32-bit virtual machine):
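As a rough illustration of how one 32-bit word can be reused for different states, the sketch below decodes the low bits of a mark word. The bit layout is an assumption taken from the commonly documented 32-bit HotSpot layout (two lock bits, one biased bit, four age bits, and a 25-bit hash when unlocked), and the class MarkWord32 is a hypothetical helper, not a JVM API.

```java
// Decodes a 32-bit mark word under the assumed layout:
//   bits 0-1  lock flag      (01 unlocked/biased, 00 lightweight,
//                             10 heavyweight, 11 GC mark)
//   bit  2    biased flag    (only meaningful when lock flag is 01)
//   bits 3-6  GC age
//   bits 7-31 identity hash  (only in the unlocked state)
final class MarkWord32 {
    enum State { UNLOCKED, BIASED, LIGHTWEIGHT_LOCKED, HEAVYWEIGHT_LOCKED, GC_MARKED }

    static State state(int markWord) {
        int lockBits = markWord & 0b11;
        boolean biasedBit = (markWord & 0b100) != 0;
        switch (lockBits) {
            case 0b00: return State.LIGHTWEIGHT_LOCKED;
            case 0b10: return State.HEAVYWEIGHT_LOCKED;
            case 0b11: return State.GC_MARKED;
            default:   return biasedBit ? State.BIASED : State.UNLOCKED;
        }
    }

    // GC age; present in the unlocked and biased states.
    static int age(int markWord) {
        return (markWord >>> 3) & 0xF;
    }

    // Identity hash; only meaningful when state(markWord) is UNLOCKED.
    static int identityHash(int markWord) {
        return markWord >>> 7;
    }

    public static void main(String[] args) {
        int unlocked = (0x1234 << 7) | (5 << 3) | 0b001;
        System.out.println(state(unlocked));
        System.out.println(age(unlocked));
        System.out.println(Integer.toHexString(identityHash(unlocked)));
    }
}
```

In the locked states the upper bits hold a pointer (to a lock record or to the monitor) instead of the hash and age, which is exactly the reuse of storage space described above.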
After a brief introduction to Java object headers, let's look at monitor.
Monitor
What is a monitor? We can think of it as a synchronization tool or a synchronization mechanism, and it is usually described as an object. Just as everything is an object, every Java object is a born monitor: by design, every Java object carries an invisible lock from the moment it is created, which is called the intrinsic lock or monitor lock.
A monitor is a thread-private data structure. Each thread has a list of available monitor records, and there is also a global available list. Each locked object is associated with a monitor (the lock word in the mark word of the object header points to the starting address of the monitor), and the Owner field in the monitor stores the unique identifier of the thread that owns the lock, indicating that the lock is held by that thread. Its structure is as follows:
Owner: initially null, meaning that no thread currently owns the monitor record. When a thread successfully acquires the lock, its unique identifier is stored here. When the lock is released, the field is set back to null.
EntryQ: associated with a system mutex (semaphore) that blocks all threads that failed to lock the monitor record.
RcThis: the number of threads blocked or waiting on the monitor record.
Nest: the counter used to implement lock reentrancy.
HashCode: stores the hash code value copied from the object header (possibly including the GC age).
Candidate: used to avoid unnecessary blocking or waiting for threads to wake up. Only one thread can own the lock at a time, so if the thread releasing the lock woke up every blocked or waiting thread each time, it would cause unnecessary context switches (from blocked to ready, and then blocked again after losing the race for the lock) and a serious loss of performance. Candidate has only two possible values: 0 means that no thread needs to be woken up, and 1 means that a successor thread should be woken up to compete for the lock.
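The roles of Owner and Nest can be sketched as a small model. This is only an illustration of the bookkeeping described above, not how HotSpot implements monitors: the class MonitorRecordModel and its methods are hypothetical, the model is not itself thread-safe, and EntryQ, RcThis, HashCode and Candidate are left out.

```java
// Models only the Owner and Nest fields of a monitor record.
final class MonitorRecordModel {
    private Thread owner; // null while no thread owns the record
    private int nest;     // how many times the owner has entered

    // Returns true if `t` owns the record after the call
    // (first entry or re-entry), false if another thread owns it.
    boolean tryEnter(Thread t) {
        if (owner == null) {
            owner = t;
            nest = 1;
            return true;
        }
        if (owner == t) {
            nest++; // re-entry by the same thread
            return true;
        }
        return false; // a real monitor would block the thread via EntryQ
    }

    // Undoes one entry; the record is free again when nest reaches zero.
    void exit(Thread t) {
        if (owner != t) {
            throw new IllegalMonitorStateException("not the owner");
        }
        nest--;
        if (nest == 0) {
            owner = null;
        }
    }

    Thread owner() {
        return owner;
    }

    int nest() {
        return nest;
    }
}
```

Entering twice and exiting once leaves the record owned with a nest count of 1, which is why a thread can call one synchronized method from another on the same object without blocking itself.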
Reference: talk about synchronized in Java concurrency