Detailed explanation of Java CAS underlying implementation principle examples
This article mainly introduces the detailed example of the underlying implementation principle of Java CAS. The example code is introduced in great detail, which has a certain reference value for everyone's study or work. Friends in need can refer to it
1、 Concept of CAS (compare and swap)
CAS, fully known as compare and swap, is a mechanism to solve the performance loss caused by using locks in the case of multi-threaded parallelism.
CAS (V, a, b). V is the memory address, a is the expected original value, and B is the new value. If the value of the memory address matches the expected original value, update the location value to the new value. Otherwise, it indicates that it has been updated by other threads, and the processor does not do anything; in either case, it will return the value of the location before the CAS instruction. We can use spin lock to cycle CAS, Reread the variable and try to modify it again, or you can abandon the operation.
2、 Generation of CAS (compareandswap)
Why CAS mechanism is needed? Let's start with a wrong phenomenon. We often use volatile keyword to modify a variable, indicating that the variable is a globally shared variable with visibility and order. But it's not atomic. For example, a common operation, a + +. This operation can be divided into three steps:
(1) Read a from memory
(2) Add 1 to a
(3) Re write the value of a to memory
There is no problem with this operation in the single thread state, but there will be various problems in multithreading. Because one thread may have added 1 to a and read the old value before writing to memory. It causes thread insecurity.
The volatile keyword can ensure the visibility and orderability of shared variables between threads and prevent the reordering of CPU instructions (DCL singleton), but it can not guarantee the atomicity of operations, so jdk1 After 5, CAS is introduced to ensure the security of thread operation by using CPU primitives.
CAS operation is supported by the processor and is a primitive. Primitives are the terms of operating system or computer network. It is a process composed of several instructions and used to complete certain functions. It is indivisible, that is, the execution of primitives must be continuous and cannot be interrupted in the execution process. For example, Intel processors compare and exchange instructions through the cmpxchg series.
3、 Research on the principle of CAS (compare and swap)
CAS is mainly implemented in the atomic package in JUC. Let's take the atomicinteger class as an example:
Through code tracing, it can be seen that CAS operations in Java are implemented through the unsafe class under the sun package, and the methods in the unsafe class are native methods, which are implemented locally by the JVM, so the final implementation is based on C and C + + operating on the operating system
Unsafe class, in sun Under misc package, it does not belong to Java standard. Unsafe class provides a series of operations to increase the ability of Java language, such as memory management, operation class / object / variable, multithread synchronization, etc
//var1为CAS操作的对象,offset为var1某个属性的地址偏移值,expected为期望值,var2为要设置的值,利用JNI来完成cpu指令的操作 public final native boolean compareAndSwapObject(Object var1,long var2,Object var4,Object var5); public final native boolean compareAndSwapInt(Object var1,int var4,int var5); public final native boolean compareAndSwapLong(Object var1,long var4,long var6); public native Object getObjectVolatile(Object var1,long var2); public native void putObjectVolatile(Object var1,Object var4);
Hotspot源码中关于unsafe的实现hotspot\src\share\vm\prims\unsafe.cpp UNSAFE_ENTRY(jboolean,Unsafe_CompareAndSwapInt(jnienv *env,jobject unsafe,jobject obj,jlong offset,jint e,jint x)) UnsafeWrapper("Unsafe_CompareAndSwapInt"); oop p = JNIHandles::resolve(obj);根据偏移量,计算value的地址。这里的offset就是 AtomaicInteger中的valueOffset jint* addr = (jint *) index_oop_from_field_offset_long(p,offset); return (jint)(Atomic::cmpxchg(x,addr,e)) == e; UNSAFE_END\hotspot\src\share\vm\runtime\atomic.cppunsigned Atomic::cmpxchg(unsigned int exchange_value,volatile unsigned int* dest,unsigned int compare_value) { assert(sizeof(unsigned int) == sizeof(jint),"more work to do"); return (unsigned int)Atomic::cmpxchg((jint)exchange_value,(volatile jint*)dest,(jint)compare_value); }根据操作系统类型调用不同平台下的重载函数,这个在预编译期间编译器会决定调用哪个平台下的重载
You can see that the "atomic:: cmpxchg" method is called, and the "atomic:: cmpxchg" method is in Linux_ X86 and windows_ X86 is implemented as follows
linux_x86底层实现\hotspot\src\os_cpu\linux_x86\vm\atomic_linux_x86.inline.hpp inline jint Atomic::cmpxchg (jint exchange_value,volatile jint* dest,jint compare_value) { int mp = os::is_MP(); __asm__ volatile (LOCK_IF_MP(%4) "cmpxchgl %1,(%3)" : "=a" (exchange_value) : "r" (exchange_value),"a" (compare_value),"r" (dest),"r" (mp) : "cc","memory"); return exchange_value; }
windows_x86底层实现 hotspot\src\os_cpu\windows_x86\vmatomic_linux_x86.inline.hpp inline jint Atomic::cmpxchg (jint exchange_value,jint compare_value) { // alternative for InterlockedCompareExchange int mp = os::is_MP(); __asm { mov edx,dest mov ecx,exchange_value mov eax,compare_value LOCK_IF_MP(mp) cmpxchg dword ptr [edx],ecx } }
Summary: according to the data query, the underlying implementation of CAS will have different overloads according to different operating systems, and the implementation of CAS is inseparable from the support of processors.
The core code is a cmpxchg instruction with lock prefix, namely lock cmpxchg DWORD PTR [EDX], ECX
Atomic:: cmpxchg method parsing:
MP is the return result of "OS:: is_mp()", and "OS:: is_mp()" is an inline function to judge whether the current system is multiprocessor.
If the current system is multiprocessor, the function returns 1.
Otherwise, 0 is returned.
LOCK_ IF_ MP (MP) determines whether to add the lock prefix to the cmpxchg instruction according to the value of MP.
If the MP determines that the current system is a multiprocessor (that is, the MP value is 1), add the lock prefix to the cmpxchg instruction.
Otherwise, do not prefix lock.
This is an optimization method. It is considered that it is not necessary to add lock prefix in a single processor environment. Lock prefix will be added only in the case of multi-core, because lock will lead to performance degradation. Cmpxchg is an assembly instruction that compares and exchanges operands.
4、 Advantages and disadvantages of CAS mechanism
4.1 advantages
CAS is not only an optimistic lock, but also a non blocking lightweight optimistic lock. What is non blocking? In fact, if a thread wants to obtain a lock, the other party will give a response to indicate whether the lock can be obtained. The performance is high when the resource competition is not fierce. Compared with synchronized weight lock, synchronized will perform more complex locking, unlocking and wake-up operations.
4.2 disadvantages
1) The cycle time is long, the overhead is large, and the CPU resources are occupied
2) Only atomic operations of one shared variable can be guaranteed
3) ABA problem
4.3 solving ABA problems
1) Add version number
2)AtomicStampedReference
In order to solve this problem, Java provides a marked atomic reference class "atomicstampedreference", which can ensure the correctness of CAS by controlling the version of variable value. Therefore, before using CAS, it is necessary to consider whether the "ABA" problem will affect the correctness of program concurrency. If the ABA problem needs to be solved, using traditional mutually exclusive synchronization may be more efficient than atomic class.
5、 Timing of CAS use
5.1 with fewer threads and short waiting time, spin lock can be used for CAS to try to get the lock, which is more efficient than synchronized
5.2 due to the large number of threads and long waiting time, it is not recommended to use spin lock, which occupies a high CPU
The above is the whole content of this article. I hope it will help you in your study, and I hope you will support us a lot.