Java Memory Model (JMM) and happens-before

Java programs run on the JVM, a virtual machine that presents its own abstraction of memory. So what is the Java Memory Model (JMM) for?

Consider a simple assignment:

int a=100;

The JMM answers the question of when a thread that reads variable a is guaranteed to see the value 100. That sounds trivial: surely the value can be read once it has been assigned?

But the source code only shows the order in which we wrote the statements. The instructions generated by the compiler do not necessarily follow the source order, and the processor may execute instructions out of order or in parallel. The JVM allows this reordering as long as, within a single thread, the result is the same as it would be under strictly serial execution. In addition, each processor has a local cache. While a result sits only in that cache, other threads cannot see it, and the order in which cached writes are committed to main memory may differ from the order in which they were made.

Any of these effects can lead to different results in a multithreaded environment. Most of the time, threads carry out their own tasks independently. Only when threads share data do they need to coordinate their operations.

The JMM is a set of minimum guarantees that the JVM must honor. It specifies when a write to a variable becomes visible to other threads.

Reordering

Reordering in the JVM was described above. Here is an example to make it concrete:

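import lombok.extern.slf4j.Slf4j;

// Requires Lombok and an SLF4J binding on the classpath; @Slf4j generates the log field.
@Slf4j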
public class Reorder {

    int x=0,y=0;
    int a=0,b=0;

    private  void reorderMethod() throws InterruptedException {

        Thread one = new Thread(()->{
            a=1;
            x=b;
        });

        Thread two = new Thread(()->{
            b=1;
            y=a;
        });
        one.start();
        two.start();
        one.join();
        two.join();
        log.info("{},{}",x,y);
    }

    public static void main(String[] args) throws InterruptedException {

        for (int i=0; i< 100; i++){
            new Reorder().reorderMethod();
        }
    }
}

The example above is a very simple concurrent program. Because it uses no synchronization, the execution order of threads one and two is not determined. Thread one may run before thread two, after it, or interleaved with it, and different execution sequences may produce different output.

Furthermore, although the code says a = 1 first and x = b second, the two statements do not depend on each other. The JVM is free to reorder them so that x = b runs first and a = 1 second, which leads to even more unexpected results.
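To see why an output of 0,0 would point to reordering, it can help to enumerate every interleaving that keeps each thread's two statements in program order. The following is a minimal single-threaded sketch; the class InterleavingDemo and its methods are illustrative names, not part of the article's repository, and the program only simulates the four assignments rather than running real threads.

```java
import java.util.Set;
import java.util.TreeSet;

final class InterleavingDemo {

    // Executes the four statements in the given order and returns "x,y".
    // Step codes: 0 is a=1, 1 is x=b (thread one); 2 is b=1, 3 is y=a (thread two).
    static String run(int[] order) {
        int a = 0, b = 0, x = 0, y = 0;
        for (int step : order) {
            switch (step) {
                case 0: a = 1; break;
                case 1: x = b; break;
                case 2: b = 1; break;
                case 3: y = a; break;
                default: throw new IllegalArgumentException("unknown step " + step);
            }
        }
        return x + "," + y;
    }

    // Collects the results of all six interleavings in which step 0 stays
    // before step 1 and step 2 stays before step 3 (program order preserved).
    static Set<String> programOrderOutcomes() {
        int[][] orders = {
            {0, 1, 2, 3}, {0, 2, 1, 3}, {0, 2, 3, 1},
            {2, 0, 1, 3}, {2, 0, 3, 1}, {2, 3, 0, 1}
        };
        Set<String> outcomes = new TreeSet<>();
        for (int[] order : orders) {
            outcomes.add(run(order));
        }
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println("program order only: " + programOrderOutcomes());
        // Both reads moved ahead of both writes, as a reordering could do.
        System.out.println("with reordering:    " + run(new int[] {1, 3, 0, 2}));
    }
}
```

None of the program-order interleavings yields 0,0. That result only appears once the read in each thread is allowed to move ahead of the write that precedes it in the source.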

Happens-Before

To define the order of operations in the Java memory model, the JMM specifies an ordering relationship over all operations in a program, called happens-before. To guarantee that operation B sees the result of operation A, whether A and B run in the same thread or in different threads, A and B must be related by happens-before. If two operations are not related by happens-before, the JVM may reorder them arbitrarily.

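Let's take a look at the happens-before rules:

1. Program order rule: each action in a thread happens before every action that comes later in that thread's program order.
2. Monitor lock rule: an unlock on a monitor happens before every subsequent lock on the same monitor.
3. Volatile variable rule: a write to a volatile field happens before every subsequent read of that field.
4. Thread start rule: a call to Thread.start happens before every action in the started thread.
5. Thread termination rule: every action in a thread happens before another thread detects that it has terminated, for example by returning from Thread.join.
6. Interruption rule: a thread calling interrupt on another thread happens before the interrupted thread detects the interrupt.
7. Finalizer rule: the end of an object's constructor happens before the start of its finalizer.
8. Transitivity: if A happens before B and B happens before C, then A happens before C.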

Rule 2 is easy to understand. While one thread holds a lock, no other thread can acquire it, so other threads must wait for the lock to be released before they can lock it and run their own logic.

Rules 4, 5, 6 and 7 are just as intuitive: something has to start before it can end, which matches our everyday understanding of how programs run.

Rule 8, transitivity, will be familiar to anyone who has studied mathematics.
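The lock, thread start and thread termination rules can be seen working together in a small counter. This is a sketch under stated assumptions: HappensBeforeRulesDemo and its field names are invented for illustration, and the comments mark which rule each step relies on.

```java
final class HappensBeforeRulesDemo {
    private int step = 0;     // plain field, written before the workers start
    private int counter = 0;  // only modified while holding the lock on this

    // Monitor lock rule: each unlock happens before the next lock on this,
    // so every increment sees the previous increment's result.
    private synchronized void increment() {
        counter += step;
    }

    int run(int perThread) {
        step = 1; // thread start rule: written before start(), so workers see it

        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                increment();
            }
        };
        Thread one = new Thread(task);
        Thread two = new Thread(task);
        one.start();
        two.start();
        try {
            one.join();
            two.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        // Thread termination rule: after join() returns, everything the
        // workers wrote is visible here, so this unlocked read is safe.
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(new HappensBeforeRulesDemo().run(1000));
    }
}
```

With two workers and perThread set to 1000, run returns 2000 on every execution, because each step is ordered by one of the rules rather than left to chance.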

Next, let's focus on how rule 3 combines with rule 1. Before that, let's summarize what happens-before does.

Because the JVM reorders the instructions it receives, the happens-before rules exist to guarantee the execution order where it matters. The operations covered by rules 2, 3, 4, 5, 6 and 7 can be thought of as reordering nodes. These nodes themselves may not be reordered. Only the instructions between them may be.

Combined with the program order rule (rule 1), this gives the real meaning: instructions written before a reordering node in the code must execute before that node does.

A reordering node is a dividing point whose position cannot move. Consider the following example:

Thread 1 has two instructions: set i = 1 and set volatile a = 2. Thread 2 also has two instructions: get volatile a and get i.

According to the theory above, the volatile write and the volatile read are two reordering nodes, and when the read in thread 2 sees the value written by thread 1, the write precedes the read (rule 3). By rule 1, set i = 1 comes before set volatile a = 2 in the code, and because the volatile write is a reordering node, set i = 1 must execute before it. In the same way, get volatile a executes before get i. By transitivity, set i = 1 happens before get i, so thread 2 is guaranteed to read i = 1.

This operation is called synchronization.
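The two-thread example above can be written out as a runnable sketch. The class VolatilePiggybackDemo and the observed field are assumptions added for illustration. The field names i and a follow the example in the text.

```java
final class VolatilePiggybackDemo {
    private int i = 0;          // plain field
    private volatile int a = 0; // volatile field, the reordering node
    private int observed = -1;  // what thread 2 read from i

    int run() {
        Thread reader = new Thread(() -> {
            // get volatile a: wait until the volatile write is visible
            while (a != 2) {
                Thread.yield();
            }
            // get i: ordered after the volatile read by program order
            observed = i;
        });
        Thread writer = new Thread(() -> {
            i = 1; // set i = 1: ordered before the volatile write
            a = 2; // set volatile a = 2
        });
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return observed; // safe to read after join()
    }

    public static void main(String[] args) {
        System.out.println(new VolatilePiggybackDemo().run());
    }
}
```

Once the reader has seen a == 2, the program order, volatile variable and transitivity rules together guarantee that it reads i as 1, even though i itself is not volatile. If a were a plain field, neither that guarantee nor the termination of the loop would be assured.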

Safe publication

We often use the singleton pattern to create a single shared object. Let's see what is wrong with the following approach:

public class Book {

    private static Book book;

    public static Book getBook(){
        if(book==null){
            book = new Book();
        }
        return book;
    }
}

The class above defines a getBook method that returns a Book object. Before returning, it checks whether book is null, and if it is, it creates a new Book object.

At first glance nothing seems wrong, but if you think carefully about the JMM's reordering rules, a problem appears. book = new Book() is a compound operation, not an atomic one. It can be roughly divided into three steps: 1. allocate memory, 2. initialize the object, and 3. assign the memory address to the book reference.

Steps 2 and 3 may be reordered, so another thread can see a non-null book and return it before initialization has finished, which leads to unpredictable errors. Even without reordering, two threads can both see null and each create their own instance.

According to the happens-before rules above, the simplest fix is to make the method synchronized:

public class Book {

    private static Book book;

    public static synchronized Book getBook(){
        if(book==null){
            book = new Book();
        }
        return book;
    }
}

Now let's look at an implementation that uses a static field:

public class BookStatic {
    private static BookStatic bookStatic= new BookStatic();

    public static BookStatic getBookStatic(){
        return bookStatic;
    }
}

The JVM performs static initialization after the class is loaded and before any thread uses it. The JVM acquires a lock during this initialization, so writes made during static initialization are visible to all threads.

The example above defines a static variable that is instantiated during static initialization. This approach is called eager initialization.

Next, let's look at the lazy initialization holder class idiom:


public class BookStaticLazy {

    private static class BookStaticHolder{
        private static BookStaticLazy bookStatic= new BookStaticLazy();
    }

    public static BookStaticLazy getBookStatic(){
        return BookStaticHolder.bookStatic;
    }
}

In the class above, BookStaticHolder is initialized, and the instance created, only when the getBookStatic method is first called.
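The laziness of the holder idiom can be observed by counting constructor calls. This is a minimal sketch: LazyHolderDemo, its CREATED counter and the Holder class are hypothetical names added for the demonstration and mirror the structure of BookStaticLazy.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LazyHolderDemo {
    // Counts constructor calls so the moment of creation can be observed.
    // Reading this field initializes LazyHolderDemo but not Holder.
    static final AtomicInteger CREATED = new AtomicInteger();

    private LazyHolderDemo() {
        CREATED.incrementAndGet();
    }

    // Holder is initialized by the JVM on first use, under the class
    // initialization lock, so INSTANCE is created exactly once.
    private static class Holder {
        static final LazyHolderDemo INSTANCE = new LazyHolderDemo();
    }

    static LazyHolderDemo getInstance() {
        return Holder.INSTANCE; // first call triggers Holder initialization
    }
}
```

Before the first call to getInstance, CREATED.get() is 0. After any number of calls it is 1, and every call returns the same object.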

Next, let's talk about double-checked locking.

public class BookDLC {
    private static volatile BookDLC bookDLC;

    public static BookDLC getBookDLC(){
        if(bookDLC == null ){
            synchronized (BookDLC.class){
                if(bookDLC ==null){
                    bookDLC=new BookDLC();
                }
            }
        }
        return bookDLC;
    }
}

The class above checks the value of bookDLC twice and only takes the lock when bookDLC is null. Everything looks perfect, but note that bookDLC must be declared volatile.

Without volatile, there is no happens-before relationship between the write to bookDLC inside the synchronized block and the unsynchronized first read of bookDLC, so a thread may get a reference to an instance that is only partially constructed. That is why the field is declared volatile.
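A quick way to exercise double-checked locking is to call the accessor from several threads at once and count the distinct instances they receive. The sketch below is illustrative only: DclDemo and distinctInstances are hypothetical names, and getInstance uses a common variation that copies the volatile field into a local variable so the fast path reads it only once, which differs slightly from the BookDLC code above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

final class DclDemo {
    private static volatile DclDemo instance;

    private DclDemo() { }

    static DclDemo getInstance() {
        DclDemo local = instance;          // first check, no lock
        if (local == null) {
            synchronized (DclDemo.class) {
                local = instance;          // second check, under the lock
                if (local == null) {
                    local = new DclDemo();
                    instance = local;      // volatile write publishes safely
                }
            }
        }
        return local;
    }

    // Calls getInstance() from several threads released at the same moment
    // and returns how many distinct instances they observed.
    static int distinctInstances(int threads) {
        Set<DclDemo> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    startGate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(getInstance());
            });
            workers[t].start();
        }
        startGate.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(8));
    }
}
```

A run like this shows that only one instance is created. It cannot prove that a version without volatile is broken, since a partially constructed object may never show up in a particular run. The guarantee comes from the memory model, not from testing.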

Initialization safety

To close the article, let's look at how objects with final fields are initialized in the constructor.

For a correctly constructed object, initialization safety guarantees that all threads see the values the constructor set for every final field, including anything reachable through a final field (such as the elements of a final array or the contents of a final HashMap).

import java.util.HashMap;

public class FinalSafe {
    private final HashMap<String,String> hashMap;

    public FinalSafe(){
        hashMap= new HashMap<>();
        hashMap.put("key1","value1");
    }
}

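In the example above, we declare a final field and initialize it in the constructor. The writes to the final field, and to the HashMap reachable through it, cannot be reordered with operations that follow the constructor, so any thread that obtains a reference to the object sees the fully populated map. This holds only if the this reference does not escape during construction and the map is not modified afterwards.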
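The guarantee matters most when the object is published without any synchronization. The following sketch is built on assumptions: FinalSafeDemo, its get method, the shared field and publishAndRead are hypothetical additions, and the field is deliberately left non-volatile to create a data race on the reference.

```java
import java.util.HashMap;
import java.util.Map;

final class FinalSafeDemo {
    private final Map<String, String> hashMap;

    FinalSafeDemo() {
        hashMap = new HashMap<>();
        hashMap.put("key1", "value1");
        // this does not escape, and the map is never modified again
    }

    String get(String key) {
        return hashMap.get(key);
    }

    // Deliberately not volatile: the reference is published through a race.
    static FinalSafeDemo shared;

    static String publishAndRead() {
        Thread publisher = new Thread(() -> shared = new FinalSafeDemo());
        publisher.start();

        // Racy read: this may see null or the new object. If it sees the
        // object, initialization safety guarantees a fully built map.
        FinalSafeDemo seen = shared;
        String racyValue = (seen == null) ? null : seen.get("key1");

        try {
            publisher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        // After join() the reference is certainly visible.
        return (racyValue != null) ? racyValue : shared.get("key1");
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead());
    }
}
```

Whichever path is taken, the value read is value1. If hashMap were not final, the racy path would have no such guarantee and could see an empty or partially built map.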

The examples in this article are available at https://github.com/ddean2009/learn-java-concurrency/tree/master/reorder

For more information, please visit flybean's blog
