Source code analysis of ConcurrentHashMap

1. Preface

   I have finally got to this class. I had put it off several times before, because the class is large and everything in it involves concurrency; frankly, the code is obscure and hard to follow. I spent about three days reading the important operations from start to finish, and today I will sort them out.

  OK, first a personal impression: the class is long, but most of the complexity lives in a handful of methods (putVal, initTable, transfer), which are covered one by one below.

2. Basic structure

1. Inheritance

AbstractMap

2. Implemented interfaces

It implements ConcurrentMap<K,V> and Serializable, but not Cloneable. That, too, is presumably for safety reasons!

3. Properties

1. Basic attributes

    // maximum capacity, same as HashMap
    private static final int MAXIMUM_CAPACITY = 1 << 30;
    // default capacity: 16
    private static final int DEFAULT_CAPACITY = 16;
    // largest possible array size
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
    // concurrency level, not modifiable, default 16 (kept only for compatibility)
    private static final int DEFAULT_CONCURRENCY_LEVEL = 16;
    // load factor 0.75
    private static final float LOAD_FACTOR = 0.75f;
    // bin length at which a list is converted to a tree: 8
    static final int TREEIFY_THRESHOLD = 8;
    // node count at which a tree is converted back to a list: 6
    static final int UNTREEIFY_THRESHOLD = 6;
    // minimum transfer stride (slots per resize task), a constant: 16
    private static final int MIN_TRANSFER_STRIDE = 16;
    // number of bits in sizeCtl used for the resize generation stamp
    private static int RESIZE_STAMP_BITS = 16;
    // maximum number of transfer helper threads, default 2^16 - 1
    private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1;
    // shift for recording the stamp in sizeCtl
    private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;
    // special hash values
    static final int MOVED = -1; // hash of a ForwardingNode whose bin is being transferred
    static final int TREEBIN = -2; // hash of a tree root (TreeBin)
    static final int RESERVED = -3; // hash of a transient ReservationNode (used by compute/computeIfAbsent)
    // number of available CPUs
    static final int NCPU = Runtime.getRuntime().availableProcessors();
    // the underlying array; note that it is volatile
    transient volatile Node<K,V>[] table;
    // the array used during resize; non-null only while resizing, also volatile
    private transient volatile Node<K,V>[] nextTable;

    // base counter value, used when there is no contention and as a fallback
    // during counter-cell contention
    private transient volatile long baseCount;

    // sizeCtl is a control flag:
    // -1 means the table is being initialized
    // -(1 + N) roughly means N threads are resizing (the high bits hold the resize stamp)
    // 0 means the table has not been initialized yet; a positive value before
    //   initialization is the initial capacity to use
    // after initialization, a positive value is the threshold for the next resize,
    //   i.e. 0.75 * current capacity, matching the load factor
    private transient volatile int sizeCtl;

In particular, pay attention to sizeCtl. Its meaning is not obvious at first, and it will come up again and again in the methods below!

2. Node

The same as in HashMap, except that the volatile keyword appears here. Everything else really is identical; only val and next are volatile, because they are the only two fields that can change after construction.

    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K,V> next;

        // the hashCode is interesting: HashMap uses Objects.hashCode(key) ^ Objects.hashCode(value),
        // but Objects delegates to the key's and value's own hashCode, so the principle is the same
        public final int hashCode() {
            return key.hashCode() ^ val.hashCode();
        }

        // val is volatile, but no mutation is offered through the Entry view;
        // setValue always throws
        public final V setValue(V value) {
            throw new UnsupportedOperationException();
        }
    }

3. Segment

    This data structure used to be particularly important: in earlier versions, basically every query, removal, and update went through it. In 1.8 its role has been reduced to serialization compatibility, and it contains no methods.

    static class Segment<K,V> extends ReentrantLock implements Serializable {
        private static final long serialVersionUID = 2249069246763182397L;
        final float loadFactor;

        Segment(float lf) {
            this.loadFactor = lf;
        }
    }

4. TreeNode

Because it extends Node directly, it adds only the tree links, and it has far fewer methods than the TreeNode in HashMap. That is mainly because the red-black tree operations do not live in this class but in TreeBin. TreeNode still exists because, when a bin is treeified, each Node is wrapped into a TreeNode and the TreeNodes are then wrapped into a TreeBin.

    static final class TreeNode<K,V> extends Node<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;
    }

5. TreeBin

    static final class TreeBin<K,V> extends Node<K,V> {
        TreeNode<K,V> root;    // uses the TreeNode above
        volatile TreeNode<K,V> first;
        volatile Thread waiter;
        volatile int lockState;
        // values for lockState
        static final int WRITER = 1; // set while holding write lock
        static final int WAITER = 2; // set when waiting for write lock
        static final int READER = 4; // increment value for setting read lock

        // plus a large number of red-black tree methods, not analyzed here
    }

4. Overview of main methods

1. Constructors

   The constructors are similar to HashMap's: they only validate parameters and assign a few fields, and do not allocate the array. Same principle of lazy loading: the array is initialized in put. The main value set here is the sizeCtl field mentioned earlier; when positive, it acts as the threshold. We never see a separate threshold field in use — sizeCtl plays that role. The other inputs are the load factor and the initial array size, defaulting to 0.75 and 16. Now, each constructor in turn:

    // empty body; the javadoc says a table of size 16 will be created,
    // lazily, on the first put
    public ConcurrentHashMap() {
    }

    // sets sizeCtl, which becomes the initial table size
    public ConcurrentHashMap(int initialCapacity) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException();
        // if the requested capacity is at least half of the maximum, allocate the maximum;
        // otherwise take 1.5 * initialCapacity + 1, rounded up to a power of two
        int cap = ((initialCapacity >= (MAXIMUM_CAPACITY >>> 1)) ? MAXIMUM_CAPACITY : tableSizeFor(initialCapacity + (initialCapacity >>> 1) + 1));
        this.sizeCtl = cap;
    }

    // same as HashMap
    public ConcurrentHashMap(Map<? extends K,? extends V> m) {
        this.sizeCtl = DEFAULT_CAPACITY;
        putAll(m);
    }


    public ConcurrentHashMap(int initialCapacity,float loadFactor) {
        this(initialCapacity,loadFactor,1);
    }

    // sets the capacity from the load factor and concurrency level
    public ConcurrentHashMap(int initialCapacity,float loadFactor,int concurrencyLevel) {
        if (!(loadFactor > 0.0f) || initialCapacity < 0 || concurrencyLevel <= 0)
            throw new IllegalArgumentException();
        // the capacity must not be smaller than the concurrency level
        if (initialCapacity < concurrencyLevel)   // Use at least as many bins
            initialCapacity = concurrencyLevel;   // as estimated threads
        long size = (long) (1.0 + (long) initialCapacity / loadFactor);
        int cap = (size >= (long) MAXIMUM_CAPACITY) ?
                MAXIMUM_CAPACITY : tableSizeFor((int) size);
        this.sizeCtl = cap;
    }

2. size/sumCount

The size method simply calls sumCount, and sumCount adds up baseCount and the values in the counterCells array. CounterCell turns out to be a dedicated static inner class holding a single volatile long. It shows how hard counting nodes under concurrency is: a whole class exists just for the statistics.

    public int size() {
        long n = sumCount();
        // clamp odd values: below zero becomes 0, above Integer.MAX_VALUE becomes Integer.MAX_VALUE;
        // otherwise return the value as-is
        return ((n < 0L) ? 0 : (n > (long) Integer.MAX_VALUE) ? Integer.MAX_VALUE : (int) n);
    }

    static final class CounterCell {
        volatile long value;

        CounterCell(long x) {
            value = x;
        }
    }

    final long sumCount() {
        CounterCell[] as = counterCells;
        CounterCell a;
        long sum = baseCount;
        if (as != null) {
            for (int i = 0; i < as.length; ++i) {
                if ((a = as[i]) != null)
                    sum += a.value;
            }
        }
        return sum;
    }
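The summation above can be illustrated with a stripped-down sketch (names illustrative, not the real implementation: there is no CAS or cell expansion here, just the baseCount-plus-cells addition that sumCount performs):

```java
public class StripedCount {
    // Stand-ins for baseCount and counterCells; a null slot means
    // "no cell allocated there yet", exactly as in sumCount.
    long baseCount = 5;
    Long[] counterCells = { 3L, null, 2L };

    // Same shape as ConcurrentHashMap.sumCount: base value plus every non-null cell.
    long sumCount() {
        long sum = baseCount;
        if (counterCells != null)
            for (Long c : counterCells)
                if (c != null)
                    sum += c;
        return sum;
    }
}
```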

3. get

   The get method is close to HashMap's, and notably it takes no lock. The routine is the usual one: compute the hash, locate the bin, check whether the first node matches, then either delegate to the tree (negative hash) or walk the list with a while loop. So why is no lock needed — is it safe to read while another thread writes? Suppose we are traversing a list looking for an element while another thread happens to insert at the tail. Either our read happens before the write, in which case we simply do not see the new tail node and return the old answer, or the write is already visible, in which case we see it. In short, the writer's pointer updates are atomic and visible to other threads — which is exactly why Node's next (and val) are declared volatile.

    public V get(Object key) {
        Node<K,V>[] tab;
        Node<K,V> e,p;
        int n,eh;
        K ek;
        int h = spread(key.hashCode());
        if ((tab = table) != null && (n = tab.length) > 0 && (e = tabAt(tab,(n - 1) & h)) != null) {
            // check whether the head node is the one
            if ((eh = e.hash) == h) {
                if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                    return e.val;
            // negative hash: tree (or forwarding) lookup via find
            } else if (eh < 0)
                return (p = e.find(h,key)) != null ? p.val : null;
            // linear search through the list
            while ((e = e.next) != null) {
                if (e.hash == h && ((ek = e.key) == key || (ek != null && key.equals(ek))))
                    return e.val;
            }
        }
        return null;
    }
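As a quick sanity check of the lock-free read path, a minimal usage sketch (the wrapper class and method are made up; the map calls are the real public API):

```java
import java.util.concurrent.ConcurrentHashMap;

public class GetDemo {
    // Builds a one-entry map and performs a plain lookup; get takes no monitor
    // lock, relying instead on volatile reads of table and of Node.val/next.
    static Integer lookup(String key) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        return map.get(key);
    }
}
```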

4. containsKey/containsValue (Traverser)

   containsKey is simple: it can delegate directly to get, so its implementation is just a call to get. containsValue, on the other hand, has to visit every node, comparing values one by one. Instead of the direct walk used by get, it uses a Traverser, whose main job is to switch over to the new array when it finds a bin that is being transferred, so the latest inserted values are still observed.

 public boolean containsKey(Object key) {
        return get(key) != null;
    }

    public boolean containsValue(Object value) {
        if (value == null)
            throw new NullPointerException();
        Node<K,V>[] t;
        if ((t = table) != null) {
            Traverser<K,V> it = new Traverser<K,V>(t,t.length,0,t.length);
            for (Node<K,V> p; (p = it.advance()) != null; ) {
                V v;
                if ((v = p.val) == value || (v != null && value.equals(v)))
                    return true;
            }
        }
        return false;
    }

5. put/putVal

   Now we are ready for the most important method: put, which delegates directly to putVal. The logic of putVal is somewhat involved and the method is long, so let's outline the general flow in comments and then read the code.

    // delegates directly
    public V put(K key,V value) {
        return putVal(key,value,false);
    }

    final V putVal(K key,V value,boolean onlyIfAbsent) {
        // null keys and values are not allowed: with them we could not tell
        // "no mapping" apart from "mapped to null".
        // In a non-concurrent map we can use containsKey/Value(null) to disambiguate,
        // which explains why HashMap can afford a null key (it even looks it up specially);
        // here containsKey calls get, get is not synchronized, and the answer could
        // change between the two calls, producing ambiguity.
        if (key == null || value == null) throw new NullPointerException();
        // compute the hash
        int hash = spread(key.hashCode());
        // records the length of the bin, used later for the treeify decision
        int binCount = 0;
        for (Node<K,V>[] tab = table; ; ) {
            Node<K,V> f;
            int n,i,fh;
            // if the array is null (lazy loading, not yet initialized) or empty, initialize it
            if (tab == null || (n = tab.length) == 0)
                // initialize the array; covered in detail later
                tab = initTable();
            // find the bucket for this hash and read its head node f
            else if ((f = tabAt(tab,i = (n - 1) & hash)) == null) {
                // the slot is empty: one CAS puts the new node in place and the put
                // is essentially done; if the CAS fails there was a concurrent
                // update, so go around again (the enclosing loop is infinite)
                if (casTabAt(tab,i,null,new Node<K,V>(hash,key,value,null)))
                    break;                   // no lock when adding to empty bin
            }
            // the hash can actually be MOVED; this only becomes clear after the
            // transfer section, but the name already hints at a resize in progress
            else if ((fh = f.hash) == MOVED)
                // help with the data migration; easy to understand once the
                // transfer section has been read
                tab = helpTransfer(tab,f);
            // otherwise f is the non-null head node of the slot: the common case
            else {
                V oldVal = null;
                // take the monitor lock of the head node of this bin
                synchronized (f) {
                    // re-check: if another thread inserted a new head or removed this
                    // one, the nodes no longer match, so go around the loop again
                    if (tabAt(tab,i) == f) {
                        if (fh >= 0) { // head hash >= 0 means this bin is a list
                            binCount = 1;
                            // walk the list
                            for (Node<K,V> e = f; ; ++binCount) {
                                K ek;
                                // if an "equal" key is found, overwrite its value and stop
                                if (e.hash == hash && ((ek = e.key) == key || (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                // reached the tail: append the new node at the end
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash,key,value,null);
                                    break;
                                }
                            }
                        } else if (f instanceof TreeBin) { // red-black tree
                            Node<K,V> p;
                            binCount = 2;
                            // insert through the tree's own method
                            if ((p = ((TreeBin<K,V>) f).putTreeVal(hash,key,value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                // out of the synchronized block for the list/tree operation;
                // binCount != 0 means a list or tree operation happened above
                if (binCount != 0) {
                    // decide whether to convert the list to a tree;
                    // the threshold is 8, same as HashMap
                    if (binCount >= TREEIFY_THRESHOLD)
                        // slightly different from HashMap: treeification is not guaranteed;
                        // if the current array is shorter than 64, the array is resized
                        // instead of being treeified (details in the resize section)
                        treeifyBin(tab,i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L,binCount);
        return null;
    }

6. initTable

   The put method above leaves two problems to solve: initializing the table, and converting a bin to a red-black tree. First, table initialization. Recall HashMap: it uses the same lazy loading, and initializes the table inside resize; that is, HashMap's resize performs initialization, expansion, and migration all at once. ConcurrentHashMap divides the work more finely: initialization only initializes, because under concurrency there is more to take care of. As before, the general idea first.

    private final Node<K,V>[] initTable() {
        Node<K,V>[] tab;
        int sc;
        while ((tab = table) == null || tab.length == 0) {
            // another thread has initialized the table, or is doing so (sizeCtl == -1)
            if ((sc = sizeCtl) < 0)
                Thread.yield(); // give up the CPU and retry
            // CAS sizeCtl to -1, i.e. grab the "lock"; from here on this is
            // effectively a synchronized block
            else if (U.compareAndSwapInt(this,SIZECTL,sc,-1)) {
                try {
                    if ((tab = table) == null || tab.length == 0) {
                        // DEFAULT_CAPACITY, the default initial capacity, is 16
                        int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                        // allocate the array: length 16, or the size requested at construction
                        Node<K,V>[] nt = (Node<K,V>[]) new Node<?,?>[n];
                        // assign it to table; table is volatile, so this write
                        // happens-before other threads' reads
                        table = tab = nt;
                        // for n == 16 this gives sc = 12, i.e. 0.75 * n
                        sc = n - (n >>> 2);
                    }
                } finally {
                    // set the threshold for the next resize
                    sizeCtl = sc;
                }
                break;
            }
        }
        return tab;
    }
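The election pattern in initTable can be sketched without Unsafe: a CAS on a control field picks one initializer, losers yield until the winner publishes the table through a volatile write. A minimal sketch, assuming AtomicInteger in place of the Unsafe CAS and Object[] in place of Node[] (all names here are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyInit {
    // Stand-in for sizeCtl: 0 = untouched, -1 = initializing, >0 = next threshold.
    private final AtomicInteger sizeCtl = new AtomicInteger(0);
    private volatile Object[] table;

    Object[] initTable() {
        while (table == null) {
            int sc = sizeCtl.get();
            if (sc < 0)
                Thread.yield();                 // someone else is initializing
            else if (sizeCtl.compareAndSet(sc, -1)) { // elected as the initializer
                try {
                    if (table == null)
                        table = new Object[16]; // volatile write publishes the array
                } finally {
                    sizeCtl.set(12);            // 0.75 * 16, next resize threshold
                }
            }
        }
        return table;
    }
}
```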

7. treeifyBin

   Right, initialization is solved. Next, the conversion of a list to a red-black tree. This operation does not convert unconditionally, as I first assumed, but makes a series of judgments: when the list has reached the treeify threshold (8) but the array is still shorter than 64, a resize is performed instead, delegated to tryPresize with twice the current table length as the target. Be careful to keep list length and array length apart here — don't mix them up! After that comes the real list-to-tree conversion.

    private final void treeifyBin(Node<K,V>[] tab,int index) {
        Node<K,V> b;
        int n,sc;
        if (tab != null) {
            // MIN_TREEIFY_CAPACITY is 64,
            // so while the array is shorter than 64 (i.e. 32, 16 or less),
            // the array is resized instead of treeifying the bin
            if ((n = tab.length) < MIN_TREEIFY_CAPACITY)
                // resize: the target 2n becomes (3n+1) rounded up to a
                // power of two inside tryPresize
                tryPresize(n << 1);
            // b is the head node
            else if ((b = tabAt(tab,index)) != null && b.hash >= 0) {
                // lock the head
                synchronized (b) {

                    if (tabAt(tab,index) == b) {
                        // walk the list, building the TreeNode chain for the red-black tree
                        TreeNode<K,V> hd = null,tl = null;
                        for (Node<K,V> e = b; e != null; e = e.next) {
                            TreeNode<K,V> p =
                                    new TreeNode<K,V>(e.hash,e.key,e.val,null,null);
                            if ((p.prev = tl) == null)
                                hd = p;
                            else
                                tl.next = p;
                            tl = p;
                        }
                        // install the tree (wrapped in a TreeBin) into the bin
                        setTabAt(tab,index,new TreeBin<K,V>(hd));
                    }
                }
            }
        }
    }

8. tryPresize

  Continuing with the problem left over from above: the resize itself.

    On the whole it still looks like the old HashMap routine, whose resize performed initialization, expansion, and data migration together. Here those three steps are split into three methods, so that each can be made highly concurrent.

    private final void tryPresize(int size) {
        // c: size * 1.5 + 1, rounded up to the nearest power of two
        int c = (size >= (MAXIMUM_CAPACITY >>> 1)) ? MAXIMUM_CAPACITY : tableSizeFor(size + (size >>> 1) + 1);
        int sc;
        // sizeCtl > 0 is the threshold for the next resize
        while ((sc = sizeCtl) >= 0) {
            Node<K,V>[] tab = table;
            int n;
            // array initialization:
            // this branch is essentially the same as the initTable code shown
            // earlier, so it can be skimmed here
            if (tab == null || (n = tab.length) == 0) {
                n = (sc > c) ? sc : c;
                if (U.compareAndSwapInt(this,SIZECTL,sc,-1)) {
                    try {
                        if (table == tab) {
                            @SuppressWarnings("unchecked")
                            Node<K,V>[] nt = (Node<K,V>[]) new Node<?,?>[n];
                            table = nt;
                            sc = n - (n >>> 2); // 0.75 * n
                        }
                    } finally {
                        sizeCtl = sc;
                    }
                }
            // if the target is not above the current threshold, or the array
            // has already reached the maximum, do not resize
            } else if (c <= sc || n >= MAXIMUM_CAPACITY)
                break;
            // perform the resize
            else if (tab == table) {
                int rs = resizeStamp(n);
                // cannot actually happen here, since the loop condition keeps sc >= 0
                if (sc < 0) {
                    Node<K,V>[] nt;
                    if ((sc >>> RESIZE_STAMP_SHIFT) != rs ||
                            sc == rs + 1 ||
                            sc == rs + MAX_RESIZERS ||
                            (nt = nextTable) == null ||
                            transferIndex <= 0
                            )
                        break;
                    if (U.compareAndSwapInt(this,SIZECTL,sc,sc + 1))
                        transfer(tab,nt);
                }
                // 1. set SIZECTL to (rs << RESIZE_STAMP_SHIFT) + 2; since rs has
                //    bit 15 set, the result is negative, marking a resize in
                //    progress; then call transfer with nextTab == null
                // 2. this CAS guarantees that exactly one thread starts the transfer
                else if (U.compareAndSwapInt(this,SIZECTL,sc,(rs << RESIZE_STAMP_SHIFT) + 2))
                    transfer(tab,null);
            }
        }
    }
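The stamp logic deserves a concrete look. In JDK 8, resizeStamp(n) is the leading-zero count of n with bit 15 set, so that once shifted left by RESIZE_STAMP_SHIFT (16) the value is negative. A mirror of that computation (the class name is made up; the formula matches the JDK 8 source):

```java
public class ResizeStampDemo {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    // Mirrors the JDK's resizeStamp: the leading-zero count of n, tagged with
    // bit (RESIZE_STAMP_BITS - 1) so that (rs << RESIZE_STAMP_SHIFT) is negative.
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }
}
```

For n = 16, numberOfLeadingZeros gives 27, so the stamp is 27 | 32768; shifting it left by 16 sets the sign bit, which is exactly the negative sizeCtl that marks a resize in progress.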

9. transfer

    Well, it's time to get to the point. transfer is the trickiest and most ingenious method in the class. Ideas first, code second.

  This may feel confusing, so let's summarize. The migration of the current array is divided into many task packages, each covering stride slots, and these packages are handed out to threads from the back of the array toward the front. The hand-out is coordinated through the shared global variable transferIndex. All of this is in the name of high concurrency — one has to admire whoever wrote this kind of source code!

  Once a thread has claimed its task package, it of course has to migrate the data. The migration itself is relatively simple: because list or tree nodes must be modified, the head node of the bin is locked for synchronization. As in HashMap, each original list or tree is split into two parts, placed at i and i + n in the new array. Look at the list split in particular: bit n of the hash (0 or 1) decides the destination. The nodes are re-linked with head insertion, so part of the order is reversed — but only part. That is because a large chunk of the code does one thing: it finds the longest trailing run of nodes whose bit agrees (lastRun) and reuses that whole suffix unchanged, keeping its order.
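The lastRun trick is easier to see in isolation. A self-contained sketch (my own simplified Node holding only hash and next; split mirrors the structure of the code in transfer):

```java
public class LastRunSplit {
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Split one bucket's list by (hash & n): the longest trailing run whose bit
    // agrees is reused as-is (order preserved); nodes before it are re-created
    // with head insertion, so their relative order reverses.
    static Node[] split(Node f, int n) {
        int runBit = f.hash & n;
        Node lastRun = f;
        for (Node p = f.next; p != null; p = p.next) {
            int b = p.hash & n;
            if (b != runBit) { runBit = b; lastRun = p; }
        }
        Node ln, hn;
        if (runBit == 0) { ln = lastRun; hn = null; }
        else             { hn = lastRun; ln = null; }
        for (Node p = f; p != lastRun; p = p.next) {
            if ((p.hash & n) == 0) ln = new Node(p.hash, ln);
            else                   hn = new Node(p.hash, hn);
        }
        return new Node[] { ln, hn };
    }

    // Helper: collect the hash values of a list, head first.
    static int[] hashes(Node h) {
        int c = 0;
        for (Node p = h; p != null; p = p.next) c++;
        int[] out = new int[c];
        c = 0;
        for (Node p = h; p != null; p = p.next) out[c++] = p.hash;
        return out;
    }
}
```

For the list of hashes [0, 16, 1, 17, 16] with n = 16, the trailing run [17, 16] is reused in order, while the earlier nodes reverse: the low list comes out as [1, 0] and the high list as [16, 17, 16].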

    // copy the nodes into the new array
    // call sites: 1. treeifyBin via tryPresize, when the table is shorter than 64
    //             2. addCount, called at the end of put
    //             3. helpTransfer
    private final void transfer(Node<K,V>[] tab,Node<K,V>[] nextTab) {
        int n = tab.length,stride;
        // the slots are divided into several migration tasks; the number of slots
        // per task is stride: with a single CPU everything is one task, otherwise
        // the rule below applies, with a floor of MIN_TRANSFER_STRIDE
        if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
            stride = MIN_TRANSFER_STRIDE; // subdivide range

        // if nextTab is null, initialize it first; as noted with the fields,
        // nextTable is non-null only during migration and null the rest of the time

        // the earlier CAS already guarantees only one thread reaches this point with
        // nextTab == null, so there is no danger of every thread allocating its own array
        if (nextTab == null) {
            try {
                // capacity is doubled: the size computed in tryPresize is only
                // compared against the threshold; the actual new table is always 2n
                Node<K,V>[] nt = (Node<K,V>[]) new Node<?,?>[n << 1];
                nextTab = nt;
            } catch (Throwable ex) {      // try to cope with OOME
                sizeCtl = Integer.MAX_VALUE;
                return;
            }
            // volatile write
            nextTable = nextTab;
            // controls how far migration tasks have been handed out
            transferIndex = n;
        }
        // length of the new array
        int nextn = nextTab.length;
        // marker node:
        // this constructor builds a Node whose key, value and next are all null,
        // and, crucially, whose hash is MOVED;
        // once slot i of the old array has been migrated, it is replaced by this
        // ForwardingNode to tell other threads the slot has been processed
        ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
        // advance == true means the current slot is done and the next can be taken
        boolean advance = true;
        // set on the final pass, to re-check and clean up before committing; see below
        boolean finishing = false;



        // infinite loop
        for (int i = 0,bound = 0; ; ) {
            Node<K,V> f;
            int fh;

            // in one sentence: starting from the end of the array, the slots are
            // handed out in tasks of stride elements each, back to front; a task
            // covers indices from i down to bound (note bound < i, since we move backwards)
            while (advance) {
                int nextIndex,nextBound;
                // still inside the current task (or finishing): just step down;
                // note the --i hides here (it took me a while to find it)
                if (--i >= bound || finishing)
                    advance = false;
                // all tasks handed out: transferIndex started at the array length
                // and has reached 0
                else if ((nextIndex = transferIndex) <= 0) {
                    i = -1;
                    advance = false;
                // claim a task: the CAS on TRANSFERINDEX makes the claim visible to other threads
                } else if (U.compareAndSwapInt(this,TRANSFERINDEX,nextIndex,nextBound = (nextIndex > stride ? nextIndex - stride : 0))) {
                    bound = nextBound;
                    i = nextIndex - 1;
                    advance = false;
                }
            }

            // mainly checks i < 0, i.e. everything has been migrated
            // (i >= n only happens via the i = n recheck below)
            if (i < 0 || i >= n || i + n >= nextn) {
                int sc;
                // all migration work is complete:
                // publish nextTab as table, recompute sizeCtl = 1.5 * n
                // (the threshold for the doubled capacity), and return
                if (finishing) {
                    nextTable = null;
                    table = nextTab;
                    sizeCtl = (n << 1) - (n >>> 1);
                    return;
                }

                // as noted earlier, SIZECTL is set to (rs << RESIZE_STAMP_SHIFT) + 2
                // before the migration starts, and each helper increments it by one;
                // here each thread that runs out of tasks decrements it again
                if (U.compareAndSwapInt(this,SIZECTL,sc = sizeCtl,sc - 1)) {
                    // not the last thread: just leave
                    if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
                        return;

                    // here (sizeCtl - 2) == resizeStamp(n) << RESIZE_STAMP_SHIFT,
                    // meaning all migration tasks are done; set finishing and
                    // re-scan once before entering the if (finishing) branch above
                    finishing = advance = true;
                    i = n; // recheck before commit
                }
            }
            // if slot i of the old array is empty, install the ForwardingNode as a marker
            else if ((f = tabAt(tab,i)) == null)
                advance = casTabAt(tab,i,null,fwd);
            // the slot holds a ForwardingNode: it has already been migrated
            else if ((fh = f.hash) == MOVED)
                advance = true; // already processed
            // the slot holds real nodes
            else {
                // lock the head node of the slot and start migrating it;
                // the work is the same as in HashMap: split the list in two,
                // and split the tree in two — really almost identical
                synchronized (f) {
                    // re-check against concurrent modification
                    if (tabAt(tab,i) == f) {
                        Node<K,V> ln,hn;
                        // head hash >= 0: a plain linked-list Node
                        if (fh >= 0) {
                            int runBit = fh & n;
                            Node<K,V> lastRun = f;
                            // p.hash & n extracts one particular bit, 0 or 1,
                            // so this splits the list in two;
                            // lastRun ends up pointing at the start of the longest
                            // trailing run of nodes sharing the same runBit
                            // (yes, the whole list is walked just to find that run,
                            // so the trailing run can be reused without copying)
                            for (Node<K,V> p = f.next; p != null; p = p.next) {
                                int b = p.hash & n;
                                if (b != runBit) {
                                    runBit = b;
                                    lastRun = p;
                                }
                            }

                            // seed the two lists with the reused trailing run
                            if (runBit == 0) {
                                ln = lastRun;
                                hn = null;
                            } else {
                                hn = lastRun;
                                ln = null;
                            }
                            // split the rest by the same bit, 0 or 1;
                            // head insertion, so this part of the order is reversed
                            for (Node<K,V> p = f; p != lastRun; p = p.next) {
                                int ph = p.hash;
                                K pk = p.key;
                                V pv = p.val;
                                if ((ph & n) == 0)
                                    ln = new Node<K,V>(ph,pk,pv,ln);
                                else
                                    hn = new Node<K,V>(ph,pk,pv,hn);
                            }
                            // install the two lists at positions i and i + n of the new array
                            setTabAt(nextTab,i,ln);
                            setTabAt(nextTab,i + n,hn);
                            // replace slot i of the old array with fwd: once other
                            // threads see hash == MOVED there, they skip the slot
                            setTabAt(tab,i,fwd);
                            // advance = true: this slot is done
                            advance = true;
                        } else if (f instanceof TreeBin) {
                            // tree migration
                            TreeBin<K,V> t = (TreeBin<K,V>) f;
                            TreeNode<K,V> lo = null,loTail = null;
                            TreeNode<K,V> hi = null,hiTail = null;
                            int lc = 0,hc = 0;
                            for (Node<K,V> e = t.first; e != null; e = e.next) {
                                int h = e.hash;
                                TreeNode<K,V> p = new TreeNode<K,V>
                                        (h,e.key,e.val,null,null);
                                if ((h & n) == 0) {
                                    if ((p.prev = loTail) == null)
                                        lo = p;
                                    else
                                        loTail.next = p;
                                    loTail = p;
                                    ++lc;
                                } else {
                                    if ((p.prev = hiTail) == null)
                                        hi = p;
                                    else
                                        hiTail.next = p;
                                    hiTail = p;
                                    ++hc;
                                }
                            }

                            // if either half has no more than 6 nodes (UNTREEIFY_THRESHOLD),
                            // convert it back to a linked list
                            ln = (lc <= UNTREEIFY_THRESHOLD) ? untreeify(lo) :
                                    (hc != 0) ? new TreeBin<K,V>(lo) : t;
                            hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) :
                                    (lc != 0) ? new TreeBin<K,V>(hi) : t;
                            // install, same as with the list
                            setTabAt(nextTab,i,ln);
                            setTabAt(nextTab,i + n,hn);
                            setTabAt(tab,i,fwd);
                            advance = true;
                        }
                    }
                }
            }
        }
    }
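The task hand-out at the top of the loop can be isolated into a sketch. Assuming AtomicInteger in place of the Unsafe CAS on TRANSFERINDEX (names illustrative), this shows how a shared index starting at the old table length is carved into [bound, i] ranges from back to front:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class StridePartition {
    // Claims all migration tasks for a table of length n with the given stride:
    // each CAS moves transferIndex down by one stride (or to 0), and the claimed
    // task covers indices nextBound .. nextIndex-1, i.e. [bound, i] in transfer().
    static List<int[]> claimAll(int n, int stride) {
        AtomicInteger transferIndex = new AtomicInteger(n);
        List<int[]> tasks = new ArrayList<>();
        int nextIndex;
        while ((nextIndex = transferIndex.get()) > 0) {
            int nextBound = (nextIndex > stride) ? nextIndex - stride : 0;
            if (transferIndex.compareAndSet(nextIndex, nextBound))
                tasks.add(new int[] { nextBound, nextIndex - 1 }); // [bound, i]
        }
        return tasks;
    }
}
```

For n = 32 and stride = 16 this yields the ranges [16, 31] then [0, 15]: the tail of the array really is handed out first.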

10. helpTransfer

   This method exists for threads that, while operating on the table, happen to find that it is being resized and data is being migrated. It lets those threads help complete the migration, i.e. claim a task package of their own inside transfer.

//1. putVal 当有 MOVED 节点的时候  2.replaceNode 也就是在 remove 中 当有 MOVED 节点的时候
    //3. clear 4.conpute/IfPresent 干嘛的? 5.merge 干嘛的?
    // 好了发现规律了就是当我们在有遍历节点的操作的时候遇到了 MOVED 就去 helpTransfer 也就是 transfer
    final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
        Node<K,V>[] nextTab;
        int sc;
        // Guard conditions + capture the in-progress nextTable into the local nextTab
        if (tab != null && (f instanceof ForwardingNode) && (nextTab = ((ForwardingNode<K,V>) f).nextTable) != null) {
            int rs = resizeStamp(tab.length);
            // While the transfer has not finished; sizeCtl < 0 means a migration is in progress
            while (nextTab == nextTable && table == tab && (sc = sizeCtl) < 0) {
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                        sc == rs + MAX_RESIZERS || transferIndex <= 0)
                    break;
                // CAS sizeCtl up by one, then join the transfer
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) {
                    transfer(tab,nextTab);
                    break;
                }
            }
            return nextTab;
        }
        return table;
    }
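The rs stamp checked in the loop comes from resizeStamp(n), which in JDK 8 is Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1)). When a resize starts, sizeCtl is set to (rs << RESIZE_STAMP_SHIFT) + 2, and each helper CASes it up by one. A sketch of that arithmetic (the constants and formula mirror JDK 8; the demo itself is mine):

```java
public class ResizeStampDemo {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    // Same formula as JDK 8's ConcurrentHashMap.resizeStamp(int n):
    // a value unique to table length n, with the high bit of the low 16 bits set
    // so that (stamp << 16) is always negative.
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    public static void main(String[] args) {
        int n = 16;                               // current table length
        int rs = resizeStamp(n);
        int sc = (rs << RESIZE_STAMP_SHIFT) + 2;  // sizeCtl once the first resizer starts
        System.out.println("stamp = " + rs + ", initial sizeCtl = " + sc);
        // sc < 0 signals "resizing"; (sc >>> RESIZE_STAMP_SHIFT) recovers the stamp,
        // which is exactly the check at the top of helpTransfer's loop.
        System.out.println((sc < 0) + " " + ((sc >>> RESIZE_STAMP_SHIFT) == rs));
    }
}
```

So the low 16 bits of sizeCtl count active resizers (plus 2), and the high 16 bits identify which resize generation they belong to.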

11. tabAt/casTabAt/setTabAt

   these methods encapsulate the volatile-read and CAS operations on the table array (backed by Unsafe in the JDK). They are simply used everywhere.
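The JDK implements them with sun.misc.Unsafe volatile element access. The same idea can be sketched with the public AtomicReferenceArray class (my analogy, not the JDK's actual code):

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

public class TabAtSketch {
    static class Node { final String key; Node(String k) { key = k; } }

    // Volatile-read a slot, like tabAt(tab, i).
    static Node tabAt(AtomicReferenceArray<Node> tab, int i) {
        return tab.get(i);
    }

    // CAS a slot from expected value c to v, like casTabAt(tab, i, c, v).
    static boolean casTabAt(AtomicReferenceArray<Node> tab, int i, Node c, Node v) {
        return tab.compareAndSet(i, c, v);
    }

    // Volatile-write a slot, like setTabAt(tab, i, v).
    static void setTabAt(AtomicReferenceArray<Node> tab, int i, Node v) {
        tab.set(i, v);
    }

    public static void main(String[] args) {
        AtomicReferenceArray<Node> tab = new AtomicReferenceArray<>(16);
        // putVal's empty-bin fast path: insert the head with a CAS, no lock taken.
        System.out.println(casTabAt(tab, 3, null, new Node("a"))); // true: slot was empty
        System.out.println(casTabAt(tab, 3, null, new Node("b"))); // false: lost the race
        System.out.println(tabAt(tab, 3).key);                     // a
    }
}
```

The point of the volatile element access is that a plain Node[] read would not be guaranteed to see another thread's latest write to a slot, even though the table field itself is volatile.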

12. remove/replaceNode

  the remove method also delegates to the replaceNode method, which follows the usual pattern: if it sees a node that is being transferred, it calls helpTransfer. Because it mutates the bin, it must synchronize on the head node, and it then deletes from the tree or the linked list as appropriate.

final V replaceNode(Object key, V value, Object cv) {
        int hash = spread(key.hashCode());
        for (Node<K,V>[] tab = table; ; ) {
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0 || (f = tabAt(tab,i = (n - 1) & hash)) == null)
                break;
            // If there is a node being moved, help move it
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab,f);
            else {
                V oldVal = null;
                boolean validated = false;
                synchronized (f) {
                    if (tabAt(tab,i) == f) {
                        // Linked-list node
                        if (fh >= 0) {
                            validated = true;
                            for (Node<K,V> e = f,pred = null; ; ) {
                                K ek;
                                if (e.hash == hash && ((ek = e.key) == key || (ek != null && key.equals(ek)))) {
                                    V ev = e.val;
                                    if (cv == null || cv == ev || (ev != null && cv.equals(ev))) {
                                        oldVal = ev;
                                        // value != null: update the value
                                        if (value != null)
                                            e.val = value;
                                        // value == null: unlink this node
                                        else if (pred != null)
                                            pred.next = e.next;
                                        // the matching node e is the head of the bin
                                        else
                                            setTabAt(tab, i, e.next);
                                    }
                                    break;
                                }
                                pred = e;
                                if ((e = e.next) == null)
                                    break;
                            }
                        // Red-black tree
                        } else if (f instanceof TreeBin) {
                            validated = true;
                            TreeBin<K,V> t = (TreeBin<K,V>) f;
                            TreeNode<K,V> r, p;
                            if ((r = t.root) != null &&
                                    (p = r.findTreeNode(hash, key, null)) != null) {
                                V pv = p.val;
                                if (cv == null || cv == pv ||
                                        (pv != null && cv.equals(pv))) {
                                    oldVal = pv;
                                    if (value != null)
                                        p.val = value;
                                    else if (t.removeTreeNode(p))
                                        setTabAt(tab, i, untreeify(t.first));
                                }
                            }
                        }
                    }
                }
                // A node was deleted: CAS-update the count
                if (validated) {
                    if (oldVal != null) {
                        if (value == null)
                            addCount(-1L,-1);
                        return oldVal;
                    }
                    break;
                }
            }
        }
        return null;
    }
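Seen from the public API, the value/cv pair covers four operations: remove(key) calls replaceNode(key, null, null); remove(key, expected) passes the expected value as cv; replace(key, newVal) passes a new value with cv == null; and replace(key, oldVal, newVal) passes both. A quick usage example of these contracts:

```java
import java.util.concurrent.ConcurrentHashMap;

public class ReplaceNodeDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);

        // cv != null: the operation only applies when the current value matches cv.
        System.out.println(map.remove("a", 2));      // false: current value is 1, not 2
        System.out.println(map.replace("a", 1, 10)); // true: 1 -> 10
        // value == null, cv == null: unconditional removal.
        System.out.println(map.remove("a"));         // 10
        System.out.println(map.size());              // 0
    }
}
```

All four run through the single replaceNode loop shown above, which is why they share the same MOVED/helpTransfer handling.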

13. clear

The clear operation is also worth mentioning, because it no longer works the way it used to, where all elements were simply set to null and left for garbage collection. Clearing affects the size statistics of the table while other threads may be operating on it at the same time, so that naive approach could cause inconsistency. Instead, it counts the nodes under each array slot and then adjusts the container's size by that amount, releasing the bins one by one. Clearing a bin obviously has to be synchronized on its head node, and because the method modifies the table, it also has to help with data migration if it runs into a resize.

public void clear() {
        // Records the negative of the number of nodes to be removed
        long delta = 0L;
        int i = 0;
        Node<K,V>[] tab = table;
        while (tab != null && i < tab.length) {
            int fh;
            Node<K,V> f = tabAt(tab,i);
            // Empty bin: move on to the next one
            if (f == null)
                ++i;
            // Help with the migration
            else if ((fh = f.hash) == MOVED) {
                tab = helpTransfer(tab,f);
                i = 0; // restart
            // Remove the nodes
            } else {
                synchronized (f) {
                    if (tabAt(tab,i) == f) {
                        // For a linked-list bin this is the head of the list;
                        // for a tree bin it is its first node; otherwise null
                        Node<K,V> p = (fh >= 0 ? f : (f instanceof TreeBin) ? ((TreeBin<K,V>) f).first : null);
                        // Walk the chain (tree nodes are also linked via next)
                        while (p != null) {
                            --delta;
                            p = p.next;
                        }
                        // Simply set the corresponding array slot to null
                        setTabAt(tab,i++,null);
                    }
                }
            }
        }
        if (delta != 0L)
            addCount(delta,-1);
    }
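From the caller's side, clear behaves as expected in the single-writer case, but since it locks and empties one bin at a time it gives no atomic snapshot under concurrency. A small usage sketch:

```java
import java.util.concurrent.ConcurrentHashMap;

public class ClearDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 100; i++) map.put("k" + i, i);

        map.clear();
        System.out.println(map.size());    // 0
        System.out.println(map.isEmpty()); // true
        // Caveat: clear() locks one bin at a time, so a concurrent put into an
        // already-cleared bin can survive the clear; "empty afterwards" is only
        // guaranteed when no other thread is writing at the same time.
    }
}
```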

14. readObject/writeObject

  its serialization is also implemented by hand. Only the keys and values are written, and the whole table is rebuilt during deserialization.
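A quick round trip shows the effect: the stream carries just the key/value pairs (plus a few legacy fields kept for compatibility with the old segment-based format), and readObject re-inserts every pair into a fresh table:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.concurrent.ConcurrentHashMap;

public class SerializationDemo {
    public static void main(String[] args) throws Exception {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        // Serialize: writeObject walks the table and writes each key and value.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(map);
        }

        // Deserialize: readObject rebuilds the whole table from those pairs.
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            @SuppressWarnings("unchecked")
            ConcurrentHashMap<String, Integer> copy =
                    (ConcurrentHashMap<String, Integer>) ois.readObject();
            System.out.println(copy.equals(map)); // true
        }
    }
}
```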

3. Summary

   this class is genuinely troublesome, but the overall idea is quite similar to HashMap's. To achieve safe access under high concurrency, however, it performs many more fine-grained operations: what the original HashMap does in one step is split into several here. Concurrency is also higher than in version 1.7. When locking, only the single bin being modified is locked, whereas 1.7 locked a whole segment, so the lock granularity is smaller and the concurrency higher. And the task subdivision in transfer is a genuinely ingenious way to speed up resizing. A written description can only give the general flow; I wrote a lot of comments in the source itself, and reading the source after this walkthrough will be much easier. Many details are not covered in the description, so be sure to read the source carefully once the overall idea is clear.
