Java collection source code analysis (II): List and AbstractList

2020-12-03 • Java

Summary

The previous article introduced the upper layers of the List hierarchy: the Iterable interface, the Collection interface, and AbstractCollection, the abstract class that implements Collection. Building on that, this article moves one level down and explores the source code of the List interface and its abstract implementation class AbstractList, to understand how it links the Collection interface above with the three implementation classes below.

1、 List interface

The List interface extends the Collection interface and adds some methods of its own. Compared with Collection, List clearly has many methods that operate on the collection by index. As a rough rule of thumb for telling whether an abstract method comes from Collection or from List: if it takes an index parameter, it comes from List; if not, it comes from Collection.

In other words, on top of Collection, the List interface defines the feature that a list can be accessed by index.
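As a small illustration of the rule of thumb above, the following sketch mixes index-free Collection methods with index-based List methods. The class and method names (IndexedAccessDemo, run) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class IndexedAccessDemo {
    // Applies a few of the index-based operations that List adds on top of Collection.
    static List<String> run() {
        List<String> list = new ArrayList<>();
        list.add("a");       // Collection method: no index, appends
        list.add("c");       // Collection method: no index, appends
        list.add(1, "b");    // List method: insert at index 1      -> [a, b, c]
        list.set(2, "C");    // List method: replace at index 2     -> [a, b, C]
        list.remove(0);      // List method: remove at index 0      -> [b, C]
        return list;
    }

    public static void main(String[] args) {
        List<String> list = run();
        System.out.println(list + ", get(0) = " + list.get(0));
    }
}
```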

1. New method

2. New method with the same name

3. Rewriting method

2、 AbstractList abstract class

AbstractList is an abstract class that extends AbstractCollection and implements the List interface. It acts as a second layer of method templates after AbstractCollection: a preliminary implementation of the List interface and a further implementation of Collection.

1. Unsupported implementation

The index-based methods set(), add() and remove() are new methods introduced by List. AbstractList does not support them by default: each one throws UnsupportedOperationException, so a subclass that wants to support modification must override them.

public E set(int index,E element) {
    throw new UnsupportedOperationException();
}
public void add(int index,E element) {
    throw new UnsupportedOperationException();
}
public E remove(int index) {
    throw new UnsupportedOperationException();
}
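To see this default behavior in practice, here is a minimal sketch of an AbstractList subclass that implements only get() and size(). The names ReadOnlyListDemo, fixedList and setIsUnsupported are hypothetical helpers for this example, not part of the JDK.

```java
import java.util.AbstractList;
import java.util.List;

class ReadOnlyListDemo {
    // A minimal AbstractList subclass: only get() and size() are implemented.
    static List<String> fixedList(String... items) {
        return new AbstractList<String>() {
            @Override public String get(int index) { return items[index]; }
            @Override public int size() { return items.length; }
        };
    }

    // Returns true if set(int, E) is rejected with UnsupportedOperationException.
    static boolean setIsUnsupported(List<String> list) {
        try {
            list.set(0, "x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = fixedList("a", "b");
        System.out.println(list.get(1));             // reading works
        System.out.println(setIsUnsupported(list));  // writing falls through to AbstractList.set
    }
}
```

Because set(int, E) is not overridden here, the call ends up in the AbstractList version shown above.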

2. Internal classes

Unlike AbstractCollection, AbstractList has several special inner classes. The iterator classes are Itr and ListItr, obtained through iterator() and listIterator() respectively.

The view classes are SubList and RandomAccessSubList, obtained through subList().

Other methods depend on these inner classes, so to fully understand how the methods of AbstractList are implemented, you first need to understand the role and implementation of these inner classes.

3、 subList method and inner classes

subList() is a commonly used method. According to the List interface, it should return a view of a portion of the current list:

public List<E> subList(int fromIndex, int toIndex) {
    // Does this class implement the RandomAccess interface?
    return (this instanceof RandomAccess ?
            // If so, return the random-access inner class RandomAccessSubList
            new RandomAccessSubList<>(this, fromIndex, toIndex) :
            // Otherwise return an ordinary SubList
            new SubList<>(this, fromIndex, toIndex));
}

This involves RandomAccessSubList and SubList. RandomAccessSubList is a subclass of SubList that additionally implements the RandomAccess interface.

1. SubList inner class

We can think of SubList and AbstractList as an application of the decorator pattern, much like SynchronizedList wraps an implementation of the List interface. SubList wraps the methods of AbstractList once more, turning each operation on the AbstractList into an operation on the view.

Let's first look at the member variables and constructor of the SubList class:

class SubList<E> extends AbstractList<E> {
    // The backing AbstractList is held as a member variable
    private final AbstractList<E> l;
    // Start position of the view in the backing list (the offset)
    private final int offset;
    // Length of the SubList view
    private int size;

    SubList(AbstractList<E> list, int fromIndex, int toIndex) {
        if (fromIndex < 0)
            throw new IndexOutOfBoundsException("fromIndex = " + fromIndex);
        if (toIndex > list.size())
            throw new IndexOutOfBoundsException("toIndex = " + toIndex);
        if (fromIndex > toIndex)
            throw new IllegalArgumentException("fromIndex(" + fromIndex +
                                               ") > toIndex(" + toIndex + ")");
        // Keep a reference to the backing list.
        // This is why operating on either the view or the backing list affects the other:
        // both operate on the same instance in memory
        l = list;
        // Start index of this view within the backing list
        offset = fromIndex;
        // The length of the view is the length of the range cut from the backing list
        size = toIndex - fromIndex;
        this.modCount = l.modCount;
    }

}

Put simply, index i of the view corresponds to index offset + i of the backing list.

With that in mind, the methods in SubList are easy to understand:

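public E set(int index, E element) {
    // Check that the index is within bounds
    rangeCheck(index);
    // Check for concurrent modification
    checkForComodification();
    // Set the element at position offset + view index in the backing list
    return l.set(index + offset, element);
}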

The other methods follow the same pattern, so we won't go through them here.
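The view behavior can be observed from the outside with a short sketch. It uses ArrayList, which ships its own subList implementation but follows the same view contract; the class name SubListViewDemo is made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubListViewDemo {
    // Modifies a list only through a subList view and returns the backing list.
    static List<String> run() {
        List<String> backing = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        List<String> view = backing.subList(1, 3); // view over [b, c], offset 1
        view.set(0, "B");   // view index 0 maps to backing index 1
        view.add("x");      // appended to the view, i.e. inserted at backing index 3
        return backing;
    }

    public static void main(String[] args) {
        System.out.println(run()); // the backing list reflects both changes
    }
}
```

Note that the reverse direction is more restricted: after a structural change made directly on the backing list, an existing view should no longer be used.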

2. RandomAccessSubList inner class

Then comes RandomAccessSubList, a subclass of SubList:

class RandomAccessSubList<E> extends SubList<E> implements RandomAccess {
    RandomAccessSubList(AbstractList<E> list, int fromIndex, int toIndex) {
        super(list, fromIndex, toIndex);
    }

    public List<E> subList(int fromIndex, int toIndex) {
        return new RandomAccessSubList<>(this, fromIndex, toIndex);
    }
}

As we can see, it is just a SubList that also implements the RandomAccess interface. This interface is only a marker: a class that implements it declares that it supports fast random access by index, so for such a class a for loop with get(i) is typically faster than using an iterator.

Both Vector and ArrayList implement this interface, while LinkedList does not. Checking for it lets subList() distinguish between these implementation classes and return the matching kind of view.
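The marker can be checked with a plain instanceof test, and the views returned by subList() carry it over. The following is a small sketch; RandomAccessDemo and isRandomAccess are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessDemo {
    // True if the list declares fast index-based access via the marker interface.
    static boolean isRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(Arrays.asList(1, 2, 3));

        System.out.println(isRandomAccess(arrayList));                // ArrayList itself
        System.out.println(isRandomAccess(linkedList));               // LinkedList itself
        System.out.println(isRandomAccess(arrayList.subList(0, 2)));  // view of an ArrayList
        System.out.println(isRandomAccess(linkedList.subList(0, 2))); // view of a LinkedList
    }
}
```

A caller can use such a check to choose between an indexed loop and an iterator-based loop.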

4、 Iterator methods and inner classes

AbstractList provides two iterators, Itr and ListItr.

Iterators are a very important part of AbstractList. They implement the iterator() method of Iterable, the top-level interface of the whole hierarchy. Many methods in the source code that involve traversal rely on these internally implemented iterator classes.

1. Fail-fast mechanism of the iterator

AbstractList provides no thread-safety guarantee by default. To limit the impact of concurrent modification on iteration as far as possible, the JDK uses a fail-fast mechanism: as soon as a concurrent modification is detected, an exception is thrown immediately, instead of continuing with state that may already be wrong and causing unpredictable errors later.

For this purpose, AbstractList provides a member variable modCount. Its Javadoc describes it as the number of times the list has been structurally modified, that is, modified in a way that changes its size or otherwise disturbs it so that an iteration in progress may yield incorrect results.

With that in mind, let's go back and look at part of the code of the iterator class Itr:

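private class Itr implements Iterator<E> {
    // The modCount value that the iterator believes the backing list should have.
    // If this expectation is violated, the iterator has detected concurrent modification.
    int expectedModCount = modCount;

    // Check whether a concurrent modification has occurred
    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}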

Combined with the code, it is not difficult for us to understand how the fast fail mechanism is implemented:

AbstractList provides a member variable that records the number of structural modifications to the collection. If a subclass wants concurrent modifications to be detected, it must increment modCount in every method that changes the structure. When an iterator is created, it reads the current modCount and stores it in expectedModCount.

During iteration, each step checks whether modCount and expectedModCount are equal. If they are not, the structure of the collection has been modified since the iterator was created, and continuing could produce errors (such as skipping an element or visiting one twice). Therefore the iterator throws a ConcurrentModificationException as soon as it detects this.

ListItr inherits from Itr, so both have the same fail-fast mechanism.

It is worth mentioning that for implementation classes with the fail-fast mechanism enabled, removing elements while traversing is only safe through the iterator itself. The reason is again concurrent modification detection: removing through the list changes modCount but not the iterator's expectedModCount.
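A minimal sketch of both cases, using ArrayList as the fail-fast implementation. FailFastDemo and its two methods are names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Removes "b" through the list itself while a for-each loop is iterating over it.
    static boolean removeInForEachFails() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        try {
            for (String s : list) {
                if (s.equals("b")) {
                    list.remove(s); // changes modCount behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the following next() call detected the mismatch
        }
    }

    // Removes "b" through the iterator, which resynchronizes expectedModCount.
    static List<String> removeWithIterator() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
            if (it.next().equals("b")) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeInForEachFails());
        System.out.println(removeWithIterator());
    }
}
```

The detection is best effort: it only triggers when next() is called again after the modification, so it should not be relied on for program correctness.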

2. Itr iterator

Now, back to the code of Itr:

private class Itr implements Iterator<E> {
    // Index of the element to be returned by the next call to next()
    int cursor = 0;

    // Index of the element returned by the most recent call.
    // Reset to -1 if this element is deleted by a call to remove().
    int lastRet = -1;

    // The modCount value that the iterator believes the backing list should have.
    // If this expectation is violated, the iterator has detected concurrent modification.
    int expectedModCount = modCount;

    public boolean hasNext() {
        return cursor != size();
    }

    public E next() {
        checkForComodification();
        try {
            int i = cursor;
            E next = get(i);
            lastRet = i;
            cursor = i + 1;
            return next;
        } catch (IndexOutOfBoundsException e) {
            checkForComodification();
            throw new NoSuchElementException();
        }
    }

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();

        try {
            AbstractList.this.remove(lastRet);
            if (lastRet < cursor)
                cursor--;
            lastRet = -1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException e) {
            throw new ConcurrentModificationException();
        }
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}

Iterative method

Besides concurrent modification detection, the way the iterator advances is worth a closer look. Let's look at the next() method:

public E next() {
    // Check for concurrent modification
    checkForComodification();
    try {
        int i = cursor;
        E next = get(i);
        lastRet = i;
        cursor = i + 1;
        return next;
    } catch (IndexOutOfBoundsException e) {
        checkForComodification();
        throw new NoSuchElementException();
    }
}

This logic is essentially the same as traversing a linked list, except that the pointer becomes an array index. Thinking of it as a linked list:

We call the node that the next call to next() will return the next node, and the node most recently returned the current node. Suppose there are three elements a, b and c: the cursor starts at 0 and points to a; after the first next(), a is the current node (lastRet = 0) and the cursor points to b (cursor = 1).
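The walk over a, b and c can be traced with a short sketch. It relies on ListIterator.nextIndex(), which returns the cursor as shown in the ListItr code of this article; the class name CursorTraceDemo and the "element@cursor" format are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class CursorTraceDemo {
    // Records "element@cursor" after each call to next() during a forward iteration.
    static List<String> trace() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> steps = new ArrayList<>();
        ListIterator<String> it = list.listIterator(); // cursor starts at 0
        while (it.hasNext()) {
            String current = it.next();                 // current node; cursor moves one step ahead
            steps.add(current + "@" + it.nextIndex());  // nextIndex() exposes the cursor
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```

After the last step the cursor equals the size of the list, which is exactly the condition under which hasNext() returns false.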

Now that we know how iteration works and what cursor and lastRet do, the remove() method is not hard to understand:

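public void remove() {
    if (lastRet < 0)
        throw new IllegalStateException();
    checkForComodification();

    try {
        // Call the remove method of the outer list
        AbstractList.this.remove(lastRet);
        if (lastRet < cursor)
            // Node i has just been removed, so node i+1 becomes node i.
            // next() advances the cursor by 1, so without cursor-- the following
            // next() would skip what was originally node i+1.
            // In the example above, when removing a, b and c, b would be skipped
            // after removing a and c would be removed next.
            cursor--;
        // The most recently returned node has been removed, so reset to -1
        lastRet = -1;
        // The outer remove() changed modCount, so the iterator must pick up the latest value
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException e) {
        throw new ConcurrentModificationException();
    }
}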

As for hasNext(), there is not much to say: if cursor equals the size of the collection, the iteration has reached the end.

3. ListItr iterator

ListItr extends the Itr class and implements the ListIterator interface, which in turn extends the Iterator interface.

On top of Iterator, the ListIterator interface adds six abstract methods: hasPrevious(), previous(), nextIndex(), previousIndex(), set(E) and add(E).

As you can see, ListItr, which implements ListIterator, is more powerful than Itr. It can iterate not only forward but also backward, and it can update or add nodes during iteration.

private class ListItr extends Itr implements ListIterator<E> {
    // The starting position of the iteration can be chosen by the caller
    ListItr(int index) {
        cursor = index;
    }

    // There is a previous node unless the next node is the first node
    public boolean hasPrevious() {
        return cursor != 0;
    }

    public E previous() {
        // Check for concurrent modification
        checkForComodification();
        try {
            // Move the cursor back to the current node
            int i = cursor - 1;
            // Fetch the current node with AbstractList's get method
            E previous = get(i);
            lastRet = cursor = i;
            return previous;
        } catch (IndexOutOfBoundsException e) {
            checkForComodification();
            throw new NoSuchElementException();
        }
    }

    // Index of the next node
    public int nextIndex() {
        return cursor;
    }

    // Index of the current node (the node before the next node)
    public int previousIndex() {
        return cursor - 1;
    }

    public void set(E e) {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();

        try {
            AbstractList.this.set(lastRet, e);
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

    public void add(E e) {
        checkForComodification();

        try {
            int i = cursor;
            // Insert the new node at the position of the next node
            AbstractList.this.add(i, e);
            lastRet = -1;
            cursor = i + 1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }
}

What is hard to grasp here is the notion of the next node versus the current node. It can be understood as follows: cursor always designates the node that the next call to next() will return, so before and after every operation it points to the next node. The current node is therefore the one just before the cursor, at index cursor - 1.

In other words, suppose there are three elements a, b and c and the cursor is currently 1, that is, it points to b. After calling next(), the cursor points to c, and after calling previous(), the cursor points back to b.

As for lastRet, this member variable only records the node touched by the most recent operation and has nothing to do with direction.
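The a, b, c scenario can be replayed with a ListIterator obtained from ArrayList, which follows the same cursor and lastRet rules. This is only a sketch; ListIteratorDemo is a name made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    // Moves forward and backward over [a, b, c] and edits the list through the iterator.
    static List<String> run() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        ListIterator<String> it = list.listIterator(1); // cursor = 1, points to b
        it.next();       // returns b, cursor = 2 (points to c), lastRet = 1
        it.set("B");     // replaces the element last returned   -> [a, B, c]
        it.previous();   // returns B, cursor = 1 again, lastRet = 1
        it.add("x");     // inserted at the cursor position      -> [a, x, B, c]
        return list;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note that set() acts on lastRet regardless of whether the last move was next() or previous(), while add() always inserts at the cursor.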

5、 Methods implemented by AbstractList

1.add

Note that add(int index, E e) is still unsupported in AbstractList. add(E e) only defines the logic of appending an element to the end of the list by delegating to add(int index, E e).

// add without an index: by default, append to the end of the list
public boolean add(E e) {
    add(size(),e);
    return true;
}

The add() methods in AbstractList and AbstractCollection relate to each other as follows.

add(E) in AbstractList is a good illustration of "the abstract class defines the algorithm skeleton" from the template method pattern. The AbstractCollection class provides a preliminary implementation of add(E e) (although it only throws an exception). AbstractList then refines add(E e) so that it inserts the element at the end of the list by calling add(int index, E e), while how add(int index, E e) itself is implemented is left to the subclass.
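The division of labor can be sketched with a small subclass that supplies only add(int, E), get(int) and size(), and lets the inherited add(E) do the rest. RecordingList and its lastInsertIndex field are hypothetical and exist only to make the delegation visible.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

class TemplateAddDemo {
    // Hypothetical subclass: supplies the primitive operations, inherits add(E).
    static class RecordingList<E> extends AbstractList<E> {
        private final List<E> store = new ArrayList<>();
        int lastInsertIndex = -1; // remembers the index passed to add(int, E)

        @Override public void add(int index, E element) {
            lastInsertIndex = index;
            store.add(index, element);
            modCount++; // structural modification, keeps iterators fail-fast
        }
        @Override public E get(int index) { return store.get(index); }
        @Override public int size() { return store.size(); }
    }

    // Appends two elements through the inherited add(E) and reports the last index used.
    static int lastIndexAfterAppends() {
        RecordingList<String> list = new RecordingList<>();
        list.add("a"); // inherited add(E) delegates to add(0, "a")
        list.add("b"); // inherited add(E) delegates to add(1, "b")
        return list.lastInsertIndex;
    }

    public static void main(String[] args) {
        System.out.println(lastIndexAfterAppends());
    }
}
```

The subclass never mentions add(E), yet appending works because the skeleton in AbstractList calls back into the subclass.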

2.indexOf/lastIndexOf

public int indexOf(Object o) {
    // Search forward from the first element
    ListIterator<E> it = listIterator();
    if (o==null) {
        while (it.hasNext())
            if (it.next()==null)
                return it.previousIndex();
    } else {
        while (it.hasNext())
            if (o.equals(it.next()))
                return it.previousIndex();
    }
    return -1;
}

public int lastIndexOf(Object o) {
    // Search backward from the end of the list
    ListIterator<E> it = listIterator(size());
    if (o==null) {
        while (it.hasPrevious())
            if (it.previous()==null)
                return it.nextIndex();
    } else {
        while (it.hasPrevious())
            if (o.equals(it.previous()))
                return it.nextIndex();
    }
    return -1;
}
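To exercise exactly these two implementations, the sketch below uses a bare AbstractList subclass that does not override them (many JDK lists, such as ArrayList, provide their own versions). IndexOfDemo and plainList are names invented for this example.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;

class IndexOfDemo {
    // Hypothetical read-only list that inherits indexOf/lastIndexOf from AbstractList.
    static List<String> plainList(String... items) {
        return new AbstractList<String>() {
            @Override public String get(int index) { return items[index]; }
            @Override public int size() { return items.length; }
        };
    }

    // Looks up a present value, null, and an absent value from both ends.
    static int[] run() {
        List<String> list = plainList("a", null, "b", "a", null);
        return new int[] {
            list.indexOf("a"),      // first "a"
            list.lastIndexOf("a"),  // last "a"
            list.indexOf(null),     // first null, found via the o==null branch
            list.lastIndexOf(null), // last null
            list.indexOf("z")       // not present
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```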

3.addAll

addAll here is the List version of addAll. Its parameters are the starting index and the collection to be merged:

public boolean addAll(int index,Collection<? extends E> c) {
    rangeCheckForAdd(index);
    boolean modified = false;
    for (E e : c) {
        add(index++,e);
        modified = true;
    }
    return modified;
}

The rangeCheckForAdd() method checks whether the index is out of bounds:

private void rangeCheckForAdd(int index) {
    // Must not be less than 0 or greater than the size of the list
    if (index < 0 || index > size())
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}
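The index rules can be tried out with ArrayList, which overrides addAll(int, Collection) but applies the same bounds check. A small sketch, with AddAllDemo as an invented class name:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AddAllDemo {
    // Inserts a collection in the middle and another one at index == size().
    static List<String> run() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "d"));
        list.addAll(1, Arrays.asList("b", "c"));      // inserted in order, starting at index 1
        list.addAll(list.size(), Arrays.asList("e")); // index == size() is allowed: append
        return list;
    }

    // Returns true if an index greater than size() is rejected.
    static boolean rejectsIndexBeyondSize() {
        List<String> list = new ArrayList<>(Arrays.asList("a"));
        try {
            list.addAll(2, Arrays.asList("x")); // size() is 1, so 2 is out of range
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
        System.out.println(rejectsIndexBeyondSize());
    }
}
```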

4.removeRange

This is a protected method of AbstractList. It is generally used by subclasses to remove a contiguous range of elements, and it is implemented with the help of the ListItr iterator.

protected void removeRange(int fromIndex, int toIndex) {
    ListIterator<E> it = listIterator(fromIndex);
    // Starting at fromIndex, remove elements one by one up to (but not including) toIndex
    for (int i=0, n=toIndex-fromIndex; i<n; i++) {
        it.next();
        it.remove();
    }
}
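Since removeRange is protected, it can only be reached from a subclass. The sketch below assumes a hypothetical RangeList that implements remove(int) and exposes removeRange through a helper called dropRange; neither name exists in the JDK.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveRangeDemo {
    // Hypothetical subclass that supports removal and exposes removeRange.
    static class RangeList<E> extends AbstractList<E> {
        private final List<E> store;

        RangeList(List<E> initial) { store = new ArrayList<>(initial); }

        @Override public E get(int index) { return store.get(index); }
        @Override public int size() { return store.size(); }
        @Override public E remove(int index) {
            modCount++; // structural modification
            return store.remove(index);
        }

        // Package-private wrapper around the protected removeRange of AbstractList.
        void dropRange(int fromIndex, int toIndex) {
            removeRange(fromIndex, toIndex);
        }
    }

    // Removes the elements at indexes 1 and 2 from [a, b, c, d, e].
    static List<String> run() {
        RangeList<String> list = new RangeList<>(Arrays.asList("a", "b", "c", "d", "e"));
        list.dropRange(1, 3);
        return new ArrayList<>(list);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each removal goes through the iterator, which in turn calls remove(int) on the subclass, so a list with an expensive remove(int) may prefer to override removeRange.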

6、 Methods overridden by AbstractList

1.equals

The equals() method is a special case. It is declared as an abstract method in the Collection and List interfaces and implemented in AbstractList, but it is really an override of the method in Object. Given this special status, we treat it as an overridden method as well.

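According to the Javadoc, equals() returns true if and only if the specified object is also a list, both lists have the same size, and all corresponding pairs of elements in the two lists are equal.

Now look at the source code: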

public boolean equals(Object o) {
    // Is it the very same list?
    if (o == this)
        return true;
    // Does it implement the List interface?
    if (!(o instanceof List))
        return false;

    // Get an iterator for each list and traverse both in step
    ListIterator<E> e1 = listIterator();
    ListIterator<?> e2 = ((List<?>) o).listIterator();
    while (e1.hasNext() && e2.hasNext()) {
        E o1 = e1.next();
        Object o2 = e2.next();
        // Are the elements at this position equal?
        if (!(o1==null ? o2==null : o1.equals(o2)))
            return false;
    }
    // Do both lists have the same length?
    return !(e1.hasNext() || e2.hasNext());
}

The source code also shows that equals() in AbstractList requires the two lists to match exactly: they must have the same length and order, and the elements at each position must be equal.
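One consequence is that the concrete list class does not matter, only the contents and their order. The following sketch compares a LinkedList (which inherits equals() from AbstractList) with an ArrayList; ListEqualsDemo and linkedEqualsArray are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class ListEqualsDemo {
    // Copies a into a LinkedList and b into an ArrayList, then compares the two lists.
    static boolean linkedEqualsArray(List<String> a, List<String> b) {
        List<String> linked = new LinkedList<>(a);
        List<String> array = new ArrayList<>(b);
        return linked.equals(array);
    }

    public static void main(String[] args) {
        // Same elements in the same order
        System.out.println(linkedEqualsArray(Arrays.asList("a", "b"), Arrays.asList("a", "b")));
        // Same elements in a different order
        System.out.println(linkedEqualsArray(Arrays.asList("a", "b"), Arrays.asList("b", "a")));
        // One list is a prefix of the other
        System.out.println(linkedEqualsArray(Arrays.asList("a", "b"), Arrays.asList("a")));
    }
}
```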

2.hashCode

hashCode() is handled the same way as equals(): AbstractList redefines it.

public int hashCode() {
    int hashCode = 1;
    for (E e : this)
        hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());
    return hashCode;
}

The new calculation combines the hashCode of every element to compute the hashCode of the list. The likely reason is that with the original Object implementation the hash depends only on the instance, not on its contents, so it would not change when the elements change, and two lists that are equal according to equals() would get different values. The method is therefore overridden so that it stays consistent with equals().

We can write a test to see:

List<String> list1 = new ArrayList<>();
list1.add("a");
System.out.println(list1.hashCode()); // 128
list1.add("c");
System.out.println(list1.hashCode()); // 4067

7、 Summary

The List interface extends the Collection interface. Its new methods are characterized mainly by operating on nodes by index: most methods that take an index parameter are added in List.

AbstractList is an abstract class that implements List. It implements most of the methods in the List interface. At the same time, it extends AbstractCollection and reuses some of its implementations. These two abstract classes can be regarded as an embodiment of the template method pattern.

It provides default implementations of the index-based add(), remove() and set() that only throw UnsupportedOperationException.

AbstractList provides two internal iterators, Itr and ListItr. Itr implements the Iterator interface with basic iteration and removal, while ListItr implements ListIterator. On top of the former, it adds methods for adding and modifying elements during iteration and for reverse iteration, and it can be created starting from a specified position.

The SubList of AbstractList can be regarded as a wrapper class around AbstractList. When instantiated, it stores a reference to the outer list instance in a member variable. Its methods still call the methods of the same name on the AbstractList, but index-based calls add the offset to the given index, which produces the "view" behavior.

AbstractList introduces the fail-fast mechanism for concurrent modification. Internally it maintains a member variable modCount, which is zero by default and is incremented by each structural modification. During iteration, the iterator checks whether modCount matches the expected value and throws an exception otherwise. Note that this requires the cooperation of the implementation class, which must increment modCount when implementing methods such as add(). This is also why, for some implementation classes, removing elements through the list during iteration may throw ConcurrentModificationException.

AbstractList overrides the hashCode() method. Instead of using the hashCode value of the instance directly, it traverses the list and computes the hashCode from the hashCode of each element. This ties the hash to the contents, so lists with different contents will generally get different values.

The content of this article comes from the network collection of netizens. It is used as a learning reference. The copyright belongs to the original author.
