Traversal operation of ArrayList and LinkedList
summary
A well-known little knowledge of a java program is that ArrayList and LinkedList are best deleted by iterator rather than traversal.
When we try to delete using the for loop or foreach, some unexpected situations often occur, resulting in the failure of deleting all collections. In this regard, I have always been in a state of knowing what is and why. I just finished reading the source code of ArrayList and LinkedList recently. Today's article summarizes several error deletions of ArrayList and LinkedList in combination with the source code.
1、 Fast fail mechanism of list collection
Before we begin, we need to understand the fast fail mechanism of collections.
The list interface has an abstractlist abstract class, and all implementation classes under the list inherit it directly or indirectly.
Among its member variables, there is a variable called modcount. When the implementation class performs structural operations - generally refers to operations that will affect the underlying data structure, such as deletion - it will be + 1.
When each iterator is created, the member variable expectedmodcount assigned to the iterator by the current modcount will be obtained from the outside, and then each time the iterator's next () method or other addition and deletion methods are called, it will compare whether the modcount and expectedmodcount are equal. If not, it will throw a concurrentmodificationexception.
This concurrent modification check can quickly throw exceptions when problems occur, so as to avoid possible wrong data from entering subsequent operations. This is also the source of most concurrentmodificationexception exceptions in collection operations.
2、 For circular deletion of ArrayList
The remove() of ArrayList can be deleted by subscript or by element. The latter must traverse the collection every time, which is very inefficient. Therefore, only the former, that is, the method of deleting by subscript, is discussed here.
1. Examples
We know that the underlying implementation of ArrayList is array, and it implements the interface of randomaccess. Therefore, it is officially recommended to use the for loop to operate and traverse the collection. However, when we use the for + subscript to delete the elements in the ArrayList, the problem of "missing deletion" will occur.
Let's simulate:
ArrayList<String> list = new ArrayList<>(Arrays.asList("a","b","c","d"));
for (int i = 0; i < list.size(); i++) {
list.remove(i);
}
System.out.println(list); // [b,d]
So, B and D are skipped.
2. Reasons
We can look at the source code of the remove () method in ArrayList:
public boolean remove(Object o) {
if (o == null) {
for (int index = 0; index < size; index++)
if (elementData[index] == null) {
fastRemove(index);
return true;
}
} else {
for (int index = 0; index < size; index++)
if (o.equals(elementData[index])) {
fastRemove(index);
return true;
}
}
return false;
}
We can see that he called fastremove():
private void fastRemove(int index) {
// modCount++
modCount++;
int numMoved = size - index - 1;
// 是否需要移动数组
if (numMoved > 0)
// 拷贝并且移动数组
System.arraycopy(elementData,index+1,elementData,index,numMoved);
elementData[--size] = null;
}
In the fastremove () method, the system. Array copy method is called Arraycopy(), which moves the whole array forward one bit after deleting the position.
Let's restore the deletion process:
Simply put, if I delete the element with index = a, the element with index = a + 1 will run to the position with index = A. when we start the next cycle, we think that the element with indedx = a + 1 is deleted. In fact, the element with index = a + 2 has "offset", which is the reason for "missing deletion".
3. Solutions
There are two ways to avoid this situation:
The first way is to let the index before even one operation--
int size = list.size();
for (int i = 0,j = 0; i < size; i++,j++) {
list.remove(j);
if (j % 2 == 0) {
j--;
}
}
System.out.println(list); // []
In fact, this idea is also the remove () idea of the iterator in ArrayList, but the code written with the for loop is very cumbersome and not easy to understand.
The second method is to delete in reverse order
Let's go back to fastremove(), and we'll see this Code:
int numMoved = size - index - 1;
// 是否需要移动数组
if (numMoved > 0) {
... ...
}
In fact, nummoved = szie - 1 - index determines whether to move the array, that is, as long as the index passed in is greater than or equal to size, it will not cause subscripts, so we can delete it in reverse order:
for (int i = list.size() - 1; i >= 0; i--) {
list.remove(i);
}
System.out.println(list); // []
3、 Foreach deletion of ArrayList
1. Examples
Let's talk about the problem first. ArrayList throws an exception when using foreach() to delete circularly:
ArrayList<String> list = new ArrayList<>(Arrays.asList("a","d"));
try {
list.forEach(list::remove);
} catch (ConcurrentModificationException e) {
System.out.println(list); // [b,c,d]
e.printStackTrace(); // ConcurrentModificationException
}
This code throws an exception when deleting the second element.
2. Reasons
The foreach method of ArrayList comes from the parent interface Iterable of collection. The default display mode of Iterable is to enhance the for loop, and ArrayList overrides this method:
public void forEach(Consumer<? super E> action) {
Objects.requireNonNull(action);
// 获取当前 modCount
final int expectedModCount = modCount;
@SuppressWarnings("unchecked")
final E[] elementData = (E[]) this.elementData;
final int size = this.size;
// 每次循环开始前检查 modCount
for (int i=0; modCount == expectedModCount && i < size; i++) {
action.accept(elementData[i]);
}
if (modCount != expectedModCount) {
throw new ConcurrentModificationException();
}
}
At the beginning of foreach(), expectedmodcount is used to record the modcount at the beginning of the method, and then modcount = = expectedmodcount will be judged at the end of each cycle. Looking back at the remove() method of ArrayList, we can see that modcount + + has been used since fastremove(), so it is not difficult to infer the whole process:
In other words, not only delete, but also all methods that will increase modcount can not be used in foreach().
We can use the add () test to see:
ArrayList<String> list = new ArrayList<>(Arrays.asList("a","d"));
try {
list.forEach(list::add); // [a,b,d,a]
} catch (ConcurrentModificationException e) {
System.out.println(list);
e.printStackTrace(); // ConcurrentModificationException
}
4、 Iterator deletion of ArrayList
There is no problem with iterator deletion, but if iterator is invoked in iterator iteration, there will be a problem.
ArrayList<String> list = new ArrayList<>(Arrays.asList("a","d"));
try {
Iterator<String> iterator = list.iterator();
while (iterator.hasNext()) {
String s = (String) iterator.next();
// 使用非iterator的方法删除
list.remove(s);
}
} catch (ConcurrentModificationException e) {
System.out.println(list); // [b,d]
e.printStackTrace(); // ConcurrentModificationException
}
You can see the throwing exception, but put the list Replace remove() with iterator Remove () is OK.
We can look at the iterator Source code of remove():
public void remove() {
if (lastRet < 0)
throw new IllegalStateException();
checkForComodification();
try {
// 调用外部的remove方法
ArrayList.this.remove(lastRet);
cursor = lastRet;
lastRet = -1;
// 获取最新的 modCount
expectedModCount = modCount;
} catch (indexoutofboundsexception ex) {
throw new ConcurrentModificationException();
}
}
Obviously, compared with directly calling the external remove(), the remove() inside the iterator updates the expectedmodcount after calling the external remove(). This expectedmodcount is a member variable inside the iterator. When the construction method is executed, the modcount is obtained from the outside and assigned to it, Each time the next () method of the iterator is called, expectedmodcount and modcount are compared. If they are not equal, exceptions will be thrown.
So far, the problem is clear. When we do not use the remove () inside the iterator to delete the node, the modcount is updated, but the expectedmodcount. Therefore, a concurrentmodificationexception will be thrown when iterating over the second element.
In other words, just like forEach (), it is not just remove () that causes such problems. During iterator iteration, calling any external way that causes modCount to change will throw it out.
5、 For loop deletion of LinkedList
For circular deletion of LinkedList will also lead to "missing deletion"
LinkedList<String> list = new LinkedList<>(Arrays.asList("a",d]
The same reason as the for circular deletion error of ArrayList is that the index is "offset". However, unlike ArrayList, because the underlying implementation of LinkedList is a linked list, instead of using the arraycopy () method, it directly removes the reference relationship between the front and rear nodes:
public E remove(int index) {
checkElementIndex(index);
return unlink(node(index));
}
E unlink(Node<E> x) {
// assert x != null;
final E element = x.item;
final Node<E> next = x.next;
final Node<E> prev = x.prev;
if (prev == null) {
first = next;
} else {
prev.next = next;
x.prev = null;
}
if (next == null) {
last = prev;
} else {
next.prev = prev;
x.next = null;
}
x.item = null;
size--;
modCount++;
return element;
}
The solution is the same as ArrayList. As long as the index "offset" in the loop is avoided, the two methods of manual index -- and reverse order deletion in ArrayList are also applicable to him.
6、 For each deletion of LinkedList
Foreach () in ArrayList rewrites the foreach () method of Iterable interface, but it is not rewritten in LinkedList, so the foreach () of LinkedList still uses the implementation provided in Iterable interface:
default void forEach(Consumer<? super T> action) {
Objects.requireNonNull(action);
for (T t : this) {
action.accept(t);
}
}
When we use the enhanced for loop to traverse the array, the general for loop with for + subscript will be obtained after the final compilation, and the traversal set will be compiled into an iterator version of the loop. Therefore:
// forEach
list.forEach(list::remove);
// 增强for
for (T t : list) {
list.remove(t);
}
// 迭代器
Iterator<T> iterator = list.listIterator();
while (iterator.hasNext()) {
list.remove(iterator.next());
}
The above three expressions are equivalent. In LinkedList, foreach traversal and iterator traversal are equivalent, and the former is the iterator used in the end. In fact, when we see the list in the iterator Remove () should understand why LinkedList's foreach () throws exceptions.
Just like the iterator deletion of ArrayList, because the external remove() is called, the modcount is changed, but the expectedmodcount is not changed. Therefore, when next() is called, a concurrentmodificationexception will be thrown because expectedmodcount = modcount.
7、 Summary
Why is a concurrentmodificationexception thrown sometimes?
There is a concurrent modification checking mechanism in the list collection. Abstractlist provides a modcount field. When using structural operation methods such as add () or remove (), modcount will be + 1. When the iterator of the list implementation class is created, it will use the member variable expectedmodcount to record the current modcount. Each time next() is called, it will check whether the latest modcount is equal to expectedmodcount. Otherwise, it will throw a concurrentmodificationexception exception.
Therefore, expectedmodcount will be updated synchronously only by calling the method provided inside the iterator, otherwise only modcount will be updated. Therefore, the addition and deletion of ArrayList and LinkedList during iterator iteration will throw exceptions.
ArrayList rewrites the foreach () method from enhanced for to ordinary for loop, but modcount is also recorded at the beginning of the method. Each loop will be compared, so exceptions will be thrown because modcount is changed in the loop.
LinkedList does not override the foreach () method. The underlying layer still uses enhanced for. After compilation, it is still an iterator. Therefore, the reason for throwing exceptions is the same as the operation in the iterator.
Why does an ordinary for circular deletion "Miss deletion"?
The bottom layer of ArrayList deletion is to use the arraycopy method to generate a new array. All elements after the deleted node on the new array will move forward one bit, resulting in the "offset" of the index. Therefore, if a is deleted, the elements of a + 1 will be moved to the position of a. the next deletion of a + 1 is actually a + 2, so a + 1 will be skipped.
LinkedList is a linked list, but deleting a node will also cause the latter node to "fill in" the position corresponding to the subscript of the deleted node. Therefore, there will also be "missed deletion" due to the index "offset".
There are two solutions: one is to keep the index pointing to the current position after deleting the elements, and the other is to delete in reverse order.
In fact, if you add elements, there will be problems. Although you can add them successfully, they will not be inserted in the specified order. This is also because of the above reasons.