[introduction to Java] day28 detailed explanation of Java container class (x) detailed explanation of LinkedHashMap
Today, let's introduce LinkedHashMap, another hash table in the container class. This is the closing disciple of HashMap, who directly inherits the mantle of HashMap, so he has all the characteristics of HashMap and is better than blue. He has some characteristics that HashMap does not have. Next, let's see how capable this closed door disciple is
This article will introduce LinkedHashMap from the following points:
1. Introduction and simple use of LinkedHashMap
2. Structure of LinkedHashMap and comparison with HashMap
3. Insertion and deletion of LinkedHashMap
4. Source code analysis of LinkedHashMap
5. LRU caching based on LinkedHashMap
6. Summary
This article is expected to take 20 minutes. Please allocate your time reasonably. (don't panic, don't panic)
LinkedHashMap introduction and simple use
Let's take a look at the inheritance structure of LinkedHashMap:
LinkedHashMap is a member of the map family and directly inherits from HashMap. Therefore, it has all the characteristics of HashMap, such as efficient search of elements. At the same time, LinkedHashMap maintains the order of internal elements. There are two order modes to maintain the insertion order or access order of elements. Therefore, it is suitable for scenes where the order of elements needs to be maintained. In addition, because LinkedHashMap has the mode of maintaining the access order of elements, it is often used as LRU cache (least recently used cache, i.e. the least recently used cache policy, which will be introduced later).
The structure of LinkedHashMap and its comparison with HashMap
The structure of LinkedHashMap is more complex than that of HashMap. Firstly, it has the same structure as HashMap. Elements are stored in the form of array + linked list. In addition, all elements are connected by double linked list, which is equivalent to the combination of HashMap and LinkedList. It is precisely because linked list connection is used between elements, All can keep the elements in a certain order. However, it also increases the cost of maintaining this linked list. Each time you add and delete elements, you need to adjust not only the HashMap structure, but also the linked list structure. However, the cost of linked list adjustment is not large. In general, it can be ignored except for scenarios with a large amount of data. On the other hand, because all elements are connected by a linked list, the traversal efficiency is slightly higher than that of HashMap. When HashMap traverses, it needs to traverse to the end of the linked list in each bucket, and then to the next bucket. When there are few elements and a large number of empty buckets, there will be many invalid accesses. Therefore, in theory, The traversal efficiency of LinkedHashMap is higher than that of HashMap. Having said so much, it seems that LinkedHashMap is better than HashMap everywhere. Why not use LinkedHashMap?
OK, let's take a look at the nodes in LinkedHashMap:
Mysterious inheritance relationship. After reading this figure, you may think that your circle is really chaotic. The treenode in the parent class inherits from the entry in the child class, and the entry in the child class inherits from the node in the parent class. Don't panic. This naturally makes sense. Let's review the structure of node. Node is an ordinary node class in HashMap. The internal structure is as follows:
The top entry is an internal interface in the map interface. It only specifies the methods that the entry should have, and has no member variables. In LinkedHashMap, the internal class of entry is as follows:
Before and after references are added to form a double linked list together with other entries. The node not only stores the key value pair information, but also uses next, before and after to link the front and rear nodes. The next reference is only used to link the last element in the same bucket in the HashMap structure, while the before and after references are used to link all elements in the LinkedHashMap, Link it into a double linked list. Next, look at a comparison diagram with the HashMap structure, which is very clear:
Let's go back to the previous question. Why should treenode inherit from the entry in the LinkedHashMap instead of directly inheriting from the node? After all, such a linked list property does not need to be used in the HashMap. Adding after and before references will only waste space. First, it is natural to reuse the code. If treenode inherits directly from the node, it will lose the linked list property, After LinkedHashMap inherits from HashMap, it needs to use another internal class to inherit treenode to make it have the characteristics of linked list, and the relevant methods need to be rewritten, which greatly increases the workload and reduces the maintainability of the code. In addition, HashMap will only convert the nodes to treenode when the number of elements in the bucket reaches a certain number, When the hash table hash is good, treenode is rarely used, so this little waste of space is worth it.
LinkedHashMap can be regarded as composed of two parts. One part is completely consistent with the HashMap structure, and the other part is a double linked list. Because LinkedHashMap directly inherits from HashMap, the structure maintenance of hash table is carried out in the code in HashMap, and only the double linked list is maintained in LinkedHashMap. This can also well divide the functions and make the code structure clearer.
Let's feel the difference between the two through the code:
The two hash tables insert the same elements and traverse in the same way. HashMap gets unordered results, while LinkedHashMap gets results consistent with the insertion order. The root of this difference lies in the double linked list in the linked HashMap.
Next, we understand the maintenance process of double linked list through the insertion and deletion process of LinkedHashMap.
Insertion and deletion of LinkedHashMap
If you haven't seen the code, you may think that LinkedHashMap should be implemented by overriding the put and remove methods, but in fact LinkedHashMap does not override the insert and delete methods. This can be found by observing the code structure of LinkedHashMap:
So back to the previous chestnut, since the put method is not overwritten, why does calling the put method in LinkedHashMap get different results from the put method in HashMap? Let's take another look at the implementation of the put method:
The put method calls the putval method. In the last few lines of the putval method, there are two methods worth noting,
afterNodeAccess(e);
afterNodeInsertion(evict);
They are called after the element is accessed and the element is inserted. The two callbacks are not implemented in HashMap. There is only one shell and a LinkedHashMap to cover it.
Let's look at the newnode method overwritten in LinkedHashMap:
Now let's sort out the logic. When inserting a node, the overwritten newnode method will be called to link the inserted element to the tail of the linked list. When deleting a node, connect the front and rear nodes of the node. When the node is accessed, You can put it at the end of the linked list (this feature will be explained later). If you still don't know much about the insertion and deletion of the double linked list, you can go back to the previous description of the LinkedList, which contains a detailed explanation on the structure adjustment of the double linked list. (justifiably lazy)
Source code analysis of LinkedHashMap
Let's take a look at its internal classes. The node class entry has been introduced earlier. Four of the remaining classes are iterators and three are collection classes of keys, values and key value pairs. Let's take a look at the iterator class. In LinkedHashMap, we also implement our own iterators internally. The internal iterators inherit from the linkedhashmeter class. The code of this class is as follows:
This is an abstract class that implements the main methods of the iterator, hashnext, remove and nextnode methods. It is equivalent to a class template. Other iterator classes only need to inherit this class and add a next method.
Well, it's not easy.
The class structure of the remaining three collection views is basically the same as that of the corresponding collection views in HashMap. If you don't believe it, you can compare them:
Simply return the respective internal iterator instances in the iterator method. The rest of the logic is basically the same.
Linkedkeyset and linkedvalues are also simple:
After reading the inner class, let's take a look at its constructors:
It is almost the same as the constructor in HashMap, except that there is a special member variable, accessorder. What is this variable for? It is actually a flag indicating the storage order of elements in LinkedHashMap. As mentioned earlier, elements in LinkedHashMap can be stored in two orders, one is the insertion order of elements, The other is the access order of elements. If accessorder is true, the access order is used, that is, the most recently accessed element is at the end of the list. If accessorder is false, the insertion order is used, that is, the last inserted element is at the end of the list. The default is to use the insertion order. Of course, the specified order can also be used through the last constructor. If you are careful enough, you will find such a method:
The removeeldestentry method returns false by default, so when inserting a node, the previous node will not be deleted by default, but we can change this feature through inheritance.
Implementation of LRU cache based on LinkedHashMap
First, let's make a relatively simple introduction to LRU cache:
LRU (least recently used) algorithm eliminates data according to the historical access record of data. Its core idea is that "if the data has been accessed recently, the probability of being accessed in the future is higher".
It is actually a cache elimination strategy. When the number of caches reaches a threshold, the least recently accessed data will be eliminated first. When we use LinkedHashMap to implement LRU caching, we can override the removeeldestentry method to implement LRU caching of custom policies. For example, we can judge whether to remove the least recently accessed node according to the number of nodes, or whether to remove the node according to the survival time of the node. The cache implemented by chestnut below is based on the strategy of judging whether the number of nodes exceeds the limit. The maximum number of nodes passed in when constructing the cache object. When the number of inserted nodes exceeds the maximum number of nodes, the least recently accessed node is removed. The implementation code is as follows:
The output is as follows:
In the above code, the cache size is set to 3. When 10 key value pairs are inserted into the cache, only the last 3 are saved and the others are removed. Then, by accessing the node with the key value of I8, the node is moved to the last position of the two-way linked list. When we insert a key value pair again, the node with key value i7 will be eliminated.