Java concurrent queues and containers
[Preface: mastering Java high concurrency and multithreading is a necessary skill for both big data practitioners and Java practitioners. This article covers the blocking queues and concurrent containers in the Java concurrency package. Anyone who has studied the source code of big data technologies such as Spark and Storm will find that most of them use Java concurrent queues, synchronized containers, ReentrantLock and so on. I suggest you analyze the relevant source code carefully alongside this article.]
BlockingQueue
Blocking queues live in the java.util.concurrent package and solve the problem of how to pass data between threads safely and efficiently. "Blocking" means that in certain cases the calling thread is suspended and is woken automatically once a condition is met; which behavior you get is chosen through the API.
Common blocking queues are mainly divided into two types: FIFO (first in, first out) and LIFO (last in, first out), and many other kinds of queue can be derived through different implementations. First, get to know the core BlockingQueue methods: put and take are the blocking pair; offer and poll are the non-blocking pair; add is also non-blocking, but throws an exception when the queue is full.
Insert data
put(anObj): adds anObj to the BlockingQueue. If the queue has no space, the calling thread blocks until space becomes available, and then the insert continues.
add(anObj): adds anObj to the BlockingQueue. Returns true if the queue can accommodate it; otherwise throws an IllegalStateException.
offer(anObj): adds anObj to the BlockingQueue if possible. Returns true if the queue can accommodate it; otherwise returns false.
Read data
take(): removes and returns the first object in the BlockingQueue. If the queue is empty, the calling thread blocks until a new object is added.
poll(time, unit): removes and returns the first object in the BlockingQueue. If nothing is available immediately, it waits up to the time given by the parameters and returns null if still nothing can be taken.
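The differences between these methods are easiest to see on a queue of capacity 1. The following is a minimal sketch; the class name BlockingQueueApiDemo and the 50 ms timeout are illustrative choices, not part of the original article.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class BlockingQueueApiDemo {
    // Runs each core method against a queue of capacity 1 and records the outcome.
    static String run() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        StringBuilder trace = new StringBuilder();
        try {
            queue.put("a");                                   // space available: returns immediately
            trace.append("offer=").append(queue.offer("b"));  // queue full: returns false
            try {
                queue.add("c");                               // queue full: throws
            } catch (IllegalStateException e) {
                trace.append(" add=IllegalStateException");
            }
            trace.append(" take=").append(queue.take());      // removes "a"
            // The queue is empty now: wait up to 50 ms, then give up and return null.
            trace.append(" poll=").append(queue.poll(50, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return trace.toString();
    }

    public static void main(String[] args) {
        // prints: offer=false add=IllegalStateException take=a poll=null
        System.out.println(run());
    }
}
```

A second put on the full queue would have blocked the thread instead of returning, which is why the sketch uses offer and add to probe the full state.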
Introduction to the core BlockingQueue implementations
ArrayBlockingQueue
A bounded blocking queue backed by an array. Because it is array-based, positioning is fast; inserting at the tail and removing from the head of its circular array are cheap, while removing an arbitrary element from the middle is slow because the following elements have to be shifted.
Producers and consumers share the same lock, so they cannot run in parallel and efficiency is lower. Underneath it uses a standard mutex, ReentrantLock, which means reads and writes are all mutually exclusive. A constructor argument controls whether the lock is fair; the default is a non-fair lock. Elements are consumed in FIFO order.
When data is produced or consumed, the elements themselves are stored in or removed from the array directly, without creating or destroying any additional object instances.
Application: since production and consumption share one lock and the fixed-length array avoids frequent object creation and destruction, it suits scenarios where tasks should run in queue order and frequent GC is undesirable.
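As an illustration, the sketch below hands numbers from one producer to one consumer through an ArrayBlockingQueue of capacity 2 created with a fair lock. The class name and the capacity are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ArrayBlockingQueueDemo {
    // Transfers 0..count-1 through a bounded queue and returns what the consumer saw.
    static List<Integer> transfer(int count) {
        // Second argument true requests a fair lock; the default is non-fair.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2, true);
        List<Integer> received = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(i); // blocks while the two slots are occupied
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        try {
            for (int i = 0; i < count; i++) {
                received.add(queue.take()); // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(transfer(5)); // prints: [0, 1, 2, 3, 4]
    }
}
```

With a single producer and a single consumer, FIFO ordering guarantees the consumer sees the numbers in the order they were put.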
LinkedBlockingQueue
A blocking queue backed by a linked list, with the usual linked-list characteristics: fast insertion and removal, slow positioning.
One thing to note: a LinkedBlockingQueue created with the default constructor has a capacity of Integer.MAX_VALUE. In that case, if producers are faster than consumers, system memory may be exhausted long before the queue is full and blocking kicks in. You can avoid this extreme situation by creating the LinkedBlockingQueue with an explicit capacity.
Although it also uses ReentrantLock underneath, take and put use separate locks (the production lock and the consumption lock are not the same lock), so under high concurrency it is still more efficient than ArrayBlockingQueue. The put method blocks when the queue is full until an element is consumed, and the take method blocks when the queue is empty until an element is put in.
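The capacity difference can be checked directly. This is a small sketch with an illustrative class name; it uses only remainingCapacity() and offer() from the standard API.

```java
import java.util.concurrent.LinkedBlockingQueue;

class LinkedBlockingQueueCapacityDemo {
    // Remaining capacity of an empty queue built with the default constructor.
    static int defaultCapacity() {
        return new LinkedBlockingQueue<String>().remainingCapacity();
    }

    // Tries to put a third element into a queue bounded at 2.
    static boolean offerBeyondCapacity() {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(2);
        queue.offer("a");
        queue.offer("b");
        return queue.offer("c"); // the bound is reached, so this returns false
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity() == Integer.MAX_VALUE); // prints: true
        System.out.println(offerBeyondCapacity());                  // prints: false
    }
}
```

With the explicit bound, a put in place of the last offer would block the producer rather than let the queue grow without limit.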
DelayQueue
DelayQueue is a queue with no size limit, so the operation that inserts data (the producer) never blocks; only the operation that retrieves data (the consumer) can block. An element can be taken from a DelayQueue only after its specified delay has expired.
Application scenarios:
1. Clients holding a connection for a long time: connections that exceed the idle time can be removed.
2. Cache entries that go unused for a long time: objects in the queue that exceed the idle time are removed.
3. Task timeout handling.
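The idle-timeout scenarios above could be sketched as follows. ExpiringSession is a hypothetical element type invented for this example, and the 50 ms and 200 ms delays are arbitrary; the only requirement from the JDK is that elements implement Delayed.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class DelayQueueDemo {
    // Hypothetical element: a session that becomes available once its delay has elapsed.
    static final class ExpiringSession implements Delayed {
        final String id;
        final long expiresAtNanos;

        ExpiringSession(String id, long delayMillis) {
            this.id = id;
            this.expiresAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            // Remaining delay; zero or negative means the element has expired.
            return unit.convert(expiresAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    // Inserts two sessions and returns their ids in the order they expire.
    static String run() {
        DelayQueue<ExpiringSession> queue = new DelayQueue<>();
        queue.put(new ExpiringSession("slow", 200)); // never blocks: the queue is unbounded
        queue.put(new ExpiringSession("fast", 50));
        StringBuilder order = new StringBuilder();
        try {
            // take() blocks until the head element's delay has expired.
            order.append(queue.take().id).append(",").append(queue.take().id);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints: fast,slow
    }
}
```

The insertion order does not matter: the element with the shortest remaining delay is always at the head.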
PriorityBlockingQueue
PriorityBlockingQueue never blocks data producers; it only blocks data consumers when there is nothing to consume. Producers therefore must not produce data faster than consumers can consume it, otherwise over time all available heap memory will eventually be exhausted.
Elements added to a PriorityBlockingQueue define their priority by implementing the Comparable interface and overriding compareTo(); alternatively, a Comparator can be passed to the constructor. The internal lock that controls thread synchronization was a fair lock in older JDK versions; recent JDK versions create a default (non-fair) ReentrantLock instead.
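A minimal sketch of priority ordering through Comparable follows. The Job class and its fields are hypothetical, chosen only to illustrate the point; lower numbers are treated as higher priority here.

```java
import java.util.concurrent.PriorityBlockingQueue;

class PriorityBlockingQueueDemo {
    // Hypothetical task type ordered by its priority field (smaller value first).
    static final class Job implements Comparable<Job> {
        final String name;
        final int priority;

        Job(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }

        @Override
        public int compareTo(Job other) {
            return Integer.compare(priority, other.priority);
        }
    }

    // Inserts jobs out of order and returns the names in the order they are polled.
    static String drainOrder() {
        PriorityBlockingQueue<Job> queue = new PriorityBlockingQueue<>();
        queue.put(new Job("report", 3)); // put never blocks: the queue is unbounded
        queue.put(new Job("alert", 1));
        queue.put(new Job("backup", 2));

        StringBuilder order = new StringBuilder();
        Job job;
        while ((job = queue.poll()) != null) { // poll returns null once the queue is empty
            if (order.length() > 0) {
                order.append(",");
            }
            order.append(job.name);
        }
        return order.toString();
    }

    public static void main(String[] args) {
        System.out.println(drainOrder()); // prints: alert,backup,report
    }
}
```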
SynchronousQueue
An unbuffered hand-off queue with no internal capacity: each element is passed directly from a producer to a consumer, so a put waits until another thread takes, and a take waits until another thread puts. Nothing is ever stored in the queue. For a small number of tasks this direct hand-off is actually more efficient.
A SynchronousQueue can be declared in two ways, fair mode and unfair mode:
Fair mode: the SynchronousQueue applies a fair policy, using a FIFO queue to hold the waiting producers and consumers, which gives an overall fair strategy.
Unfair mode (the SynchronousQueue default): the SynchronousQueue applies an unfair policy, using a LIFO structure to manage the waiting producers and consumers. In this mode, if producers and consumers differ in processing speed, starvation can easily occur, that is, the data of some producers or consumers may never be processed.
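The hand-off behavior can be observed with a short sketch. The class name is illustrative; passing true to the constructor selects fair mode, and the no-argument constructor selects the default unfair mode.

```java
import java.util.concurrent.SynchronousQueue;

class SynchronousQueueDemo {
    static String run() {
        SynchronousQueue<String> queue = new SynchronousQueue<>(true); // true = fair mode

        // No consumer is waiting yet, so a non-blocking offer cannot hand off and fails.
        boolean offeredWithoutConsumer = queue.offer("lost");

        Thread producer = new Thread(() -> {
            try {
                queue.put("task"); // blocks until a consumer takes the element
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        String received = null;
        try {
            received = queue.take(); // meets the producer and receives the element directly
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // size() is always 0 because the queue never stores anything.
        return offeredWithoutConsumer + "," + received + "," + queue.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints: false,task,0
    }
}
```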
ConcurrentLinkedQueue
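A lock-free, unbounded queue. It is not a blocking queue: poll returns null immediately when the queue is empty. In high-concurrency scenarios its efficiency is much higher than that of ArrayBlockingQueue, LinkedBlockingQueue and the like.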
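A small sketch of concurrent use: several threads offer elements at the same time without any external locking. The class name and the thread and element counts are arbitrary choices for the example.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class ConcurrentLinkedQueueDemo {
    // Starts the given number of threads, each offering perThread elements, and returns the final size.
    static int fill(int threads, int perThread) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    queue.offer(i); // never blocks and always succeeds: the queue is unbounded
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return queue.size();
    }

    // poll on an empty queue returns null instead of waiting.
    static boolean pollOnEmptyReturnsNull() {
        return new ConcurrentLinkedQueue<String>().poll() == null;
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 1000));            // prints: 4000
        System.out.println(pollOnEmptyReturnsNull()); // prints: true
    }
}
```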
Containers
Synchronized containers
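The first category: Vector, Stack and Hashtable are synchronized, thread-safe classes, but problems can still occur in high-concurrency scenarios, such as ConcurrentModificationException, because compound operations like modifying the container while iterating over it are not atomic.
The second category: the static factory methods provided by Collections (for example Collections.synchronizedList), which are inefficient because every method call synchronizes on a single lock.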
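Both categories can be illustrated briefly. The sketch below, with an illustrative class name, triggers ConcurrentModificationException on a Vector even from a single thread, and shows the manual locking that iteration over a Collections.synchronizedList wrapper requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Vector;

class SynchronizedContainerDemo {
    // Each Vector method is synchronized, but iterate-and-remove is a compound operation.
    static boolean vectorThrowsOnModifyDuringIteration() {
        Vector<Integer> vector = new Vector<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer value : vector) {
                if (value == 2) {
                    vector.remove(value); // structural change behind the iterator's back
                }
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    // Iterating over a synchronized wrapper must hold the wrapper's lock manually.
    static int sumUnderLock() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>(Arrays.asList(1, 2, 3)));
        int sum = 0;
        synchronized (list) {
            for (int value : list) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(vectorThrowsOnModifyDuringIteration()); // prints: true
        System.out.println(sumUnderLock());                        // prints: 6
    }
}
```

In a multithreaded program the same exception can be raised when one thread iterates while another thread modifies the container.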
Concurrent containers
CopyOnWrite containers
A CopyOnWrite container is a copy-on-write container: when we add an element, we do not add it to the current container directly. Instead, we first copy the current container into a new one, add the element to the new container, and then point the original container's reference at the new container. It is very well suited to scenarios with many reads and few writes.
However, it has the following problems:
Data consistency: a CopyOnWrite container is weakly consistent, that is, it only guarantees eventual consistency of the data, not real-time consistency. So if you need written data to be readable immediately, do not use a CopyOnWrite container.
Memory usage: because of the copy-on-write mechanism, two copies, the old object and the newly written object, reside in memory at the same time during a write. If these objects are large and writes are not kept under control, for example in write-heavy scenarios, frequent young GC and full GC are likely. To address memory usage, you can reduce the footprint of large objects by compressing the elements in the container, or use another concurrent container such as ConcurrentHashMap instead.
There are two common CopyOnWrite containers: CopyOnWriteArrayList and CopyOnWriteArraySet. CopyOnWriteArrayList is a thread-safe variant of ArrayList.
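The weak consistency described above can be seen with an iterator, which keeps reading the array that existed when the iterator was created. This is a minimal sketch with an illustrative class name.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CopyOnWriteDemo {
    // Returns "<elements seen by the old iterator>,<current list size>".
    static String run() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));

        Iterator<String> snapshot = list.iterator(); // bound to the current backing array
        list.add("c");                               // copies the array and swaps the reference

        int seen = 0;
        while (snapshot.hasNext()) { // no ConcurrentModificationException is thrown
            snapshot.next();
            seen++;
        }
        return seen + "," + list.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints: 2,3
    }
}
```

The iterator sees only the two original elements, while the list itself already contains three.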
ConcurrentHashMap
The author describes ConcurrentHashMap in two parts, JDK 1.7 and JDK 1.8.
JDK 1.7 ConcurrentHashMap
JDK 1.7 uses "lock segmentation" to reduce lock granularity. It divides the whole map into a series of units called segments, each of which is comparable to a small Hashtable. The locked object thus changes from the entire map to a single segment. ConcurrentHashMap is thread safe and still performs well because reads are concurrent and need no lock; locking happens only during put and remove. A lock covers only the segment being operated on and does not affect reads and writes on other segments, so different segments can be used concurrently, which greatly improves performance.
According to the source code, lookup, insertion and deletion follow the same path: determine the segment from the hash of the key (on insertion, the segment is resized if its size has reached the resize threshold) -> determine the index in the segment's HashEntry array (on insertion or deletion, obtain the head of the chain) -> traverse the linked list. Lookup: compare keys with equals() and read the node whose key matches. Insert: if a node with the same key is found, update its value; otherwise insert a new node. Delete: once the node to delete is found, build a new list starting from the node after it, then copy the nodes that preceded the deleted node one by one onto the front of that list, and finally store the new head in the array slot to replace the old list.
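To make the idea of lock segmentation concrete, here is a deliberately simplified sketch. It is not the JDK 1.7 source: StripedMap is a hypothetical class that guards each plain HashMap segment with its own ReentrantLock, and unlike the real implementation it also takes the lock for reads.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of lock segmentation, not the JDK implementation.
class StripedMap<K, V> {
    private final Map<K, V>[] segments;
    private final ReentrantLock[] locks;

    @SuppressWarnings("unchecked")
    StripedMap(int segmentCount) {
        segments = (Map<K, V>[]) new Map[segmentCount];
        locks = new ReentrantLock[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new HashMap<>();
            locks[i] = new ReentrantLock();
        }
    }

    // The key's hash selects the segment, so different keys usually hit different locks.
    private int segmentIndex(Object key) {
        return (key.hashCode() & 0x7fffffff) % segments.length;
    }

    V put(K key, V value) {
        int i = segmentIndex(key);
        locks[i].lock(); // only this segment is locked; the others stay available
        try {
            return segments[i].put(key, value);
        } finally {
            locks[i].unlock();
        }
    }

    V get(Object key) {
        int i = segmentIndex(key);
        locks[i].lock(); // simplification: the real JDK 1.7 map reads without locking
        try {
            return segments[i].get(key);
        } finally {
            locks[i].unlock();
        }
    }

    // Sums the segment sizes one by one; exact only when no writer is running.
    int size() {
        int total = 0;
        for (int i = 0; i < segments.length; i++) {
            locks[i].lock();
            try {
                total += segments[i].size();
            } finally {
                locks[i].unlock();
            }
        }
        return total;
    }

    // Fills a map from several threads with distinct keys and returns the final size.
    static int concurrentFill(int threads, int perThread) {
        StripedMap<String, Integer> map = new StripedMap<>(16);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(id + "-" + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return map.size();
    }
}
```

Calling StripedMap.concurrentFill(4, 1000) should return 4000: writers whose keys fall into different segments never wait for each other, and no update is lost.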
JDK 1.8 ConcurrentHashMap
The ConcurrentHashMap in JDK 1.8 makes many optimizations over the JDK 1.7 version:
1. The segments field is removed. Data is stored directly in transient volatile Node<K,V>[] table (Node replaces the HashEntry type of JDK 1.7), and the elements of the table array serve as locks, so each bucket is locked individually. Reducing the lock granularity further lowers the probability of concurrent conflicts.
2. The original structure of table array + linked list becomes table array + linked list + red-black tree. The core strength of a hash table is that keys are spread evenly across the array after hashing. If hashing is very uniform, the length of each list in the table array is mostly 0 or 1. Reality is not always so ideal: although the default load factor of ConcurrentHashMap is 0.75, with very large amounts of data or bad luck some lists still become too long. With a singly linked list, looking up a node takes O(n) time; therefore, for lists longer than 8 (the default threshold), JDK 1.8 switches to a red-black tree, which reduces lookup time to O(log n) and improves performance.
3. A new field, transient volatile CounterCell[] counterCells, makes it cheap to compute the total number of elements in the map, with much better performance than the size() method in JDK 1.7.
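From the caller's side, these changes show up as cheap atomic per-key updates and a cheap element count. The sketch below counts words from several threads with merge(), which applies its function atomically per key; the class name and the word list are made up for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentHashMapDemo {
    // Each thread adds perThread increments, cycling over three keys.
    static Map<String, Integer> count(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        String[] words = {"spark", "storm", "flink"};

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // Atomic read-modify-write for this key; other keys are not blocked.
                    counts.merge(words[i % words.length], 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(4, 300);
        System.out.println(counts.get("spark")); // prints: 400
        System.out.println(counts.size());       // prints: 3
    }
}
```

A plain get followed by put would lose increments under contention; merge keeps the whole update inside the map's own per-bucket synchronization.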
I believe that after this introduction you will have a good answer to questions such as "Why choose ConcurrentHashMap?"