Quickly understand the core components of Java NIO
Background knowledge
Synchronous, asynchronous, blocking, non-blocking
These concepts are easy to confuse, but NIO involves all of them, so let's summarize them first.
Synchronous: when the API call returns, the caller already knows the result of the operation (how many bytes were actually read or written).
Asynchronous: in contrast, when the API call returns, the caller does not yet know the result of the operation; the result is delivered later through a callback.
Blocking: when no data is available to read, or the data cannot all be written, the current thread is suspended and waits.
Non-blocking: a read takes as much data as is currently available and then returns; a write writes as much data as it currently can and then returns.
For I/O operations, according to the documentation on the Oracle website, the criterion that separates synchronous from asynchronous is "whether the caller needs to wait for the I/O operation to complete". "Waiting for the I/O operation to complete" does not mean waiting until some data has been read or written. It refers to the period when the I/O operation is actually carried out, for example while data is transferred between the TCP/IP protocol stack buffer and the JVM buffer, and to whether the caller has to wait during that period.
Therefore, the read() and write() methods we commonly use are synchronous I/O. Synchronous I/O has a blocking mode and a non-blocking mode. In non-blocking mode, the call returns immediately when it detects that no data is readable, without actually performing an I/O operation.
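The non-blocking behavior described above can be observed with a small loopback test. This is only a sketch: the class and method names are made up for illustration, and it assumes a free port on the loopback interface is available.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; not part of any library.
class NonBlockingReadDemo {

    // Returns what read() reports on a non-blocking channel when the peer has sent nothing.
    static int readWithoutData() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the operating system for any free port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                ByteBuffer buffer = ByteBuffer.allocate(16);
                // In blocking mode this call would suspend the thread until data arrives.
                // In non-blocking mode it returns immediately with 0 bytes read.
                return accepted.read(buffer);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes read: " + readWithoutData());
    }
}
```

The call is still synchronous: when read() returns, the caller knows the result (zero bytes), it just did not wait for data to show up.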
To sum up, Java has three mechanisms: synchronous blocking I/O, synchronous non-blocking I/O, and asynchronous I/O. This article covers the first two. Asynchronous I/O was introduced in JDK 1.7 and is known as NIO.2.
Traditional I/O
A new technology always emerges as an improvement on what came before, and Java NIO is no exception.
Traditional I/O is blocking I/O, and its main problem is wasted system resources. For example, to read data from a TCP connection we call the read() method of InputStream, which suspends the current thread until data arrives. While waiting, the thread occupies memory (for its thread stack) yet does nothing; it holds a resource without using it. To read data from other connections, we have to start more threads. With few concurrent connections this may not be a problem, but once the number of connections reaches a certain scale, the large number of threads consumes a large amount of memory. In addition, a thread switch has to save and restore processor state, such as the values of the program counter and the registers, so switching very frequently among a large number of threads is also a waste of resources.
As technology developed, modern operating systems came to provide new I/O mechanisms that avoid this waste, and Java NIO was built on them. The signature feature of NIO is non-blocking I/O. However, non-blocking I/O alone does not solve the problem: in non-blocking mode, read() returns immediately when no data is available, and we do not know when data will arrive, so we can only keep calling read() in a retry loop, which obviously wastes CPU. The Selector component is designed to solve this problem.
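As a rough illustration of how a Selector replaces the retry loop, the sketch below registers non-blocking channels with a Selector and lets select() wait until one of them is ready. The class name, the single-threaded loopback setup, and the 256-byte message limit are assumptions made to keep the example short; a real server would keep the loop running and handle many connections.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical demo class; not part of any library.
class SelectorDemo {

    // Sends one message over loopback and receives it through a Selector.
    // Assumption: the encoded message is shorter than 256 bytes.
    static String receiveOneMessage(String message) throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                client.shutdownOutput(); // end of stream marks the end of the message

                ByteBuffer received = ByteBuffer.allocate(256);
                while (true) {
                    selector.select(); // sleeps until a registered channel is ready
                    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                    while (keys.hasNext()) {
                        SelectionKey key = keys.next();
                        keys.remove();
                        if (key.isAcceptable()) {
                            SocketChannel accepted = server.accept();
                            if (accepted != null) {
                                accepted.configureBlocking(false);
                                accepted.register(selector, SelectionKey.OP_READ);
                            }
                        } else if (key.isReadable()) {
                            SocketChannel channel = (SocketChannel) key.channel();
                            int n = channel.read(received);
                            // Stop at end of stream, or when the demo buffer is full.
                            if (n == -1 || !received.hasRemaining()) {
                                channel.close();
                                received.flip();
                                return StandardCharsets.UTF_8.decode(received).toString();
                            }
                        }
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(receiveOneMessage("hello"));
    }
}
```

The thread calls read() only after the Selector has reported the channel as readable, so no CPU time is spent polling channels that have nothing to deliver.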
Java NIO core components
1. Channel
Concept
All I/O operations in Java NIO are based on Channel objects, just as stream operations are based on Stream objects, so it is necessary to understand what a Channel is first. The following is quoted from the JDK 1.8 documentation:
A channel represents an open connection to an entity such as a hardware device, a file, a network socket, or a program component that is capable of performing one or more distinct I/O operations, for example reading or writing.
As this shows, a Channel represents a connection to an entity, which can be a file, a network socket, and so on. In other words, a Channel is the bridge that Java NIO provides between our programs and the underlying I/O services of the operating system.
Channel is a very basic, abstract description. Interacting with different I/O services and performing different I/O operations requires different implementations, hence FileChannel, SocketChannel, and so on.
A Channel is used much like a Stream: you can read data from a Channel into a Buffer, or write the data in a Buffer to a Channel.
Of course, there are differences, mainly reflected in the following two points:
A Channel can both read and write, whereas a Stream is unidirectional (hence the split into InputStream and OutputStream)
A Channel has a non-blocking I/O mode
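The first difference can be sketched with a single FileChannel that is opened for both reading and writing. The class name and the temporary file are assumptions for illustration only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of any library.
class BidirectionalChannelDemo {

    // Writes text through a channel, then reads it back through the same channel.
    static String writeThenRead(String text) throws IOException {
        Path file = Files.createTempFile("channel-demo", ".txt");
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ByteBuffer out = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (out.hasRemaining()) {
                channel.write(out);
            }

            channel.position(0); // move the channel back to the start of the file
            ByteBuffer in = ByteBuffer.allocate((int) channel.size());
            while (in.hasRemaining() && channel.read(in) != -1) {
                // keep reading until the buffer is full or the file ends
            }
            in.flip();
            return StandardCharsets.UTF_8.decode(in).toString();
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeThenRead("one channel, both directions"));
    }
}
```

With streams, the same round trip would need a separate OutputStream and InputStream.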
Implementations
The most commonly used Channel implementations in Java NIO are listed below. They correspond one-to-one to the traditional I/O classes.
FileChannel: reads and writes files
DatagramChannel: network communication over UDP
SocketChannel: network communication over TCP
ServerSocketChannel: listens for TCP connections
2. Buffer
The buffer used in NIO is not a plain byte array but an encapsulated Buffer class whose API lets us manipulate data flexibly, as described below.
Corresponding to the Java primitive types, NIO provides several Buffer types, such as ByteBuffer, CharBuffer, and IntBuffer. They differ in the unit of reading and writing: each one reads and writes in units of its corresponding type.
Three variables in Buffer are the key to understanding how it works:
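A small sketch of what "unit" means here: the same write advances position by a different amount depending on the Buffer type. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

// Hypothetical demo class; not part of any library.
class TypedBufferDemo {

    // Returns { ByteBuffer position after putInt, IntBuffer position after put,
    //           capacity in ints of an 8-byte buffer viewed as an IntBuffer }.
    static int[] unitSizes() {
        ByteBuffer bytes = ByteBuffer.allocate(8);
        bytes.putInt(42);                 // a ByteBuffer counts in bytes: position moves by 4
        int bytePosition = bytes.position();

        IntBuffer ints = IntBuffer.allocate(8);
        ints.put(42);                     // an IntBuffer counts in ints: position moves by 1
        int intPosition = ints.position();

        // Viewing 8 bytes as ints gives room for two elements.
        int viewCapacity = ByteBuffer.allocate(8).asIntBuffer().capacity();

        return new int[] {bytePosition, intPosition, viewCapacity};
    }

    public static void main(String[] args) {
        int[] sizes = unitSizes();
        System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]);
    }
}
```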
Capacity (total capacity)
Position (current position of pointer)
Limit (read / write boundary position)
A Buffer works much like a character array in C. By analogy, capacity is the total length of the array, position is the index variable we use to read or write characters, and limit is the position of the terminator. The initial state of the three variables is shown in the following figure: position is 0 and limit equals capacity.
While reading from or writing to a Buffer, position advances, and limit is the boundary that position cannot pass. So when writing to a Buffer, limit should be set to capacity, and when reading from a Buffer, limit should be set to the actual end of the data. (Note: writing Buffer data to a Channel is a Buffer read, and reading data from a Channel into a Buffer is a Buffer write.)
Before reading from or writing to a Buffer, we can call helper methods of the Buffer class to set position and limit correctly, mainly the following:
flip(): sets limit to position, then sets position to 0. Call it before reading from the Buffer.
rewind(): only sets position to 0. It is usually called before re-reading the data in the Buffer, for example when writing the same Buffer data to several Channels.
clear(): returns to the initial state, that is, limit equals capacity and position is 0. Call it before writing to the Buffer again.
compact(): moves the unread data (the data between position and limit) to the beginning of the Buffer, sets position to the slot just after that data, and sets limit to capacity. In effect, it is as if that data had just been written to the Buffer again.
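The effect of these helper methods on position and limit can be traced step by step. The following is a minimal sketch with an 8-byte ByteBuffer; the class name and the sample values are arbitrary.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of any library.
class BufferHelpersDemo {

    private static String state(String step, ByteBuffer b) {
        return step + ": position=" + b.position() + " limit=" + b.limit();
    }

    static List<String> trace() {
        List<String> states = new ArrayList<>();
        ByteBuffer buffer = ByteBuffer.allocate(8);
        states.add(state("allocate", buffer));  // position=0 limit=8

        buffer.put(new byte[] {10, 20, 30, 40, 50});
        states.add(state("put x5", buffer));    // position=5 limit=8

        buffer.flip();                          // switch from writing to reading
        states.add(state("flip", buffer));      // position=0 limit=5

        buffer.get();
        buffer.get();
        states.add(state("get x2", buffer));    // position=2 limit=5

        buffer.rewind();                        // read the same data again
        states.add(state("rewind", buffer));    // position=0 limit=5

        buffer.get();
        buffer.get();
        buffer.compact();                       // 30, 40, 50 move to indexes 0..2
        states.add(state("compact", buffer));   // position=3 limit=8

        buffer.clear();                         // ready for a fresh write
        states.add(state("clear", buffer));     // position=0 limit=8
        return states;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

Note that clear() and compact() do not erase anything; they only move the bookkeeping variables (and, for compact(), the unread bytes).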
Next, let's look at an example that uses FileChannel to read and write a text file. It demonstrates that a Channel is both readable and writable, and shows the basic usage of Buffer (note that FileChannel cannot be set to non-blocking mode).
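A minimal sketch of such an example might look like the following. The class name, method signature, file path, and buffer size are assumptions for illustration, and the buffer size is assumed to be at least 4 bytes so that one complete UTF-8 sequence always fits.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of any library.
class FileChannelTextDemo {

    // Writes text to the file, then reads it back in chunks of bufferSize bytes.
    // Assumption: bufferSize >= 4.
    static String writeAndReadBack(Path file, String text, int bufferSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Writing to the channel drains the buffer, so it is a buffer read.
            ByteBuffer source = StandardCharsets.UTF_8.encode(text);
            while (source.hasRemaining()) {
                channel.write(source);
            }
            channel.position(0);

            CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
            ByteBuffer bytes = ByteBuffer.allocate(bufferSize);
            CharBuffer chars = CharBuffer.allocate(bufferSize);
            StringBuilder result = new StringBuilder();

            // Reading from the channel fills the buffer, so it is a buffer write.
            while (channel.read(bytes) != -1) {
                bytes.flip();                        // prepare to read the bytes
                chars.clear();                       // prepare to write the chars
                decoder.decode(bytes, chars, false); // false: more input may follow
                chars.flip();
                result.append(chars);
                bytes.compact();                     // keep any incomplete trailing bytes
            }

            // End of file: let the decoder finish whatever is left.
            bytes.flip();
            chars.clear();
            decoder.decode(bytes, chars, true);
            decoder.flush(chars);
            chars.flip();
            result.append(chars);
            return result.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nio-demo", ".txt");
        try {
            System.out.println(writeAndReadBack(file, "Hello, NIO", 4));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A deliberately small buffer size such as 4 makes multi-byte characters straddle chunk boundaries on almost every read, which exercises the compact() call.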
In this example two buffers are used: a ByteBuffer serves as the data buffer for Channel reads and writes, and a CharBuffer stores the decoded characters. clear() and flip() are used as described above. Note that the final compact() call is required even if the CharBuffer is large enough to hold everything decoded from the ByteBuffer. This is because the UTF-8 encoding of a common Chinese character takes 3 bytes, so a character is very likely to be cut off in the middle at the end of the buffer. See the following figure:
When the decoder reads 0xe4 at the end of the buffer, it cannot map it to a Unicode character. Passing false as the third argument of decode() tells the decoder that more input may follow, so it leaves the incomplete bytes unconsumed instead of treating them as an error. decode() therefore stops here, and position falls back to the 0xe4 byte. This leaves the first byte of the encoding of the character "中" in the buffer, and it must be compacted to the front so that it joins up correctly with the data that follows. For character encoding, you can refer to the example-based explanation of encoding concepts such as ANSI, Unicode, BMP, and UTF.
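The truncation case can be reproduced in memory without a file. The sketch below feeds the decoder the letter 'a' followed by only the first byte of "中" (UTF-8: 0xE4 0xB8 0xAD), then supplies the remaining two bytes after compact(). The class name and the trace format are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of any library.
class TruncatedDecodeDemo {

    static List<String> trace() {
        List<String> log = new ArrayList<>();
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer bytes = ByteBuffer.allocate(8);
        CharBuffer chars = CharBuffer.allocate(8);
        StringBuilder decoded = new StringBuilder();

        // First chunk: 'a' plus the first byte of a 3-byte character.
        bytes.put((byte) 'a').put((byte) 0xE4);
        bytes.flip();
        decoder.decode(bytes, chars, false);
        // The decoder consumed 'a' and stopped in front of the incomplete 0xE4.
        log.add("position after decode: " + bytes.position());
        chars.flip();
        decoded.append(chars);
        chars.clear();

        bytes.compact(); // 0xE4 moves to index 0, the next write goes to index 1
        log.add("position after compact: " + bytes.position());

        // Second chunk: the two missing bytes arrive.
        bytes.put((byte) 0xB8).put((byte) 0xAD);
        bytes.flip();
        decoder.decode(bytes, chars, false);
        chars.flip();
        decoded.append(chars);
        log.add("decoded: " + decoded);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

If clear() were called instead of compact(), the 0xE4 byte would be overwritten by the next chunk and the character could no longer be decoded.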