Thoughts on troubleshooting an online redis class conversion exception

2021-01-30 • Java

Colleagues previously reported that they had hit a redis deserialization exception in production. The exception is as follows:

Known information is as follows:

Because the exception only happens occasionally, I first checked whether the business logic that reported it was at fault, but reading it over did not turn up the cause. The corresponding logs showed that the exception only occurred after a redis read timeout. I therefore suspected the redis client logic (the company's architecture group has wrapped redis in its own layer). The code that obtains and releases the redis connection is as follows:

The preliminary conclusion is that a connection whose read or write timed out is returned straight to the connection pool, so the next caller to use that connection reads the reply redis sent for the previous request. To verify this locally, the example code is as follows:
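The suspected failure mode can also be modelled without a redis server. The following is a minimal sketch, not the original verification code: FakeConnection and StaleReplyDemo are hypothetical names, and the "connection" is just a queue of replies that have been sent but not yet read.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a pooled redis connection: replies come back in
// the order the commands were sent, and a timed-out read leaves its reply unread.
class FakeConnection {
    private final Deque<String> unreadReplies = new ArrayDeque<>();

    // Send a command. The simulated server always queues a reply; when the
    // read "times out", that reply stays in the buffer and the caller gets an exception.
    String call(String command, boolean simulateReadTimeout) {
        unreadReplies.addLast("reply-for-" + command);
        if (simulateReadTimeout) {
            throw new IllegalStateException("Read timed out");
        }
        return unreadReplies.pollFirst();
    }
}

class StaleReplyDemo {
    static String run() {
        Deque<FakeConnection> pool = new ArrayDeque<>();
        pool.push(new FakeConnection());

        // Caller A: the read times out, yet the connection goes back to the pool.
        FakeConnection conn = pool.pop();
        try {
            conn.call("GET key1", true);
        } catch (IllegalStateException e) {
            // the timeout is handled by the caller, nothing else happens here
        } finally {
            pool.push(conn); // the bug: an out-of-sync connection is reused
        }

        // Caller B: borrows the same connection and receives caller A's reply.
        FakeConnection reused = pool.pop();
        return reused.call("GET key2", false);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "reply-for-GET key1"
    }
}
```

Caller B asked for key2 but is handed the reply for key1, which is the situation that later surfaces as a deserialization failure.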

The connection timeout is set to 2000 ms. To make testing easier, you can use GDB to interrupt the redis process on the redis server (if redis is deployed on Linux, you can also use the iptables command to drop the returned packets at the firewall). For example, interrupt the redis process with GDB just before the jedis.get("key1".getBytes()) call executes. The read then times out, and the following exception is triggered:

Now that the cause is known and the problem has been reproduced locally, the fix is to close the connection when an exception occurs instead of returning it to the connection pool as-is (jedis.close already makes this check internally). The code is as follows:
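The idea behind the fix can be sketched in plain Java. This is an illustrative model under stated assumptions, not the Jedis implementation or the author's code: TrackedConnection, SafePool and SafeReleaseDemo are hypothetical names. The connection records that a read failed, and the pool refuses to reuse a connection in that state.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical connection that remembers when its reply stream is out of sync.
class TrackedConnection {
    private final Deque<String> unreadReplies = new ArrayDeque<>();
    private boolean broken = false;

    String call(String command, boolean simulateReadTimeout) {
        unreadReplies.addLast("reply-for-" + command);
        if (simulateReadTimeout) {
            broken = true; // an unread reply is still pending on this connection
            throw new IllegalStateException("Read timed out");
        }
        return unreadReplies.pollFirst();
    }

    boolean isBroken() {
        return broken;
    }
}

// Hypothetical pool: only healthy connections are kept for reuse.
class SafePool {
    private final Deque<TrackedConnection> idle = new ArrayDeque<>();

    TrackedConnection borrow() {
        TrackedConnection c = idle.pollFirst();
        return c != null ? c : new TrackedConnection();
    }

    void release(TrackedConnection c) {
        if (!c.isBroken()) {
            idle.addFirst(c);
        }
        // a broken connection is dropped here; a real pool would also close its socket
    }
}

class SafeReleaseDemo {
    static String run() {
        SafePool pool = new SafePool();

        TrackedConnection a = pool.borrow();
        try {
            a.call("GET key1", true);
        } catch (IllegalStateException e) {
            // timeout handled by the caller
        } finally {
            pool.release(a); // discarded because it is broken
        }

        TrackedConnection b = pool.borrow(); // a fresh connection
        try {
            return b.call("GET key2", false);
        } finally {
            pool.release(b);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "reply-for-GET key2"
    }
}
```

Putting the health check inside release means every caller gets the protection without having to remember a special error path.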

At this point the problem is solved. Note that because Hessian serialization is used (it carries type information, similar to Java's own serialization mechanism), the mismatch is reported as a class conversion exception. If JSON serialization is used instead (it carries only the object's attribute information), no exception is reported during deserialization. However, because different classes have different attributes, the deserialized object's attributes will be empty or hold the wrong values, which causes problems later when the object is used. That variant is harder to track down precisely because no exception is reported.
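The difference between the two cases can be illustrated with the JDK alone. The sketch below uses Java's built-in serialization as a stand-in for Hessian (the text describes them as similar in carrying type information) and a plain attribute map as a stand-in for a JSON payload. User, Order and TypeInfoDemo are hypothetical names, and no real Hessian or JSON library is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.Map;

class TypeInfoDemo {
    static class User implements Serializable { String name; }
    static class Order implements Serializable { String orderId = "o-1"; }

    // Java serialization embeds the class name in the payload.
    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(o);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Typed payload: a stale reply holding an Order fails loudly on the cast.
    static boolean typedPayloadFails() {
        byte[] staleReply = serialize(new Order());
        try {
            User ignored = (User) deserialize(staleReply);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Attribute-only payload: a missing attribute silently becomes null.
    static User fromAttributes(Map<String, String> attributes) {
        User user = new User();
        user.name = attributes.get("name");
        return user;
    }

    public static void main(String[] args) {
        System.out.println(typedPayloadFails());
        System.out.println(fromAttributes(Collections.singletonMap("orderId", "o-1")).name);
    }
}
```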

Since we are on the subject of redis connections: redis communicates over RESP (the redis serialization protocol), and the communication mode is stop-and-wait. That is, a connection is held exclusively for one exchange and cannot be released for use by other threads until the client has read the returned result. A question worth thinking about: can redis communication use a single connection plus a serial number (identifying each exchange), as Dubbo does? In theory it can, but because the RESP protocol has no "serial number" field, this cannot be done with the native communication method directly. However, we can send the serial number and get it back with the echo command, combined with the normal read or write command. To make the two execute atomically, they can be wrapped in a Lua script or a transaction. The transaction mode is as follows:

The result received by the client is then a list of the form ["unique serial number", "value1"]. The first item identifies which request the reply belongs to.
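To make the idea concrete, here is a minimal sketch of how such a request could be put on the wire and how the reply could be matched. It assumes the transaction is MULTI, ECHO with the serial number, GET with the key, then EXEC; SerialTaggedRequest and its methods are hypothetical names, and the code only builds and checks strings, it does not talk to a server.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class SerialTaggedRequest {
    // Encode one command in RESP: an array of bulk strings.
    static String encode(String... args) {
        StringBuilder sb = new StringBuilder();
        sb.append('*').append(args.length).append("\r\n");
        for (String arg : args) {
            int byteLength = arg.getBytes(StandardCharsets.UTF_8).length;
            sb.append('$').append(byteLength).append("\r\n");
            sb.append(arg).append("\r\n");
        }
        return sb.toString();
    }

    // Assumed command sequence: the ECHO carries the serial number and the
    // GET is the real request, both queued inside one MULTI/EXEC transaction.
    static String transaction(String serial, String key) {
        return encode("MULTI")
                + encode("ECHO", serial)
                + encode("GET", key)
                + encode("EXEC");
    }

    // The EXEC reply is expected to be [serial, value]; check the first item.
    static boolean matches(List<String> execReply, String expectedSerial) {
        return execReply.size() == 2 && expectedSerial.equals(execReply.get(0));
    }

    public static void main(String[] args) {
        System.out.print(transaction("42", "key1"));
        System.out.println(matches(Arrays.asList("42", "value1"), "42"));
    }
}
```

A real client would also have to read the +OK reply to MULTI and the +QUEUED reply to each queued command before the EXEC result arrives.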

Why didn't redis adopt a communication mode similar to Dubbo? Personally, I think there are the following points:
