Painstakingly compiled: 13 rules for Redis performance optimization, the most complete list yet
In this article, we will use the following methods to improve the running speed of Redis:
1. Shorten the storage length of key value pairs
Performance drops as key-value pairs get longer. For example, let's run a set of performance tests that write data. The results are as follows:
The data above shows that with the key unchanged, the larger the value, the slower the operation. This is because Redis uses different internal encodings for the same data type. Strings, for example, have three internal encodings: int (integer encoding), embstr (a compact string encoding that optimizes memory allocation) and raw (dynamic string encoding). The author of Redis uses different encodings to balance efficiency and space. However, the larger the data, the more complex the internal encoding used, and the more complex the internal encoding, the lower the storage performance.
That covers only write speed. Large key-value contents cause several other problems as well:
Therefore, while keeping the semantics complete, we should make stored key-value pairs as short as possible. If necessary, serialize and compress the data before storing it. Taking Java as an example, we can use protostuff or kryo for serialization and snappy for compression.
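As a rough illustration of compressing a value before it is stored, here is a minimal sketch. It is an assumption-laden stand-in: the JDK has no built-in snappy codec, so GZIP from java.util.zip is used in its place, and the class name ValueCompressor is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: shrinks a value before it is written to Redis and
// restores it after it is read back. GZIP stands in for snappy here.
final class ValueCompressor {

    // Encodes the string as UTF-8 and compresses the bytes with GZIP.
    static byte[] compress(String value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reverses compress(): decompresses the bytes and decodes them as UTF-8.
    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A repetitive sample value; real savings depend on the data.
        String value = "{\"name\":\"redis\",\"type\":\"cache\"}".repeat(50);
        byte[] packed = compress(value);
        System.out.println("raw bytes: " + value.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok: " + value.equals(decompress(packed)));
    }
}
```

The compressed byte array is what would be written to Redis in place of the original string. Compression costs CPU on the client, so it tends to pay off only for larger, repetitive values.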
2. Use the lazy free feature
Lazy free is a very useful feature introduced in Redis 4.0. It can be understood as lazy or delayed deletion: when a key is deleted, its memory is released asynchronously. The release is handled in a separate BIO (Background I/O) thread, which reduces how long deletion blocks the Redis main thread and effectively avoids the performance and availability problems caused by deleting big keys.
Lazy free covers four scenarios, all of which are disabled by default:
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
slave-lazy-flush no
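They have the following meanings:
lazyfree-lazy-eviction: whether to free memory asynchronously when keys are evicted because Redis has exceeded maxmemory;
lazyfree-lazy-expire: whether to free memory asynchronously when expired keys are deleted;
lazyfree-lazy-server-del: whether to free memory asynchronously for the implicit deletes that some commands perform on existing keys (for example, RENAME overwriting its destination key);
slave-lazy-flush: whether a replica flushes its old data asynchronously during a full resynchronization.
It is recommended to enable lazyfree-lazy-eviction, lazyfree-lazy-expire, lazyfree-lazy-server-del and similar options to improve the execution efficiency of the main thread.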
3. Set the expiration time of the key value
We should set a reasonable expiration time for keys based on the actual business scenario. Redis will then clear expired key-value pairs automatically, which saves memory and prevents keys from piling up and frequently triggering the memory eviction policy.
4. Disable long-running query commands
The time complexity of most Redis read and write commands ranges from O(1) to O(N). The official documentation describes the time complexity of each command at redis.io/commands, as follows:
O(1) commands can be used safely, while O(N) commands need care: N is not fixed, and the larger the data, the slower the query may be. Because Redis uses only one thread to execute queries, a long-running command blocks Redis and causes significant latency.
To avoid the impact of O(N) commands on Redis, we can start from the following aspects:
5. Use slowlog to optimize time-consuming commands
We can use the slowlog feature to find the most time-consuming Redis commands and optimize them to speed Redis up. Slow queries have two important configuration items: slowlog-log-slower-than, the execution-time threshold in microseconds above which a command is logged (10000 by default), and slowlog-max-len, the maximum number of entries kept in the slow log (128 by default).
We can configure these according to the actual business scenario. Entries are stored in the slow log in reverse order of insertion, so the newest entries come first. We can use slowlog get n to fetch the n most recent slow log entries, then find the business logic behind those slow queries and optimize it.
6. Use pipeline to operate on data in batches
Pipeline is a batching technique provided by the client. It processes multiple Redis commands at once, which improves the performance of the whole interaction.
We use Java code to compare the performance of pipelined and ordinary operations. The pipeline test code is as follows:
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class PipelineExample {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        // Record the start time
        long beginTime = System.currentTimeMillis();
        // Get a Pipeline object
        Pipeline pipe = jedis.pipelined();
        // Queue multiple Redis commands
        for (int i = 0; i < 100; i++) {
            pipe.set("key" + i, "val" + i);
            pipe.del("key" + i);
        }
        // Send the queued commands and wait for all replies
        pipe.sync();
        // Record the end time
        long endTime = System.currentTimeMillis();
        System.out.println("Elapsed: " + (endTime - beginTime) + " ms");
        jedis.close();
    }
}
The result of running the program above is as follows:
The code for ordinary (non-pipelined) operations is as follows:
import redis.clients.jedis.Jedis;

public class PipelineExample {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        // Record the start time
        long beginTime = System.currentTimeMillis();
        // Each command is sent on its own and waits for its reply
        for (int i = 0; i < 100; i++) {
            jedis.set("key" + i, "val" + i);
            jedis.del("key" + i);
        }
        // Record the end time
        long endTime = System.currentTimeMillis();
        System.out.println("Elapsed: " + (endTime - beginTime) + " ms");
        jedis.close();
    }
}
The result of running the program above is as follows:
The results show that the pipelined version took 297 milliseconds, while the ordinary commands took 17276 milliseconds. Pipelining was about 58 times faster than ordinary execution.
7. Avoid having a large amount of data expire at the same time
Redis deletes expired keys with a greedy strategy. It runs 10 expiration scans per second; this can be configured in redis.conf, where the default value is hz 10. In each scan, Redis randomly selects 20 keys from those that have an expiration time and deletes the ones that have expired. If more than 25% of the selected keys had expired, it repeats the process, as shown in the following figure:
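The sampling loop described above can be modeled roughly as follows. This is a simplified sketch for illustration, not the Redis source: keys are represented only by their expiry timestamps, the class and method names are hypothetical, and real Redis additionally bounds the time spent in each scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Simplified model of one expiration scan: sample 20 keys, delete the expired
// ones, and repeat while more than 25% of a sample had expired.
final class ExpireCycleSketch {
    static final int SAMPLE_SIZE = 20;
    static final double REPEAT_THRESHOLD = 0.25;

    // expiryTimes holds one expiry timestamp per key that has a TTL.
    // Returns how many keys this scan deleted.
    static int runCycle(List<Long> expiryTimes, long now, Random random) {
        int deleted = 0;
        boolean repeat = true;
        while (repeat && !expiryTimes.isEmpty()) {
            int expiredInSample = 0;
            for (int i = 0; i < SAMPLE_SIZE && !expiryTimes.isEmpty(); i++) {
                int index = random.nextInt(expiryTimes.size());
                if (expiryTimes.get(index) <= now) {
                    // Delete the expired key by swapping in the last element.
                    int last = expiryTimes.size() - 1;
                    expiryTimes.set(index, expiryTimes.get(last));
                    expiryTimes.remove(last);
                    expiredInSample++;
                }
            }
            deleted += expiredInSample;
            // Keep going only if the sample was "dense" with expired keys.
            repeat = expiredInSample > SAMPLE_SIZE * REPEAT_THRESHOLD;
        }
        return deleted;
    }

    public static void main(String[] args) {
        // 10000 keys, half of which have already expired at time 100.
        List<Long> expiryTimes = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            expiryTimes.add(i % 2 == 0 ? 50L : 500L);
        }
        int deleted = runCycle(expiryTimes, 100L, new Random());
        System.out.println("deleted in one scan: " + deleted);
        System.out.println("keys remaining: " + expiryTimes.size());
    }
}
```

When many keys share the same expiry time, nearly every sample exceeds the 25% threshold, so a single scan keeps looping. That is the stall the next paragraph describes.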
If a large number of cache entries expire at the same time in a large system, Redis keeps scanning the expiry dictionary and deleting keys over many rounds until the expired keys in it become sparse. Throughout this process, reads and writes on Redis stall noticeably. Another cause of the stall is that the memory manager has to reclaim memory pages frequently, which consumes a certain amount of CPU.
To avoid this, we need to prevent a large number of cache entries from expiring at the same time. A simple solution is to add a random number within a specified range to the expiration time.
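A minimal sketch of adding a random offset to the expiration time might look like this. The class name TtlJitter and the sample values (a one-hour base with up to five minutes of jitter) are assumptions chosen for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper that spreads expirations out so keys written together
// do not all expire in the same second.
final class TtlJitter {

    // Returns baseSeconds plus a random offset in [0, maxJitterSeconds].
    static long withJitter(long baseSeconds, long maxJitterSeconds) {
        if (maxJitterSeconds < 0) {
            throw new IllegalArgumentException("maxJitterSeconds must be >= 0");
        }
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        // Each key gets a TTL between 3600 and 3900 seconds.
        for (int i = 0; i < 5; i++) {
            System.out.println("key" + i + " ttl seconds: " + withJitter(3600, 300));
        }
    }
}
```

The returned value would be used as the expiration time, in seconds, when the key is written.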
8. Client usage optimization
Besides using pipelining wherever possible, we should also use a Redis connection pool wherever possible instead of creating and destroying Redis connections frequently. This reduces the number of network round trips and unnecessary commands.
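To show why pooling helps, here is a deliberately tiny, generic pool. It is a hypothetical sketch that uses a plain Object as a stand-in for a connection; a real application would normally rely on the pool shipped with its Redis client rather than on something like this.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical minimal pool: resources are created only when no idle one
// is available, and returned resources are reused by later borrowers.
final class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Hands out an idle resource, or creates a new one if none is idle.
    synchronized T borrow() {
        T resource = idle.pollFirst();
        if (resource == null) {
            resource = factory.get();
            created++;
        }
        return resource;
    }

    // Puts a resource back so the next borrow() can reuse it.
    synchronized void release(T resource) {
        idle.addFirst(resource);
    }

    synchronized int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        SimplePool<Object> pool = new SimplePool<>(Object::new);
        // 100 sequential borrow/release cycles share a single resource.
        for (int i = 0; i < 100; i++) {
            Object connection = pool.borrow();
            pool.release(connection);
        }
        System.out.println("resources created: " + pool.createdCount());
    }
}
```

With a real connection, each avoided creation also avoids the connection setup and teardown that would otherwise happen on every request.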
9. Limit Redis memory size
On a 64-bit operating system, Redis memory is unlimited by default, that is, the maxmemory configuration item is commented out. As a result, swap space is used when physical memory runs short. While the operating system is busy moving the memory pages used by Redis to swap, the Redis process is blocked and Redis responses are delayed, which hurts overall performance. We therefore need to cap Redis memory at a fixed size. When Redis reaches that size, the memory eviction policy is triggered. Redis 4.0 and later provide 8 memory eviction policies: noeviction, allkeys-lru, allkeys-random, volatile-lru, volatile-random, volatile-ttl, volatile-lfu and allkeys-lfu.
Two of them were added in Redis 4.0: volatile-lfu and allkeys-lfu.
Here allkeys-xxx means evicting from all keys, while volatile-xxx means evicting only from keys that have an expiration time set.
We can choose a policy based on the actual business scenario. The default policy does not evict any data and returns an error when new data is added.
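For intuition about what an LRU-style eviction policy does, the sketch below evicts the least recently used entry once a size limit is exceeded. It is only an analogy: the class name LruSketch is hypothetical, the limit here counts entries rather than bytes, and Redis itself approximates LRU by sampling keys instead of tracking exact access order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical exact-LRU map used to illustrate the idea behind LRU eviction.
final class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruSketch(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the
    // least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruSketch<String, String> cache = new LruSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "3"); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```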
10. Use physical machines instead of virtual machines
A Redis server running in a virtual machine shares a physical network interface with the host, and one physical machine may run several virtual machines. As a result, memory usage and network latency can be very poor. We can use the ./redis-cli --intrinsic-latency 100 command to check the latency. If Redis performance requirements are high, Redis servers should be deployed directly on physical machines wherever possible.
11. Check the data persistence strategy
Redis persistence copies in-memory data to disk so that it can be used for disaster recovery or data migration. Maintaining persistence, however, carries a significant performance overhead.
Since Redis 4.0, Redis has three persistence methods: RDB (snapshots), AOF (append-only file) and hybrid persistence (RDB combined with AOF).
RDB and AOF persistence each have advantages and disadvantages. RDB may lose the data written within a certain period of time, while AOF files are large and slow down Redis startup. To get the advantages of both, hybrid persistence was added in Redis 4.0. Therefore, when persistence is necessary, we should choose hybrid persistence.
To check whether hybrid persistence is enabled, use the config get aof-use-rdb-preamble command. The result is shown in the following figure:
Here yes means hybrid persistence is enabled and no means it is disabled. Redis 5.0 defaults to yes. For other versions of Redis, first check whether hybrid persistence is enabled. If it is disabled, it can be enabled in either of the following two ways:
① Enable via the command line
Use the command config set aof-use-rdb-preamble yes. The result is shown in the following figure:
The disadvantage of setting the option on the command line is that the setting is lost after the Redis service restarts.
② Enable by modifying the Redis configuration file
Find the redis.conf file in the Redis root path and change aof-use-rdb-preamble no to aof-use-rdb-preamble yes, as shown in the following figure:
After the change, the Redis server must be restarted for the configuration to take effect. In return, a setting made in the configuration file is not lost when the Redis service restarts.
Note that for businesses that do not need persistence, persistence can be turned off, which effectively improves the running speed of Redis and avoids intermittent stalls.
12. Disable THP feature
The Linux kernel added the Transparent Huge Pages (THP) feature in version 2.6.38. It supports allocating large 2MB memory pages and is enabled by default.
When THP is enabled, fork becomes slower. After fork, the unit of a copied memory page grows from 4KB to 2MB, which greatly increases the memory consumed by the parent process during a rewrite. The memory page copied for each write command is also 512 times larger, which slows down write operations and produces many slow queries for writes; even a simple incr command can show up in the slow log. Redis therefore recommends disabling this feature as follows:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
To keep the THP setting in effect after the machine restarts, append echo never > /sys/kernel/mm/transparent_hugepage/enabled to /etc/rc.local.
13. Use distributed architecture to increase read and write speed
The Redis distributed architecture offers three important approaches: master-slave synchronization, sentinel mode, and Redis Cluster.
With master-slave synchronization, we can execute writes on the master and hand reads over to the slaves. This lets us process more requests per unit of time and improves the overall running speed of Redis.
Sentinel mode is an upgrade of the master-slave feature: when the master node goes down, it automatically restores normal Redis service without manual intervention.
Redis Cluster was officially released in Redis 3.0. It balances the load across nodes by spreading the data over multiple nodes.
Redis Cluster partitions data with virtual hash slots. Every key is mapped by a hash function to an integer slot in the range 0 to 16383, using the formula slot = CRC16(key) & 16383. Each node maintains a subset of the slots and the key-value data mapped to them. In this way, Redis spreads the read and write load from one server over multiple servers, which greatly improves performance.
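The slot formula above can be sketched in a few lines. This is a minimal illustration, not the Redis implementation: it assumes the CRC16 variant is CRC-16/XMODEM (polynomial 0x1021, initial value 0), the class name ClusterSlot is hypothetical, and hash tags (the {...} part of a key) are ignored for simplicity.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that maps a key to one of the 16384 cluster hash slots.
final class ClusterSlot {

    // Bitwise CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    // slot = CRC16(key) & 16383, so the result is always in 0..16383.
    static int slot(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) & 16383;
    }

    public static void main(String[] args) {
        // "123456789" is the standard check input for this CRC (0x31C3).
        System.out.println("slot for 123456789: " + slot("123456789")); // prints 12739
        System.out.println("slot for user:1000: " + slot("user:1000"));
    }
}
```

Because the mask keeps only the low 14 bits of the checksum, keys spread over the slots regardless of which node currently owns each slot.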
We only need to use one of these three approaches. Redis Cluster should undoubtedly be the preferred option: it automatically spreads read and write load across more servers and has automatic disaster recovery capability.