About Redis
Introduction to Redis
Redis is an open-source, in-memory data structure store that can be used as a database, a cache, and a message broker.
It supports many data structures: strings, hashes, lists, sets, sorted sets (zsets) with range queries, bitmaps, HyperLogLogs, and geospatial indexes with radius queries. The five most commonly used types are string, list, set, hash, and zset.
Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and it provides high availability through Redis Sentinel and Redis Cluster.
Redis also provides persistence options that let users save their data to disk. Depending on requirements, the dataset can be dumped to disk at regular intervals (a snapshot), or every write command can be appended to a command log (AOF, an append-only file). With AOF, each write command is copied to disk as it is executed. You can also turn persistence off entirely and use Redis purely as an efficient networked cache.
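The append-only idea can be illustrated with a small toy store. This is only a sketch of the principle, not Redis's actual AOF format or implementation: the class name TinyStore, the JSON-lines log layout, and the file name toy.aof are all assumptions made for the example.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store with an append-only command log (not Redis's real AOF format)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # rebuild in-memory state from the log, if one exists

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, command):
        op, key = command[0], command[1]
        if op == "SET":
            self.data[key] = command[2]
        elif op == "DEL":
            self.data.pop(key, None)

    def execute(self, *command):
        # Apply the write in memory, then append the same command to the log.
        self._apply(list(command))
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(list(command)) + "\n")
            log.flush()
            os.fsync(log.fileno())  # force the bytes to disk before returning


def demo():
    path = os.path.join(tempfile.mkdtemp(), "toy.aof")
    store = TinyStore(path)
    store.execute("SET", "name", "redis")
    store.execute("SET", "mode", "cache")
    store.execute("DEL", "mode")
    # Simulated restart: a fresh object rebuilds its state purely from the log.
    return TinyStore(path).data
```

Calling demo() writes three commands and then reloads them, so the recovered state depends only on the log. Syncing after every write is the safest and slowest choice; a real server may sync less often and trade durability for speed.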
Redis does not use tables, and it does not predefine or force any relationship between the different pieces of data stored in it.
Databases can be divided into disk-based and in-memory databases according to how they store data. Redis keeps its data in memory, so reads and writes are not limited by disk I/O speed, which is why it is very fast.
How fast is Redis
Redis is an in-memory key-value database written in C that uses a single-process, single-threaded model. According to official figures it can reach 100,000+ QPS (queries per second). That is no worse than memcached, an in-memory key-value database that uses a single process with multiple threads.
Why is Redis so fast
Redis works entirely in memory, handles requests on a single thread, and uses an I/O multiplexing model for network I/O. The first two points are easy to understand. Next, we briefly discuss the I/O multiplexing model:
I/O multiplexing model
The I/O multiplexing model uses select, poll, or epoll to monitor the I/O events of multiple streams at the same time. While every stream is idle, the current thread blocks. When one or more streams have I/O events, the thread wakes up and the program polls the streams (epoll returns only the streams that actually had events) and processes only the ready streams, in sequence, which avoids a lot of useless work.
Here, "multi" refers to multiple network connections, and "plexing" refers to reusing the same thread for all of them. I/O multiplexing lets a single thread handle many connection requests efficiently (minimizing the time spent on network I/O), and because Redis operates on data in memory very quickly, in-memory operations do not become the bottleneck. Together, these points give Redis its high throughput.
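The loop described above can be sketched with Python's standard selectors module, which picks epoll, kqueue, poll, or select depending on the platform. This is a minimal illustration, not how Redis itself is written: the function names serve_once and demo, the socket pairs standing in for client connections, and the made-up reply format (a "+" followed by the upper-cased request) are all assumptions for the example.

```python
import selectors
import socket


def serve_once(server_sides):
    """Answer one request per connection on a single thread using I/O multiplexing."""
    sel = selectors.DefaultSelector()  # best mechanism the OS offers
    for sock in server_sides:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)

    handled = []
    remaining = len(server_sides)
    while remaining:
        # Block until at least one registered socket is readable,
        # then handle only the sockets that are actually ready.
        for key, _events in sel.select():
            sock = key.fileobj
            request = sock.recv(1024)
            sock.sendall(b"+" + request.upper())
            handled.append(request)
            sel.unregister(sock)
            remaining -= 1
    sel.close()
    return handled


def demo():
    # Three connected socket pairs play the role of three client connections.
    pairs = [socket.socketpair() for _ in range(3)]
    for i, (client, _server) in enumerate(pairs):
        client.sendall(f"ping{i}".encode())
    handled = serve_once([server for _client, server in pairs])
    replies = [client.recv(1024) for client, _server in pairs]
    for client, server in pairs:
        client.close()
        server.close()
    return sorted(handled), replies
```

One thread serves all three connections, and it never calls recv on a socket that has nothing to read. That is the "useless operation" the multiplexing model avoids.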
Why is Redis single-threaded
First of all, note that the analysis above was setting the scene for why Redis is fast. According to the official FAQ, because Redis operates in memory, the CPU is usually not its bottleneck. The most likely bottleneck is the size of machine memory or the network bandwidth. Since a single thread is easy to implement and the CPU does not become the bottleneck, the single-threaded design is the logical choice (multithreading brings plenty of trouble of its own).
However, a single thread cannot take full advantage of a multi-core CPU. We can compensate by running multiple Redis instances on one machine.
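One consequence of running commands on a single thread is that they execute strictly one after another, so a slow command delays everything queued behind it, while a second instance can absorb the slow work. The sketch below simulates this with simple arithmetic. The helper name completion_times and the per-command costs are made-up numbers for illustration, not measurements.

```python
def completion_times(commands):
    """Run (name, cost_ms) commands one after another, as a single thread would.

    Returns {name: time in ms at which that command finishes}.
    """
    clock = 0
    finished = {}
    for name, cost_ms in commands:
        clock += cost_ms
        finished[name] = clock
    return finished


# Hypothetical workload: the costs are invented for the example.
workload = [("GET a", 1), ("SUNION big1 big2", 200), ("GET b", 1), ("SET c", 1)]

# One instance: everything after the slow command waits for it.
single = completion_times(workload)

# Two instances: the slow read is sent to a second instance instead.
first = completion_times([c for c in workload if not c[0].startswith("SUNION")])
second = completion_times([c for c in workload if c[0].startswith("SUNION")])
```

With one instance, "GET b" finishes at 202 ms because it waits behind the 200 ms command. With the slow command moved to the second instance, it finishes at 2 ms.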
Warning 1: the single thread we keep emphasizing here is only the one thread that handles network requests. A running Redis server certainly uses more than one thread or process, so take note. For example, persistence is carried out in a child process or a separate thread.
Warning 2: the last paragraph of the FAQ in the figure above states that Redis supports multithreading from version 4.0 onward. However, only some operations are multithreaded. Whether the single-threaded description in this article still holds for later versions is left for readers to verify.
Attention
1. We know that Redis uses a single-threaded I/O multiplexing model to provide a high-performance in-memory data service. This design avoids locks, but it also reduces concurrency whenever Redis runs a time-consuming command such as SUNION. Because there is a single thread, only one operation is in progress at any moment, so a slow command holds up both reads and writes. A single thread can use only one CPU core, so you can start multiple instances on the same multi-core server and arrange them as master-master or master-slave. Time-consuming read commands can then be run entirely on a slave.
2. "We can't let the operating system do the load balancing, because we know our own programs better. So we can assign CPU cores to them manually, without taking up too much CPU or crowding our key processes together with a bunch of other processes." CPU is an important factor. Because of its single-threaded model, Redis prefers a fast CPU with large caches over one with many cores.
On multi-core servers, Redis performance also depends on the NUMA configuration and on which processor the process is bound to. The most visible effect is that redis-benchmark results vary because the processes land on random CPU cores. To get accurate results, pin the processes to fixed cores with a tool such as taskset on Linux. The most effective arrangement is to put the client and the server on two different cores of the same CPU, so that both benefit from the shared L3 cache.
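taskset pins a process from the command line. The same kind of pinning can be sketched from inside a Python process with os.sched_setaffinity, which is available on Linux only. The helper name pin_current_process is hypothetical, and the example pins the calling Python process, not a Redis server.

```python
import os


def pin_current_process(cpu=None):
    """Pin the calling process to one CPU core and return the new affinity set.

    Returns None on platforms without os.sched_setaffinity (for example macOS or Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)  # cores this process may currently use
    if cpu is None:
        cpu = min(allowed)  # default: the lowest-numbered allowed core
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})  # pid 0 means "the calling process"
    return os.sched_getaffinity(0)
```

For a benchmark, the idea would be to pin the client and the server to two different cores of the same physical CPU. Which core numbers share an L3 cache depends on the machine's topology, so check that before choosing them.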
Extension
Here are some process models worth knowing. They may come in handy in an interview.
1. Single-process, multithreaded model: MySQL, memcached, Oracle (Windows version);
2. Multi-process model: Oracle (Linux version);
3. Nginx has two types of processes: the master process (the management process) and the worker process (the process that does the actual work). There are two startup modes:
Single-process startup: there is only one process in the system, and it plays the role of both master and worker.
Multi-process startup: the system has exactly one master process and at least one worker process.
The master process mainly performs global initialization and manages the workers; event processing is done in the workers.