Six practical approaches to rate limiting, with detailed code and test results

For convenience, last year I rented out my house in the northern suburbs and moved to the southern suburbs, close to where I work. It saves me a lot of time and money that I can spend on more meaningful things. At the very least, I am no longer bothered by traffic jams, which has made me noticeably happier.

Even so, life has its other troubles. Housing density in the southern suburbs is high, so parking is a headache. I rent an unreserved spot along the roadside, and whenever I come home from work there is never a space left, so I can only double-park beside other people's cars. The problem is that every morning I get woken up by a phone call asking me to move my car; you can imagine my mood.

Over the next few days, though, I gradually got smarter. When parking the night before, I would look for a car whose plate was restricted from driving the next day and double-park next to it, so I would not have to move my car in the morning. That was an unexpected "bonus" that traffic restriction brought me.

Vehicle restriction is a very common rate limiting strategy in daily life. Besides the benefit above, it also slightly improves our living environment. Moreover, the rapid growth of private cars has put a huge "burden" on our roads; without restrictions, every car might end up stuck in traffic. That is the real benefit rate limiting brings to our lives.

From life back to software: suppose a system can serve only 100,000 users. One day, because of a trending event, the number of visits surges to 500,000 in a short time. The direct result is that the system crashes and nobody can use it. Letting a portion of users keep using the system is clearly far better than letting no one use it at all, so this is when we need "rate limiting".

Rate limiting classification

There are many ways to implement rate limiting. Brother Lei has sorted them out here; they fall into the following categories:

Legitimacy verification (ordinary CAPTCHAs and IP blacklists) is the most conventional business-level limiting and will not be covered further in this article. We will focus on the implementation of the latter two schemes: container rate limiting and server-side rate limiting.

Container rate limiting

Tomcat rate limiting

In Tomcat 8.5, the maximum number of threads is configured in conf/server.xml, as follows:

<Connector port="8080" protocol="HTTP/1.1"
          connectionTimeout="20000"
          maxThreads="150"
          redirectPort="8443" />

maxThreads is the maximum number of worker threads in Tomcat. When the number of concurrent requests exceeds this value, the excess requests are queued for execution, which achieves the purpose of rate limiting.
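Tomcat's behavior here — run up to maxThreads requests, queue the overflow, and refuse what does not fit — can be sketched with a plain Java thread pool. The pool and queue sizes below are made-up illustration values, not Tomcat defaults:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TomcatStyleLimiting {
    public static void main(String[] args) throws InterruptedException {
        // 2 worker threads play the role of maxThreads, a queue of 2 plays acceptCount:
        // tasks beyond threads + queue are rejected outright.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),
                new ThreadPoolExecutor.AbortPolicy());

        int accepted = 0, rejected = 0;
        for (int i = 0; i < 6; i++) {
            try {
                // each "request" holds a worker for 200 ms
                pool.execute(() -> {
                    try { Thread.sleep(200); } catch (InterruptedException ignored) {}
                });
                accepted++;
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        System.out.println("accepted=" + accepted + " rejected=" + rejected);
        pool.shutdown();
        pool.awaitTermination(2, TimeUnit.SECONDS);
    }
}
```

With 6 near-simultaneous submissions, 2 run, 2 queue, and 2 are rejected, so the program prints `accepted=4 rejected=2`.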

Nginx rate limiting

Nginx provides two rate limiting methods: one controls the request rate, the other controls the number of concurrent connections.

Control rate

We use limit_req_zone to limit the number of requests per unit time, i.e., to limit the rate. An example configuration is as follows:

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server { 
    location / { 
        limit_req zone=mylimit;
    }
}

The configuration above limits each IP to 2r/s. Because Nginx tracks requests at millisecond granularity, a rate of 2r/s translates to one allowed request per IP every 500ms; a second request is only allowed from the 501st millisecond onward.

Using a single IP, we sent six requests within 10ms. The execution results are as follows:

The results match expectations: only one request succeeded, and the other five were rejected (a second request would only execute normally after 501ms).

Rate limiting upgrade

Although the rate control above is very precise, it is too strict for a real environment. In practice we usually want to cap the total number of requests per IP per unit of time, rather than enforce millisecond-level precision as above. The burst keyword enables this. An example configuration:

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server { 
    location / { 
        limit_req zone=mylimit burst=4;
    }
}

burst=4 means each IP is allowed up to 4 burst requests. If a single IP sends 6 requests within 10ms, the results are as follows:

From the results we can see that one request was processed immediately, four requests were queued in the burst queue, and the remaining one was rejected.

Control concurrency

The limit_conn_zone and limit_conn directives control concurrency. An example configuration:

limit_conn_zone $binary_remote_addr zone=perip:10m;
limit_conn_zone $server_name zone=perserver:10m;
server {
    ...
    limit_conn perip 10;
    limit_conn perserver 100;
}

Here limit_conn perip 10 means a single IP may hold at most 10 connections at the same time; limit_conn perserver 100 means the server handles at most 100 concurrent connections in total.

Server-side rate limiting

Server-side rate limiting must be implemented with a rate limiting algorithm. The algorithm is the "brain" of the limiter: it guides how the limiting scheme is implemented.

Some people freeze when they see the word "algorithm", thinking it must be profound. It is not: an algorithm is just a summary of the concrete steps for carrying out a task, and it is really not hard to understand. Don't be scared off by its appearance~

There are three common rate limiting algorithms: the time window algorithm, the leaky bucket algorithm, and the token bucket algorithm.

Let's look at them one by one.

1. Time window algorithm

The so-called sliding time window algorithm takes the current time as the end point and looks back over a fixed interval, say 60s, within which at most 100 requests are allowed. The algorithm first clears all request records older than 60s, then checks whether the number of requests remaining in the window exceeds the configured maximum of 100. If it does, the rejection policy is applied; otherwise the request is recorded and the caller is told it may proceed.
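As a warm-up, the logic just described can be sketched in-process with a plain Java deque (single-machine only; a Redis-based version that works across machines appears later in this article). The class and parameter names here are ours for illustration:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Minimal in-process sketch of the sliding time window check described above. */
public class SlidingWindowLimiter {
    private final long periodMillis; // window length
    private final int maxCount;      // max requests per window
    private final Deque<Long> hits = new ArrayDeque<>();

    public SlidingWindowLimiter(long periodMillis, int maxCount) {
        this.periodMillis = periodMillis;
        this.maxCount = maxCount;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // step 1: drop records that fell out of the window
        while (!hits.isEmpty() && hits.peekFirst() <= now - periodMillis) {
            hits.pollFirst();
        }
        // step 2: compare the count in the window with the maximum
        if (hits.size() >= maxCount) {
            return false; // over the limit, reject
        }
        hits.addLast(now); // step 3: record this request
        return true;
    }

    public static void main(String[] args) {
        // at most 3 requests per 1000 ms
        SlidingWindowLimiter limiter = new SlidingWindowLimiter(1000, 3);
        for (int i = 0; i < 5; i++) {
            System.out.println("request " + i + " allowed: " + limiter.tryAcquire());
        }
    }
}
```

Running this, the first three back-to-back requests are allowed and the remaining two are rejected.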

The sliding time window is shown in the following figure:

Each small grid represents 10s, and the span enclosed by the red dashed line is the interval being judged: for example, if 100 requests are allowed per 60s, the red dashed line spans 60s.

We can implement the time window algorithm with Redis's ordered set (ZSet). The ZSet key stores the rate limiting identifier, and the score stores the request timestamp. When each request arrives, we first clear the records outside the current time window, then count the entries in the window and compare the count against the maximum allowed. If the count is greater than or equal to the maximum, we return false to apply the limit; otherwise we allow the business logic to run and add a new record to the ZSet. The implementation code follows.

We use the Jedis client to operate Redis. First add a Jedis dependency in pom.xml, configured as follows:

<!-- https://mvnrepository.com/artifact/redis.clients/jedis -->
<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>3.3.0</version>
</dependency>

The specific Java implementation code is as follows:

import redis.clients.jedis.Jedis;

public class RedisLimit {
    // Redis client
    static Jedis jedis = new Jedis("127.0.0.1", 6379);

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 15; i++) {
            boolean res = isPeriodLimiting("java", 3, 10);
            if (res) {
                System.out.println("Request executed normally: " + i);
            } else {
                System.out.println("Request rate limited: " + i);
            }
        }
        // sleep 4s
        Thread.sleep(4000);
        // send another request after the time window has passed
        boolean res = isPeriodLimiting("java", 3, 10);
        if (res) {
            System.out.println("After sleeping, request executed normally");
        } else {
            System.out.println("After sleeping, request rate limited");
        }
    }

    /**
     * Rate limiting method (sliding time window algorithm)
     * @param key      rate limiting identifier
     * @param period   rate limiting time range (unit: seconds)
     * @param maxCount maximum number of allowed requests
     * @return true if the request may proceed
     */
    private static boolean isPeriodLimiting(String key, int period, int maxCount) {
        long nowTs = System.currentTimeMillis(); // current timestamp
        // remove records outside the window (e.g. with period=60, clear records older than 60s)
        jedis.zremrangeByScore(key, 0, nowTs - period * 1000L);
        long currCount = jedis.zcard(key); // current request count
        if (currCount >= maxCount) {
            // maximum request count exceeded: rate limit
            return false;
        }
        // below the maximum: proceed with business logic
        jedis.zadd(key, nowTs, "" + nowTs); // record this request
        return true;
    }
}

The results of the above procedures are:

This implementation has two disadvantages: first, it stores every request record in the ZSet, which consumes a lot of memory under heavy traffic; second, the check-then-insert sequence is not atomic, so concurrent requests can slip in between the count and the insert.

2. Leaky bucket algorithm

The leaky bucket algorithm is inspired by the funnel, as shown in the figure below:

One problem with the sliding time window algorithm is its lack of smoothing: for example, if only 10 requests are allowed per 60s and all 10 arrive in the first second, then every request in the remaining 59s must be rejected. The leaky bucket algorithm solves this problem.

The leaky bucket algorithm resembles a funnel in everyday life. No matter how much water pours in, that is, no matter how many requests arrive, the water flows out slowly at a uniform rate. When the inflow exceeds the outflow, the funnel gradually fills up; once full, new requests are discarded. When the inflow is slower than the outflow, the funnel never fills and keeps draining.

To implement the leaky bucket algorithm, declare a queue to hold requests; this queue acts as the funnel. When the queue is full, new requests are discarded. Then a separate thread periodically takes one or more tasks from the queue and executes them, which realizes the leaky bucket.
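The steps above can be sketched in plain Java: a bounded queue plays the funnel, and a scheduled thread drains it at a fixed rate. The class and parameter names are ours for illustration:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Leaky bucket sketch: a bounded queue is the bucket, a timer is the leak. */
public class LeakyBucketLimiter {
    private final BlockingQueue<Runnable> bucket;
    private final ScheduledExecutorService drain =
            Executors.newSingleThreadScheduledExecutor();

    public LeakyBucketLimiter(int capacity, long leakIntervalMillis) {
        this.bucket = new ArrayBlockingQueue<>(capacity);
        // leak one task per interval, regardless of the inflow rate
        drain.scheduleAtFixedRate(() -> {
            Runnable task = bucket.poll();
            if (task != null) task.run();
        }, leakIntervalMillis, leakIntervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Returns false (request discarded) when the bucket is full. */
    public boolean offer(Runnable task) {
        return bucket.offer(task);
    }

    public void shutdown() {
        drain.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        // capacity 3, one task drained every 100 ms
        LeakyBucketLimiter limiter = new LeakyBucketLimiter(3, 100);
        int accepted = 0;
        for (int i = 0; i < 6; i++) {
            final int id = i;
            if (limiter.offer(() -> System.out.println("processed " + id))) {
                accepted++;
            }
        }
        System.out.println("accepted=" + accepted);
        Thread.sleep(500); // let the bucket drain
        limiter.shutdown();
    }
}
```

Six near-simultaneous requests arrive but only three fit in the bucket, and those three are then processed one per 100ms, so the outflow stays uniform no matter how bursty the inflow is.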

The Nginx rate control we demonstrated above actually uses the leaky bucket algorithm. Of course, we can also easily implement the leaky bucket algorithm with the help of Redis.

Redis 4.0 supports the Redis-Cell module, which uses the funnel algorithm and provides an atomic rate limiting instruction. And since Redis naturally lends itself to distributed deployment, this gives us distributed rate limiting almost for free.

Redis-Cell is also very simple to use: a single instruction, cl.throttle. Example:

> cl.throttle mylimit 15 30 60
1) (integer) 0  # 0 means allowed, 1 means rejected
2) (integer) 15 # funnel capacity
3) (integer) 14 # remaining funnel capacity
4) (integer) -1 # seconds to wait before retrying after a rejection; -1 means no retry needed
5) (integer) 2  # seconds until the funnel is completely empty

Here 15 is the capacity of the funnel, and 30/60s (30 operations per 60 seconds) is its drain rate.

3. Token bucket algorithm

In the token bucket algorithm, a program generates tokens at a constant rate and stores them in a token bucket. Each request must obtain a token before executing; if no token is available, the request can either wait or give up, as shown in the following figure:

We can use Google's open source Guava library to implement the token bucket algorithm easily. First add a Guava dependency in pom.xml, configured as follows:

<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>28.2-jre</version>
</dependency>

The specific implementation code is as follows:

import com.google.common.util.concurrent.RateLimiter;

import java.time.Instant;

/**
 * Rate limiting with Guava
 */
public class RateLimiterExample {
    public static void main(String[] args) {
        // generate 10 tokens per second (one every 100 ms)
        RateLimiter rt = RateLimiter.create(10);
        for (int i = 0; i < 11; i++) {
            new Thread(() -> {
                // acquire 1 token
                rt.acquire();
                System.out.println("Method executed normally, ts: " + Instant.now());
            }).start();
        }
    }
}

The results of the above procedures are:

From the results we can see that a token is indeed generated every 100ms, and that acquire() blocks until a token is obtained; it can take an int parameter to specify the number of tokens. Its alternative is tryAcquire(), which returns false immediately when no token is available instead of blocking. tryAcquire() can also take a timeout: it then blocks waiting for a token up to the maximum waiting time, and returns false if no token becomes available within that time.
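A minimal sketch of the non-blocking variants just mentioned, assuming the same Guava dependency as above:

```java
import com.google.common.util.concurrent.RateLimiter;

import java.util.concurrent.TimeUnit;

public class TryAcquireExample {
    public static void main(String[] args) {
        // 2 tokens per second, i.e. one every 500 ms
        RateLimiter limiter = RateLimiter.create(2.0);

        // the first call gets a token right away
        System.out.println("first: " + limiter.tryAcquire());
        // no token yet; returns false immediately instead of blocking
        System.out.println("second: " + limiter.tryAcquire());
        // wait up to 600 ms: the next token arrives within ~500 ms, so this succeeds
        System.out.println("third: " + limiter.tryAcquire(600, TimeUnit.MILLISECONDS));
    }
}
```

The timeout overload lets callers bound their worst-case latency: they wait a little for a token, but fail fast instead of queueing forever.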

The content of this article is collected from the Internet and is intended for learning and reference; the copyright belongs to the original author.