How to improve the throughput of web applications

2020-12-12 • Java

What limits throughput?

When we stress test a traditional project, we usually find that the system's throughput is limited by the database (MySQL). The code looks fine and the logic is correct, but too many requests go straight to the database, which then performs a large number of I/O operations. That load can even push the load of the whole Linux host up sharply. And the throughput of our system? Ha ha, not much to look at.

Focus on cache

Since MySQL's capacity under load limits our system, we should cache data and minimize how often user requests reach the database directly. The system's throughput and the number of requests it can handle at the same time will rise as a result.

There are many caching technologies available. Two popular cache databases are Memcached and Redis.

Differences between Redis and Memcached

Stress testing Redis

# Test each Redis command in turn
# Each payload is 3 bytes
# 100 concurrent clients, 100,000 requests in total
redis-benchmark -h 127.0.0.1 -p 6379 -c 100 -n 100000

[root@139 ~]# redis-benchmark -h 127.0.0.1 -p 9997 -c 100 -n 100000
====== PING_INLINE ======
  100000 requests completed in 1.04 seconds
  100 parallel clients
  3 bytes payload
  keep alive: 1

98.68% <= 1 milliseconds // 98.68% of the requests completed within 1 millisecond
99.98% <= 2 milliseconds 
100.00% <= 2 milliseconds
96525.09 requests per second  // about 96,000 requests completed per second


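-d  sets the payload size in bytes; as the results below show, Redis still performs very well
-q  quiet mode, prints only the requests-per-second figures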
[root@139 ~]# redis-benchmark -h 127.0.0.1 -p 9997 -q -d 100 -c 100 -n 100000
PING_INLINE: 98619.32 requests per second
PING_BULK: 95877.28 requests per second
SET: 96153.85 requests per second
GET: 95147.48 requests per second
INCR: 95238.10 requests per second
LPUSH: 95328.88 requests per second
RPUSH: 95877.28 requests per second
LPOP: 95328.88 requests per second
RPOP: 97276.27 requests per second
SADD: 96339.12 requests per second
HSET: 98231.83 requests per second
SPOP: 94607.38 requests per second
LPUSH (needed to benchmark LRANGE): 92165.90 requests per second
LRANGE_100 (first 100 elements): 97181.73 requests per second
LRANGE_300 (first 300 elements): 96153.85 requests per second
LRANGE_500 (first 450 elements): 94428.70 requests per second
LRANGE_600 (first 600 elements): 95969.28 requests per second
MSET (10 keys): 98231.83 requests per second

Test only the specified commands
-t  takes a comma-separated list of commands
[root@139 ~]# redis-benchmark -p 9997 -t set,get -q -n 100000 -c 100 
SET: 97276.27 requests per second
GET: 98135.42 requests per second

The stress tests above show that Redis is very fast: every command tested ran at over 90,000 requests per second. That is not in the same order of magnitude as MySQL, so the conclusion is obvious. If we put a Redis layer between users and MySQL as a cache, system throughput will naturally increase.

Therefore, to make the system hold up better under pressure, we gradually shift the load from MySQL to Redis.

Page caching technology

Before talking about page caching, let's look at the life cycle of a request in a traditional project: the browser sends a request to the server, the server queries the database, and the result is passed to the template engine, which renders it into an HTML page.

To speed this up, we can use page caching. As the name suggests, it means storing rendered HTML pages in the cache database.

An example is described below:

First we try to fetch the rendered HTML source from the cache and return it to the client. The response format is controlled by @ResponseBody and the produces attribute, which tell the browser that HTML text is being returned.

Advantages: the pressure of user requests moves from MySQL to Redis, and this level of load is not a problem even for a single Redis instance.

Disadvantages: obviously, caching at page granularity will inevitably affect data consistency, which must be considered when using page caching.

Note 1: strictly control the cache time, and don't forget to set an expiration time.

Note 2: we used to let Thymeleaf render the data automatically. Now we are clearly rendering it manually.

    @RequestMapping(value = "/to_list", produces = "text/html;charset=UTF-8")
    @ResponseBody
    public String toLogin(Model model, User user, HttpServletResponse response, HttpServletRequest request) {

        // Try the Redis cache first
        String html = redisService.get(GoodsKey.goodsList, "", String.class);
        if (html != null)
            return html;

        // Cache miss: query the goods list
        List<GoodsVo> goodsList = goodsService.getGoodsList();
        model.addAttribute("goodsList", goodsList);

        // Render the data manually with the Thymeleaf template engine
        WebContext springWebContext = new WebContext(request, response, request.getServletContext(), request.getLocale(), model.asMap());
        String goods_list = thymeleafViewResolver.getTemplateEngine().process("goods_list", springWebContext);

        // Store the rendered page in Redis (remember to give the entry an expiration time)
        if (goods_list != null) {
            redisService.set(GoodsKey.goodsList, goods_list);
        }
        return goods_list;
    }
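The advice about expiration times can be made concrete with a small in-memory stand-in for the Redis page cache. This is only a sketch under stated assumptions: ExpiringPageCache, getOrRender, and the injectable clock are hypothetical names for illustration, not part of the project's redisService, and a real deployment would let Redis expire the key itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical in-memory stand-in for a page cache whose entries expire. */
class ExpiringPageCache {
    private static final class Entry {
        final String html;
        final long expiresAtMillis;
        Entry(String html, long expiresAtMillis) {
            this.html = html;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> store = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be exercised without sleeping

    ExpiringPageCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached page, or renders and caches it when missing or expired. */
    String getOrRender(String key, Supplier<String> renderer) {
        long now = clock.getAsLong();
        Entry entry = store.get(key);
        if (entry != null && now < entry.expiresAtMillis) {
            return entry.html; // still fresh: no rendering, no database access
        }
        String html = renderer.get(); // stale or missing: render again
        if (html != null) {
            store.put(key, new Entry(html, now + ttlMillis));
        }
        return html;
    }
}
```

With a short TTL the page can be stale for at most that long, which is the trade-off between throughput and consistency mentioned above.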

Since we've come this far, let's keep going and see what else we can do.

Look at the rendered HTML source we obtained above by driving the template engine's API manually. What is rendered HTML source? Put bluntly, where I originally wrote a placeholder such as th:text="${user}" in the front end, Thymeleaf has now replaced it with the actual value, for example Zhang San.

Once we have the rendered source, we can write it to a file in some directory through I/O. You may have noticed that when you browse a product page on JD.com or Taobao, the URL looks something like www.jjdd.com/aguydg/ahdioa/1235345.html

A suffix like 123145.html suggests that JD uses static page technology, which is very smart. With such a huge amount of product information, naming pages by number is convenient, and it's fast, isn't it?

How to achieve this effect?

As mentioned above, write the rendered source to a directory on Linux through I/O, using the last number in the URL above as the file name. Use nginx as a static resource server to serve these xxx.html files. When users visit again they get the static page and never touch the database. nginx also supports zero copy, and 50,000 concurrent connections are not a problem for it. In addition, it is better not to invent the numeric suffix; use the product ID directly. After all, clicking a product gives you its ID first, and then you enter the static page.
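Writing the rendered page to disk under the product ID might look like the sketch below. StaticPageWriter and its output directory are hypothetical names chosen for illustration; the directory would be whatever location nginx is configured to serve.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper that stores rendered pages as <goodsId>.html files. */
class StaticPageWriter {
    private final Path outputDir; // assumed to be a directory that nginx serves

    StaticPageWriter(Path outputDir) {
        this.outputDir = outputDir;
    }

    /** Writes the rendered HTML to outputDir/goodsId.html and returns the file path. */
    Path write(long goodsId, String renderedHtml) {
        try {
            Files.createDirectories(outputDir);
            Path target = outputDir.resolve(goodsId + ".html");
            // Write to a temporary file first and then move it into place,
            // so a reader is less likely to see a half-written page.
            Path tmp = Files.createTempFile(outputDir, "page-", ".tmp");
            Files.write(tmp, renderedHtml.getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling write again with the same ID replaces the old file, which is one simple way to refresh a page after the product data changes.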

Object caching technology

Object caching means caching Java objects, for example storing a user's information in Redis. Each time a user queries their own information, we look it up in Redis first and return it directly if it is there; only on a miss do we go to the database. This adds another cache layer between the user and the database and can also greatly improve the system's throughput.

How is this usually done?

For a user's request, try to get the object from Redis before querying the database. If it is not in Redis, query the database and then save the result in Redis.

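// Use Redis as a cache to reduce the number of database hits
public Label findById(Long labelId) {

    // Try to read the object from the cache first
    Label label = (Label) redistemplate.opsForValue().get("label_id" + labelId);

    if (label == null) {
        // Cache miss: load from the database, and fail clearly if the row does not exist
        // (the original called get() on an empty Optional here)
        label = labelRepository.findById(labelId)
                .orElseThrow(() -> new IllegalArgumentException("Label not found: " + labelId));

        // Store the result in the cache
        redistemplate.opsForValue().set("label_id" + label.getId(), label);
    }
    return label;
}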

When the user updates data, update the database first, then delete or update the corresponding cache entry in Redis.

public void update(Long labelId, Label label) {
    label.setId(labelId);
    Label save = labelRepository.save(label);

    // After the database update succeeds, delete the cache entry
    redistemplate.delete("label_id" + save.getId());
}

When a user deletes data, delete it from the database first, then delete the cache entry in Redis.

public void delete(Long labelId) {
    labelRepository.deleteById(labelId);

    // After the database delete succeeds, delete the cache entry
    redistemplate.delete("label_id" + labelId);
}

Imitating Vue to make pages static

Everyone talks about static pages. Are they really that magical? Not really. On a traditional web page, the data is rendered on the server by a template engine (such as JSP or Thymeleaf). On a static page, the data is rendered in the browser by JS, and the page sits in the same directory as the project's other static resources such as JS and CSS, with the same status as any ordinary static resource. Another advantage comes from the browser, which caches static resources. If you watch closely, you will notice that when you request the same page repeatedly it displays normally, but the status code is sometimes 304. The request still reaches the server, but the server tells the browser that the page has not changed, so the browser uses its local copy.
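The 304 exchange can be sketched as a server-side decision based on an ETag. This is a simplified illustration, not how any particular server implements it: ConditionalGet, etagFor, and statusFor are hypothetical names, and real servers also handle Last-Modified, weak validators, and If-None-Match headers that list several tags.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of the server side of a conditional GET. */
class ConditionalGet {

    /** Computes an ETag for the body: the SHA-256 hash as hex, wrapped in double quotes. */
    static String etagFor(String body) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(body.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder("\"");
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is available on every standard JVM
        }
    }

    /**
     * Returns 304 when the tag sent by the browser matches the current page,
     * which tells the browser to reuse its cached copy; otherwise returns 200.
     */
    static int statusFor(String ifNoneMatchHeader, String currentBody) {
        return etagFor(currentBody).equals(ifNoneMatchHeader) ? 304 : 200;
    }
}
```

On the first visit the browser has no tag and gets 200 with the full page; on later visits it sends the tag back, and as long as the page is unchanged the server can answer 304 with no body.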

At present, the most popular technologies for static pages in China are AngularJS and Vue.js, and Vue.js is really pleasant to use. I wrote notes about Vue a few months ago; interested readers can take a look at them.

Vue is not used in this post, but the idea behind making the page static is not far from Vue's. The static page is also implemented with JS code.

With the front end and back end separated, they naturally interact through JSON. The back end returns a JSON object to the front end via @ResponseBody. It is also a good idea to wrap all the data a page needs in a single VO object and return it in one go. That keeps the back end really simple: it just returns a JSON object.

The first step is to move the HTML file from the templates folder to the static folder, so that the HTML file sits alongside the JS / CSS files.

Why move the xxx.html file to a different directory? Because Spring Boot favors convention over configuration, and how a file is treated depends on the directory it sits in. In addition, the default configuration of Thymeleaf is as follows:

@ConfigurationProperties(prefix = "spring.thymeleaf")
public class ThymeleafProperties {

	private static final Charset DEFAULT_ENCODING = StandardCharsets.UTF_8;

	public static final String DEFAULT_PREFIX = "classpath:/templates/";

	public static final String DEFAULT_SUFFIX = ".html";

There's no way around it: by default, this configuration treats the xxx.html files under classpath:/templates/ as Thymeleaf templates.

The second step is to remove Thymeleaf-specific namespace declarations (such as xmlns:th on the html tag) from the HTML, since a static page does not need them.

<!DOCTYPE HTML>
<html xmlns:th="http://www.thymeleaf.org">
<head>
    <title>Product list</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <!-- jquery -->
    <!--<script type="text/javascript" th:src="@{/js/jquery.min.js}"></script>-->
    <script type="text/javascript" src="js/jquery.min.js"></script>

The third step is to write an Ajax call. As soon as the page loads, it sends a request to the back end to fetch the data, then manipulates the DOM nodes with jQuery to render it.

With these three steps done, which is not very complicated, our page has become a static page. From now on it will be cached by the browser, and as long as the page does not change the browser keeps using its own cache. The amount of data sent over the network drops a lot, and the system's response time (RT) improves noticeably.

Optimizing static resources

Let's go over the common static resource optimization techniques:

In addition, when multiple users modify the inventory concurrently, the stock can be driven to a negative number. The database engine in use is InnoDB, which has row-level locks, so we only need to modify our SQL and add the condition AND stock_number > 0 to the update statement.

To prevent the same user from sending two requests and ending up with more than one flash-sale item, we create a unique index on userid in the miaosha_user table. The same userid is then not allowed to appear twice, which rules out the situation above.
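The guarded update can be sketched as follows. The SQL string uses hypothetical table and column names (only stock_number comes from the text), and the StockGuard class is an in-memory analogy built on AtomicInteger, not database code: it shows why a decrement that is conditional on the current value cannot go below zero under concurrency.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory analogy of "decrement only while stock is positive". */
class StockGuard {
    // Assumed table and column names, for illustration only.
    static final String DEDUCT_SQL =
            "UPDATE miaosha_goods SET stock_number = stock_number - 1 "
          + "WHERE goods_id = ? AND stock_number > 0";

    private final AtomicInteger stock;

    StockGuard(int initialStock) {
        this.stock = new AtomicInteger(initialStock);
    }

    /** Returns true only if one unit was deducted, like an UPDATE that affected one row. */
    boolean deduct() {
        while (true) {
            int current = stock.get();
            if (current <= 0) {
                return false; // the "stock_number > 0" condition is not met: nothing changes
            }
            if (stock.compareAndSet(current, current - 1)) {
                return true; // the check and the decrement happened as one atomic step
            }
            // Another thread changed the value in between: re-read and retry.
        }
    }

    int remaining() {
        return stock.get();
    }
}
```

In the database, the row lock taken by the UPDATE plays the role that compareAndSet plays here: the application then checks the affected row count and treats zero as sold out.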

Verification code technology

Having users enter a verification code has several advantages. Besides verifying the user, the most obvious one is that it spreads out the load users put on the system. Without an image verification code on the front end, the system may have to absorb 10,000 concurrent requests within 1 second; with one, those 10,000 requests can be spread over 10 seconds or even longer.

There are plenty of ways to generate image verification codes, so I won't post the code here. Interested readers can search for it themselves; examples are everywhere.

Because this image is a static resource, it will be cached unless caching is disabled. If you want the verification code to change each time the image is clicked, refer to the JS implementation below, which adds a timestamp:

    function refreshImageCode() {
        // A changing timestamp parameter makes every URL unique, so the browser cannot reuse a cached image
        $("#verifyCodeImg").attr("src", "/path/verifyCode?goodsId=" + $("#goodsId").val() + "&timestamp=" + new Date().getTime());
    }

Interface rate limiting

For example, suppose we want to limit a single user to at most 30 calls per minute to method a in a controller. This is an interface rate limiting requirement, and it can effectively guard against malicious access.

In fact, this is not hard to do in combination with the cache. Taking the example above: to rate limit method a, we add the following logic to it.

The pseudocode is as follows:

public void a(User user){
    // Validate the user
    // Rate limiting
    Integer count = redis.get(user.getId());
    if(count!=null && count>=30)
        return; // Threshold reached: return immediately and refuse further access
    if(count==null){
        redis.set(user.getId(),60,1); // First access in this window: the count is 1 and the key expires after 60s
    }else{
        redis.incr(user.getId());
    }
}
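The same fixed-window idea can be tried out without Redis. The sketch below keeps the counters in a map instead of Redis keys with an expiration time; FixedWindowLimiter, tryAcquire, and the injectable clock are hypothetical names used for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical in-memory fixed-window rate limiter (one counter per user). */
class FixedWindowLimiter {
    private static final class Window {
        final long startMillis;
        final int count;
        Window(long startMillis, int count) {
            this.startMillis = startMillis;
            this.count = count;
        }
    }

    private final Map<String, Window> windows = new ConcurrentHashMap<>();
    private final int maxCount;
    private final long windowMillis;
    private final LongSupplier clock; // injectable so window expiry can be exercised without waiting

    FixedWindowLimiter(int maxCount, long windowMillis, LongSupplier clock) {
        this.maxCount = maxCount;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Returns true if this call is still within the user's limit for the current window. */
    boolean tryAcquire(String userId) {
        long now = clock.getAsLong();
        // compute() runs atomically per key, so concurrent calls cannot lose an increment.
        Window w = windows.compute(userId, (key, old) -> {
            if (old == null || now - old.startMillis >= windowMillis) {
                return new Window(now, 1); // first call, or the previous window has expired
            }
            if (old.count > maxCount) {
                return old; // already over the limit: no need to keep counting
            }
            return new Window(old.startMillis, old.count + 1);
        });
        return w.count <= maxCount;
    }
}
```

For the requirement above this would be created as new FixedWindowLimiter(30, 60_000L, System::currentTimeMillis). A fixed window can let through up to twice the limit around a window boundary, which is usually acceptable for blocking abusive clients.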

We can also move this logic into an interceptor. If we override the interceptor's preHandle() method, it is called back before the controller method executes, and it can work together with a custom annotation. A simple version looks like this.

Example:

@Component
public class AccessIntercepter extends HandlerInterceptorAdapter {
    // Intercept before the controller method executes
    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
        if (handler instanceof HandlerMethod) {
            HandlerMethod hd = (HandlerMethod) handler;
            LimitAccess methodAnnotation = hd.getMethodAnnotation(LimitAccess.class);
            if (methodAnnotation == null)
                return true;

            // Read the limits declared on the annotation
            int maxCount = methodAnnotation.maxCount();
            boolean needLogin = methodAnnotation.needLogin();
            int second = methodAnnotation.second();
            // TODO: enforce the limit using maxCount, second and needLogin
        }
        return true;
    }
}
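The interceptor relies on a custom LimitAccess annotation that is not shown. A minimal sketch of what it might look like is given below, together with a plain-reflection check that its values can be read at runtime. The attribute names come from the interceptor code; the default value, DemoController, and LimitAccessReader are assumptions made for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Assumed shape of the custom annotation read by the interceptor. */
@Retention(RetentionPolicy.RUNTIME) // must be retained at runtime, or reflection cannot see it
@Target(ElementType.METHOD)
@interface LimitAccess {
    int maxCount();                    // maximum number of calls allowed per window
    int second();                      // window length in seconds
    boolean needLogin() default true;  // assumed default, for illustration only
}

/** Hypothetical controller with one limited and one unlimited method. */
class DemoController {
    @LimitAccess(maxCount = 30, second = 60)
    public String a() {
        return "ok";
    }

    public String unlimited() {
        return "ok";
    }
}

class LimitAccessReader {
    /** Returns "maxCount/second/needLogin" for an annotated method, or null when it has no limit. */
    static String describe(Class<?> type, String methodName) {
        try {
            Method method = type.getDeclaredMethod(methodName);
            LimitAccess limit = method.getAnnotation(LimitAccess.class);
            return limit == null
                    ? null
                    : limit.maxCount() + "/" + limit.second() + "/" + limit.needLogin();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

The RUNTIME retention is the important part: without it, getMethodAnnotation in the interceptor would always return null and every request would pass through unchecked.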

Conclusion: it's exam week again. This Saturday and next Wednesday I have exams, including operations research. I hope I get through them safely.

The content of this article comes from the network collection of netizens. It is used as a learning reference. The copyright belongs to the original author.
