In depth understanding of Java G1 garbage collector — to

Original address: http://blog.jobbole.com/109170/?utm_source=hao.jobbole.com&utm_medium=relatedArticle

This paper first briefly introduces the common methods of garbage collection, then analyzes the collection principle of G1 collector, its advantages over other garbage collectors, and finally gives some tuning practices.

1、 What is garbage collection

First of all, before understanding G1, we need to know clearly what garbage collection is? In short, garbage collection is to recycle objects that are no longer used in memory.

Basic steps of garbage collection

There are 2 steps for recycling:

1. Find objects that are no longer used in memory

So the question is, how to judge which objects are no longer used? We also have two methods:

Reference counting is that if an object is not pointed to by any reference, it can be regarded as garbage. The disadvantage of this method is that the existence of rings cannot be detected.

2. Root search algorithm

The basic idea of the root search algorithm is to use a series of objects named "GC roots" as the starting point to start the downward search from these nodes. The search path is called reference chain. When an object is not connected to GC roots by any reference chain, it is proved that the object is unavailable.

Now we know how to find garbage objects and how to clean them up?

2. Release the memory occupied by these objects

The common methods are replication or direct cleaning, but direct cleaning will have memory fragments, so the method of cleaning and recompression will occur.

In general, there are three types of recycling algorithms.

1. Mark - copy

It divides the available memory capacity into two equal sized blocks, using only one of them at a time. When this block is used up, the surviving objects are copied to another block, and then the used memory space is disposed at one time. It has the advantages of simple implementation, high efficiency and no memory fragments. The disadvantage is that it requires twice the memory to manage.

2. Marking - cleaning

The mark clearing algorithm is divided into two stages: mark and clear: first mark the objects to be recycled, and then uniformly clear the objects after marking. Its advantage is high efficiency, but its disadvantage is that it is easy to produce memory fragments.

3. Marking - sorting

The marking operation is consistent with the "marking cleaning" algorithm. The subsequent operation is not just to directly clean up objects, but to move all living objects to one end after cleaning up useless objects, and update the pointer referencing their objects. Because the object is moved, it is less efficient than mark clean, but it will not produce memory fragments.

Generation based assumption

Because the survival time of objects is long and short, for objects with long survival time, reducing the number of GC can avoid unnecessary overhead. In this way, we divide the memory into the new generation and the old generation. The new generation stores newly created objects with short survival time, and the old generation stores objects with long survival time. In this way, only the young generation is cleaned each time, and the old generation is cleaned only when necessary, which can greatly improve GC efficiency and save GC time.

History of Java garbage collector

The first stage is the serial collector

In jdk1 Before 3.1, Java virtual machines could only use serial collectors. The serial collector is a single threaded collector, but its meaning of "single thread" does not only mean that it will only use one CPU or one collection thread to complete garbage collection. More importantly, when it performs garbage collection, it must pause all other working threads until the collection is completed.

PS: how to turn on the serial collector

The second stage is the parallel collector

Parallel collector is also called throughput collector. Compared with serial collector, the main advantage of parallel collector is to use multithreading to complete garbage cleaning, which can make full use of the characteristics of multi-core and greatly reduce GC time.

PS: how to start parallel collector

The third stage is the CMS (concurrent) collector

CMS collector pauses all application threads during minor GC and performs garbage collection in a multithreaded manner. During full GC, the application thread is no longer suspended, but several background threads are used to regularly scan the older generation space to recover the objects that are no longer used in time.

PS: how to open CMS collector

The fourth stage, G1 (concurrent) collector

The G1 collector (or garbage priority collector) is originally designed to minimize the pause when handling large piles (greater than 4GB). Compared with CMS, the advantage is that the generation rate of memory fragments is greatly reduced.

PS: mode of opening G1 collector

2、 Understanding G1

The first paper (Appendix 1) of G1 was published in 2004 and was only available in jdk1.7u4 in 2012. Oracle officially plans to turn G1 into the default garbage collector in jdk9 to replace CMS. Why should Oracle strongly recommend G1 and what are the advantages of G1?

First, the design principle of G1 is simple and feasible performance tuning

Developers only need to declare the following parameters:

Where - XX: + useg1gc is to enable G1 garbage collector, the maximum memory of - xmx32g design heap memory is 32g, and - XX: maxgcpausemillis = 200. Set the maximum pause time of GC to 200ms. If we need to tune, we only need to modify the maximum pause time when the memory size is certain.

Secondly, G1 cancels the physical space division of the Cenozoic and the old.

In this way, we no longer need to set each generation in a separate space, and we don't have to worry about whether each generation has enough memory.

Instead, G1 algorithm divides the heap into several regions (region), which still belongs to the generational collector. However, some of these regions contain the new generation. The garbage collection of the new generation still copies the living objects to the old age or survivor space by pausing all application threads. The old generation is also divided into many regions. The G1 collector completes the cleaning by copying the objects from one region to another Make. This means that G1 completes heap compression (at least partial heap compression) during normal processing, so that there will be no CMS memory fragmentation problem.

In G1, there is also a special area called humongous area. If an object occupies more than 50% of the partition capacity, the G1 collector considers it a giant object. These giant objects will be directly allocated to the old generation by default, but if it is a short-term giant object, it will have a negative impact on the garbage collector. In order to solve this problem, G1 divides a humongous area, which is used to store giant objects. If an H-zone cannot hold a giant object, G1 will look for continuous h-zones to store it. In order to find a continuous H-zone, sometimes you have to start the full GC.

PS: in Java 8, the persistent generation is also moved to the ordinary heap memory space and changed to meta space.

Object allocation policy

Speaking of the allocation of large objects, we have to talk about the allocation strategy of objects. It is divided into three stages:

TLAB allocates buffers locally for threads. Its purpose is to allocate objects as quickly as possible. If objects are allocated in a shared space, we need to adopt some synchronization mechanisms to manage the free space pointers in these spaces. In Eden space, each thread has a fixed partition for allocating objects, that is, a TLAB. When allocating objects, there is no need for any synchronization between threads.

For objects that cannot be allocated in TLAB space, the JVM will try to allocate in Eden space. If Eden space cannot accommodate the object, space can only be allocated among older generations.

Finally, G1 provides two GC modes, young GC and mixed GC, both of which are stop the world (STW). Next, we will introduce these two modes respectively.

3、 G1 young GC

Young GC is mainly used to GC Eden area, which will be triggered when Eden space is exhausted. In this case, the data in Eden space is moved to survivor space. If the survivor space is insufficient, part of the data in Eden space will be directly promoted to the older generation space. The data in the survivor area is moved to the new survivor area, and some data are promoted to the old age space. Finally, the data in Eden space is empty, the GC stops working and the application thread continues to execute.

At this time, we need to consider a problem. If we only GC new generation objects, how can we find all root objects? Are all objects of the old age roots? It will take a lot of time to scan. Therefore, G1 introduces the concept of RSET. Its full name is remembered set. It is used to track object references pointing to a heap area.

In CMS, there is also the concept of RSET. In the old generation, there is an area to record references to the new generation. This is a point out. When performing young GC, when scanning the root, you only need to scan this area, not the whole elderly generation.

However, in G1, point out is not used because a partition is too small and the number of partitions is too large. If point out is used, it will cause a lot of scanning waste. Some partition references that do not need GC are also scanned. Therefore, point in is used in G1. Point in means which partitions refer to objects in the current partition. In this way, only scanning these objects as roots avoids invalid scanning. Since there are multiple Cenozoic generations, do we need to record references between Cenozoic generations? This is unnecessary because all the new generations will be scanned every time GC, so only the references from the old generation to the new generation need to be recorded.

It should be noted that if there are many referenced objects, the evaluator needs to process each reference, and the cost of the evaluator will be large. In order to solve the problem of evaluator cost, another concept is introduced in G1, Card table. A card table logically divides a partition into continuous areas of fixed size. Each area is called a card. Cards are often small, ranging from 128 to 512 bytes. Card tables are usually byte arrays, indexed by cards (i.e. array subscript) to identify the space address of each partition. By default, each card is not referenced. When an address space is referenced, the value of the array index corresponding to the address space is marked as "0", that is, it is marked as dirty and referenced. In addition, RSET also records the array subscript. Generally, this RSET is actually a hash table, and the key is not The starting address of the region, value is a collection, and the element in it is the index of the card table.

Young GC stage:

4、 G1 mix GC

Mix GC not only performs normal new generation garbage collection, but also recycles some old age partitions marked by background scanning threads.

Its GC steps are divided into 2 steps:

Before mix GC, global concurrent marking will be performed. What is the execution process of global concurrent marking?

In G1 GC, it mainly provides tag services for mixed GC, not a necessary part of a GC process. The execution process of global concurrent marking is divided into five steps:

Tri-color Marking Algorithm

When it comes to concurrent tagging, we have to understand the three color tagging algorithm of concurrent tagging. It is a useful method to describe the tracking collector, and the correctness of the collector can be deduced by using it. First, we divide objects into three types.

When GC starts scanning objects, follow the steps below to scan objects:

The root object is set to black and the child object is set to gray.

Continue traversing by gray, and set the objects that have scanned sub objects to black.

After traversing all reachable objects, all reachable objects turn black. Unreachable objects are white and need to be cleaned up.

This looks nice, but if the application is running during the marking process, the pointer to the object may change. In this case, we will encounter a problem: object loss

Let's look at the following case. When the garbage collector scans the following cases:

At this time, the application performs the following operations:

In this way, the state diagram of the object becomes as follows:

At this time, the garbage collector will mark and scan as follows:

Obviously, at this time, C is white, which is considered as garbage and needs to be cleaned up. Obviously, this is unreasonable. So how can we ensure that the GC marked objects are not lost when the application is running? There are 2 feasible ways:

This just corresponds to two different implementations of CMS and G1:

In CMS, incremental update is adopted. As long as it is found in the write barrier that the reference of a white object is assigned to the field of a black object, the white object will be grayed out, that is, recorded at the time of insertion.

In G1, stab (snapshot at the beginning) is used to record all objects when deleting. It has three steps:

1. A snapshot icon is generated at the beginning of marking to mark the surviving object

2. During concurrent marking, all changed objects are queued (all objects pointed to by old references are made non white in the write barrier)

3. There may be free garbage, which will be collected next time

In this way, G1 can now know which old partitions can recycle the most garbage. When the global concurrency flag is completed, mix GC starts at a certain time. These garbage collections are called "hybrid" because they not only perform normal new generation garbage collection, but also recycle some partitions marked by background scanning threads. The mixed garbage collection is shown in the figure below:

Hybrid GC also adopts the clean-up strategy of replication. When the GC is completed, it will free up space again.

So far, the hybrid GC has come to an end. In the next section, we'll move on to tuning practice.

5、 Tuning practice

Maxgcpausemillis tuning

The most basic parameters for using GC have been described earlier:

The first two parameters are easy to understand. How to configure the latter maxgcpausemillis parameter? This parameter literally means the maximum pause time allowed for GC. G1 try to ensure that the time of each GC pause is within the set maxgcpausemillis range. How does G1 achieve the maximum pause time? This involves another concept, CSET (collection set). It means a collection of areas collected in a garbage collector.

After understanding this, we can set the maximum pause time. First of all, there is a limit to the maximum pause time we can tolerate. We need to set it within this limit. But what is the value that should be set? We need to make a balance between throughput and maxgcpausemillis. If maxgcpausemillis is set too small, GC will occur frequently and throughput will decrease. If maxgcpausemillis is set too large, the application pause time will become longer. The default pause time of G1 is 200ms. We can start here and adjust the appropriate time.

Other tuning parameters

-XX:G1HeapRegionSize=n

Sets the size of the G1 area. The value is a power of 2, ranging from 1 MB to 32 MB. The goal is to divide about 2048 regions according to the minimum Java heap size.

-XX:ParallelGCThreads=n

Sets the value for the number of STW worker threads. Set the value of n to the number of logical processors. The value of n is the same as the number of logical processors, up to 8.

If there are more than eight logical processors, set the value of n to about 5 / 8 of the number of logical processors. This applies in most cases, except for larger SPARC systems, where the value of N can be about 5 / 16 of the number of logical processors.

-XX:ConcGCThreads=n

Sets the number of threads for the parallel tag. Set n to about 1 / 4 of the number of parallel GC threads.

-XX:InitiatingHeapOccupancyPercent=45

Sets the Java heap occupancy threshold that triggers the tag cycle. The default occupancy rate is 45% of the entire Java heap.

Avoid using the following parameters:

Avoid explicitly setting the younger generation size with the - XMN option or other related options such as - XX: newratio. Fixing the size of the younger generation will override the pause time target.

Trigger full GC

In some cases, G1 triggers full GC. At this time, G1 will degenerate to use the serial collector to clean up garbage. It only uses a single thread to complete GC work, and the GC pause time will reach the level of seconds. The whole application is in a suspended state and can't process any requests. Of course, our program doesn't want to see these. So what happens to full GC?

G1 starts the marking cycle, but the old age is filled before mix GC. At this time, G1 will give up the marking cycle. In this case, you need to increase the heap size or adjust the cycle (for example, increase the number of threads - XX: concgcthreads, etc.).

G1 does not have enough memory for survival objects or promotion objects during GC, which triggers full GC. You can see (to space exhausted) or (to space overflow) in the log. The ways to solve this problem are:

a. Increase the value of - XX: g1reservepercent option (and increase the total heap size accordingly), and increase the reserved memory for the "target space".

b. Start the marking cycle in advance by reducing - XX: initiatinghepoccupancypercent.

c. You can also increase the number of parallel marker threads by increasing the value of the - XX: concgcthreads option.

When the giant object cannot find a suitable space for allocation, it will start full GC to free up space. In this case, you should avoid allocating a large number of giant objects, increase memory or increase - XX: g1heapregionsize, so that the giant objects are no longer giant objects.

Due to the limited space, there are many tuning practices in G1, which will not be listed here one by one. You can explore them slowly in ordinary practice. Finally, I look forward to the official release of Java 9. Will the Java performance of using G1 as the garbage collector by default be improved?

Appendix:

(1),

The content of this article comes from the network collection of netizens. It is used as a learning reference. The copyright belongs to the original author.
THE END
分享
二维码
< <上一篇
下一篇>>