Some basic concepts of the garbage collector (GC) in Java

1、 Basic collection algorithms

1. Reference counting: an old collection algorithm. Each new reference to an object increments its count, and each deleted reference decrements it. During garbage collection, only objects whose count is 0 are collected. The fatal flaw of this algorithm is that it cannot handle circular references.

2. Mark-sweep: this algorithm runs in two phases. In the first phase, all referenced objects are marked, starting from the root references. In the second phase, the whole heap is traversed and the unmarked objects are cleared. This algorithm has to pause the whole application, and it produces memory fragmentation.

3. Copying: this algorithm divides the memory space into two equal areas and uses only one of them at a time. During garbage collection, it traverses the area in use and copies the live objects to the other area. Because it processes only the live objects, the copying cost is relatively small. Memory can also be compacted as the objects are copied, so there is no fragmentation problem. The obvious drawback is that it needs twice the memory space.

4. Mark-compact: this algorithm combines the advantages of the mark-sweep and copying algorithms. It also runs in two phases. In the first phase, all referenced objects are marked, starting from the root references. In the second phase, the whole heap is traversed, the unmarked objects are cleared, and the live objects are compacted into one region of the heap and laid out in order. This avoids both the fragmentation problem of mark-sweep and the space problem of copying.

5. Incremental collection: a real-time garbage collection algorithm, that is, garbage collection runs while the application is running. I don't know why the collectors in JDK 5.0 do not use this algorithm.

6. Generational collection: a garbage collection algorithm derived from analysis of object life cycles. Objects are divided into the young generation, the old generation and the permanent generation, and objects with different life cycles are collected with different algorithms (each one of the algorithms above). Current garbage collectors (starting from J2SE 1.2) use this algorithm.
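The difference between reference counting and mark-sweep on circular references can be illustrated with a toy model. The following is only a sketch under stated assumptions: the `Node` class, the hand-built "heap" list and the `demo` method are hypothetical names invented for illustration, and this is not how a real JVM implements collection.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy mark-sweep over a hand-built object graph (illustration only). */
final class MarkSweepSketch {

    /** A hypothetical heap cell: a name, outgoing references and a mark bit. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;

        Node(String name) {
            this.name = name;
        }
    }

    /** Phase 1: mark every node reachable from the roots. */
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) {
                continue; // already visited, which is what makes cycles harmless
            }
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    /** Phase 2: walk the whole heap, drop unmarked nodes, reset marks on survivors. */
    static List<String> sweep(List<Node> heap) {
        List<String> collected = new ArrayList<>();
        heap.removeIf(n -> {
            if (n.marked) {
                n.marked = false;
                return false;
            }
            collected.add(n.name);
            return true;
        });
        return collected;
    }

    /** Builds root -> a plus an unreachable cycle b <-> c, then collects. */
    static List<String> demo() {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        root.refs.add(a);
        b.refs.add(c);
        c.refs.add(b); // b and c keep each other's reference count at 1 forever
        List<Node> heap = new ArrayList<>(Arrays.asList(root, a, b, c));
        mark(Arrays.asList(root));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("collected: " + demo());
    }
}
```

In this model the cycle `b <-> c` is collected because neither node is reachable from the root, whereas a pure reference-counting scheme would never see either count drop to 0.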

1) Young generation: divided into three areas, one Eden area and two survivor areas. Most objects are created in the Eden area. When the Eden area is full, the surviving objects are copied to one of the two survivor areas. When that survivor area is full, its live objects are copied to the other survivor area. When the second survivor area is also full, the objects that were copied from the first survivor area and are still alive are copied to the old generation. Note that the two survivor areas are symmetrical and have no fixed order, so the same survivor area may hold both objects copied from Eden and objects copied from the other survivor area, whereas the old generation receives only objects copied from the first survivor area. Moreover, one of the survivor areas is always empty.

2) Tenured (old generation): stores objects that have survived the young generation. Generally speaking, the old generation holds long-lived objects.

3) Perm (permanent generation): stores static data such as Java classes and methods. The permanent generation has no significant impact on garbage collection, but some applications, such as Hibernate, dynamically generate or load classes at run time. In that case a larger permanent generation is needed to hold these new classes. The permanent generation size is set with -XX:MaxPermSize=<n>.
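To see how a running JVM actually divides its memory, the standard `java.lang.management` API can list the memory pools. The sketch below is illustrative only: the class name `HeapPoolsSketch` is made up, and the pool names it prints depend on the JVM version and the selected collector (for example, newer JVMs may not report a permanent generation at all).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Lists the memory pools reported by the running JVM (names vary by JVM and collector). */
final class HeapPoolsSketch {

    /** Returns the names of all pools that the JVM classifies as heap memory. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            long usedKb = (usage == null) ? 0 : usage.getUsed() / 1024;
            System.out.printf("%-35s %-16s used=%d KB%n", pool.getName(), pool.getType(), usedKb);
        }
    }
}
```

Running it with different collector flags is a quick way to check which areas (Eden, survivor, old) the chosen collector exposes.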

2、 There are two types of GC: scavenge GC and full GC.

1) Scavenge GC: in general, a scavenge GC is triggered when a new object is created and Eden cannot satisfy the allocation. It collects the Eden area of the heap, clears the dead objects and moves the surviving objects to a survivor area. It then tidies up the two survivor areas.

2) Full GC: collects the whole heap, including the young, tenured and perm areas. A full GC is slower than a scavenge GC, so full GCs should be avoided as far as possible. A full GC may be caused by any of the following:

* The tenured area is full
* The perm area is full
* System.gc() is called explicitly
* The heap's per-area allocation policy changed dynamically after the previous GC
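How often each kind of collection has run can be observed from inside a program through the standard `GarbageCollectorMXBean` interface. This is a minimal sketch, assuming only the JDK's management API; the class name `GcCountSketch` is hypothetical, and the collector names printed (which typically distinguish a young-generation collector from an old-generation one) depend on the JVM in use. Note that `System.gc()` is only a request, so the counts are not guaranteed to change.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reports how many collections each registered collector has performed so far. */
final class GcCountSketch {

    /** Sums the collection counts of all collectors, skipping undefined (-1) values. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means the count is undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc(); // an explicit request, one of the full GC triggers listed above
        long after = totalCollections();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("collections before=" + before + ", after=" + after);
    }
}
```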

[Figures: demonstration of the generational garbage collection process, steps 1 to 4. The images are not available.]

2、 Garbage collector

At present, there are three kinds of collectors: serial collector, parallel collector and concurrent collector.

1. Serial collector

The serial collector uses a single thread for all garbage collection work. Because no coordination between threads is needed, it is relatively efficient. However, it cannot take advantage of multiple processors, so it is best suited to single-processor machines. It can also be used on multiprocessor machines when the data volume is small (about 100 MB). It is enabled with -XX:+UseSerialGC.

2. Parallel collector

1) The parallel collector collects the young generation in parallel, which reduces garbage collection time. It is generally used on multithreaded, multiprocessor machines and is enabled with -XX:+UseParallelGC. It was introduced in J2SE 5.0 update 6 and enhanced in Java SE 6.0 to allow parallel collection of the old generation as well. If the old generation is not collected in parallel, it is collected by a single thread, which limits scalability. Parallel old-generation collection is enabled with -XX:+UseParallelOldGC.

2) Use -XX:ParallelGCThreads=<n> to set the number of threads for parallel garbage collection. This value can be set equal to the number of processors on the machine.

3) This collector can be tuned as follows:

* Maximum garbage collection pause: the maximum pause time during garbage collection, specified with -XX:MaxGCPauseMillis=<n>, where <n> is in milliseconds. If this value is specified, the heap size and other GC-related parameters are adjusted to try to meet it. Setting this value may reduce application throughput.
* Throughput: expressed as the ratio of garbage collection time to non-garbage-collection time. It is set with -XX:GCTimeRatio=<n>, and the fraction of time spent in garbage collection is 1 / (1 + n). For example, -XX:GCTimeRatio=19 means that 5% of the time is used for garbage collection. The default is 99, that is, 1% of the time is used for garbage collection.
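The 1 / (1 + n) formula for -XX:GCTimeRatio can be checked with a few lines of arithmetic. The helper below is only a worked example; the class and method names are invented for illustration and are not part of any JVM API.

```java
/** Worked example of the GCTimeRatio formula: GC time fraction = 1 / (1 + n). */
final class GcTimeRatioSketch {

    /** Returns the target fraction of total time spent in GC for a given ratio n. */
    static double gcTimeFraction(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative: " + n);
        }
        return 1.0 / (1 + n);
    }

    public static void main(String[] args) {
        // 19 is the example value from the text, 99 is the default mentioned there.
        for (int n : new int[] {19, 99}) {
            System.out.println("GCTimeRatio=" + n + " -> GC share " + (gcTimeFraction(n) * 100) + "%");
        }
    }
}
```

With n = 19 the fraction is 1/20, and with n = 99 it is 1/100, matching the 5% and 1% figures above.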

3. Concurrent collector

The concurrent collector does most of its work concurrently (the application is not stopped), so garbage collection pauses the application only briefly. It is suited to medium and large applications with strict response-time requirements. It is enabled with -XX:+UseConcMarkSweepGC.

1) The concurrent collector mainly reduces pause times for the old generation. While the application keeps running, it uses a separate garbage collection thread to trace reachable objects. In each old-generation collection cycle, the concurrent collector briefly pauses the whole application at the start of the collection and pauses it again during the collection. The second pause is slightly longer than the first, and multiple threads perform garbage collection work during it.

2) The concurrent collector trades processor time for shorter pauses. On a system with N processors, the concurrent part of the collection uses K/N of the available processors, where generally 1 <= K <= N/4.

3) On a host with only one processor, running the concurrent collector in incremental mode can also achieve shorter pause times.

4) Floating garbage: because garbage collection runs while the application is running, new garbage may be produced before a collection finishes. This "floating garbage" can only be reclaimed in the next collection cycle, so the concurrent collector generally needs about 20% of the space reserved for it.

5) Concurrent mode failure: the concurrent collector collects while the application is running, so there must be enough heap space for the program to use during the collection. If the heap fills up before the collection completes, a "concurrent mode failure" occurs, and the whole application is suspended while garbage collection is carried out.

6) Starting the concurrent collector: because concurrent collection runs while the application is running, there must be enough memory for the program to use until the collection finishes, otherwise a "concurrent mode failure" occurs. Set -XX:CMSInitiatingOccupancyFraction=<n> to specify the old-generation occupancy percentage at which a concurrent collection starts.

4. Summary

* Serial collector:
  - Application: small data volumes (about 100 MB); single-processor applications with no response-time requirements.
  - Disadvantage: suitable only for small applications.
* Parallel collector:
  - Application: "high throughput requirements"; medium and large applications with multiple CPUs and no response-time requirements. Examples: background processing, scientific computing.
  - Disadvantage: application response times may be long.
* Concurrent collector:
  - Application: "high response-time requirements"; medium and large applications with multiple CPUs and strict response-time requirements. Examples: web servers / application servers, telecom switching, integrated development environments.

3、 GC basic principle

GC (garbage collection) is the automatic memory management mechanism of Java and .NET. Java evolved from C++. It discarded some of the cumbersome and error-prone features of C++ and introduced new mechanisms, one of which is GC (C# borrowed it from Java). Memory management is an area where programmers easily make mistakes: forgetting to release memory, or releasing it incorrectly, makes a program or system unstable or even causes it to crash. The GC provided by Java automatically detects when an object is no longer in use and reclaims its memory. The Java language provides no explicit operation for releasing allocated memory.

Memory management in Java is therefore really the management of objects, including their allocation and release. For the programmer, an object is allocated with the new keyword. To release an object, it is enough to set all references to it to null so that the program can no longer access it. Such an object is called "unreachable", and GC is responsible for reclaiming the memory of all unreachable objects. From the GC's side, as soon as the programmer creates an object, GC starts tracking its address, size and usage. GC usually records and manages all objects in the heap as a directed graph, which is how it determines which objects are reachable and which are not. When GC determines that some objects are unreachable, it is responsible for reclaiming their memory.

However, so that GC can be implemented on different platforms, the Java specification does not strictly define many aspects of its behavior. For example, it does not prescribe which collection algorithm to use or when to collect. Different JVM implementers therefore often choose different algorithms, which introduces a good deal of uncertainty for Java programmers. This article examines several questions related to how GC works and tries to reduce the negative impact of this uncertainty on Java programs.
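Reachability can be observed from ordinary Java code with `java.lang.ref.WeakReference`, which does not keep its referent alive. The following is a minimal sketch; the class and method names are hypothetical. Because `System.gc()` is only a request and the timing of collection is unspecified, the program prints what it observes rather than asserting that the unreachable object was reclaimed.

```java
import java.lang.ref.WeakReference;

/** Observes reachability through a weak reference (collection timing is not guaranteed). */
final class ReachabilitySketch {

    /** While a strong reference is still in use, the weak reference cannot be cleared. */
    static boolean stillReachable() {
        Object payload = new Object();
        WeakReference<Object> probe = new WeakReference<>(payload);
        System.gc(); // request a collection while payload is strongly reachable
        return probe.get() == payload; // payload is used here, so it stays reachable
    }

    public static void main(String[] args) {
        System.out.println("reachable while referenced: " + stillReachable());

        byte[] data = new byte[1024 * 1024];
        WeakReference<byte[]> probe = new WeakReference<>(data);
        data = null;   // drop the only strong reference: the array is now unreachable
        System.gc();   // only a hint; the JVM decides whether and when to collect
        System.out.println("cleared after GC request: " + (probe.get() == null));
    }
}
```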

4、 GC generations

In the JVM memory model, the heap is divided into two blocks: the young generation and the old generation.

1) The young generation contains a space called the Eden space, which mainly holds newly created objects, and two survivor spaces (from and to), which are always the same size. The survivor spaces hold the objects that survive each garbage collection.

2) The old generation mainly holds objects with long life cycles in the application.

3) In the young generation, garbage collection generally uses the copying algorithm, which is fast. During each GC, the surviving objects are first copied from Eden to a survivor space. When the survivor space is full, the remaining live objects are copied directly to the old generation. Eden is therefore empty after each GC.

4) In the old generation, garbage collection generally uses the mark-compact algorithm, which is slower but needs less memory.

5) Garbage collection is divided into levels. Level 0 is a full garbage collection, which also reclaims garbage in the old generation. Level 1 and above are partial garbage collections, which reclaim garbage only in the young generation. An out-of-memory error usually occurs when there is still no room for new Java objects after the old generation or the perm area has been collected.
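The copying behaviour of the young generation described above can be modelled in a few lines. This is a toy simulation, not the HotSpot implementation: the class `YoungGenSketch`, the use of strings as "objects", the liveness predicate and the rule that Eden objects are copied before older survivors are all modelling assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of Eden plus two survivor spaces with overflow into the old generation. */
final class YoungGenSketch {
    final int survivorCapacity;
    final List<String> eden = new ArrayList<>();
    List<String> from = new ArrayList<>();
    List<String> to = new ArrayList<>();
    final List<String> old = new ArrayList<>();

    YoungGenSketch(int survivorCapacity) {
        this.survivorCapacity = survivorCapacity;
    }

    /** New objects are always created in Eden. */
    void allocate(String obj) {
        eden.add(obj);
    }

    /** Copies live objects from Eden and 'from' into 'to'; overflow is promoted to old. */
    void minorGc(Predicate<String> isLive) {
        List<String> candidates = new ArrayList<>(eden);
        candidates.addAll(from); // assumption: Eden objects are copied first
        for (String obj : candidates) {
            if (!isLive.test(obj)) {
                continue; // dead objects are simply not copied
            }
            if (to.size() < survivorCapacity) {
                to.add(obj);
            } else {
                old.add(obj); // survivor space is full: promote to the old generation
            }
        }
        eden.clear(); // Eden is empty after every collection
        from.clear();
        List<String> tmp = from; // swap roles so one survivor space is always empty
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch(2);
        heap.allocate("a");
        heap.allocate("b");
        heap.allocate("c");
        heap.minorGc(obj -> !obj.equals("b")); // "b" has become garbage
        heap.allocate("d");
        heap.allocate("e");
        heap.minorGc(obj -> true);             // everything is still alive
        System.out.println("survivor=" + heap.from + " old=" + heap.old);
    }
}
```

In this model the second collection fills the survivor space with the newer objects, so the objects that had already survived once overflow into the old generation.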

5、 Incremental GC

In the JVM, GC is usually implemented by one thread or a group of threads. Like the user program, it occupies heap space and CPU time while it runs, and the application stops while the GC is running. When a GC runs for a long time, the user can therefore feel the Java program pause. On the other hand, if each GC run is too short, the reclamation rate may be too low: many objects that should have been reclaimed are not, and they continue to occupy a lot of memory. The design of a GC must therefore trade off pause time against reclamation rate. A good GC implementation lets users choose the settings they need. For example, devices with limited memory are very sensitive to memory use: they want the GC to reclaim memory precisely and do not care much about program speed. Real-time online games, by contrast, cannot tolerate long interruptions.

Incremental GC uses a suitable collection algorithm to divide one long interruption into many short ones, reducing the impact of GC on the user program. Although incremental GC may be less efficient overall than ordinary GC, it reduces the program's maximum pause time. The HotSpot JVM provided with the Sun JDK supports incremental GC, but does not use it by default. To enable it, the -Xincgc parameter must be added when running the Java program.

The incremental GC in the HotSpot JVM uses the train GC algorithm. Its basic idea is to group (layer) all objects in the heap according to when they were created and how they are used, to put frequently used and closely related objects in the same group, and to keep adjusting the groups as the program runs. When GC runs, it always reclaims the oldest (least recently accessed) objects first. If every object in a group is reclaimable, GC reclaims the whole group at once. In this way, each GC run reclaims only a certain proportion of the unreachable objects, which keeps the program running smoothly.
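The idea of splitting one long pause into many short ones can be sketched as a marking phase that does a bounded amount of work per step. This is not the train algorithm and not HotSpot code: the class `IncrementalMarkSketch`, the string-keyed graph and the per-step budget are assumptions made for illustration. The sketch also assumes that the object graph does not change between steps, whereas a real incremental collector needs extra bookkeeping (such as write barriers) because the application keeps mutating references.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Marks a reachability graph in bounded steps instead of one long pause. */
final class IncrementalMarkSketch {
    private final Map<String, List<String>> graph;
    private final Deque<String> pending = new ArrayDeque<>();
    private final Set<String> marked = new HashSet<>();

    IncrementalMarkSketch(Map<String, List<String>> graph, List<String> roots) {
        this.graph = graph;
        pending.addAll(roots);
    }

    /** Marks at most 'budget' new objects; returns true once marking is complete. */
    boolean step(int budget) {
        if (budget <= 0) {
            throw new IllegalArgumentException("budget must be positive");
        }
        int done = 0;
        while (done < budget && !pending.isEmpty()) {
            String obj = pending.pop();
            if (!marked.add(obj)) {
                continue; // already marked earlier
            }
            done++;
            pending.addAll(graph.getOrDefault(obj, Collections.<String>emptyList()));
        }
        return pending.isEmpty();
    }

    /** Runs short steps until marking finishes; returns the number of steps taken. */
    int runToCompletion(int budget) {
        int steps = 0;
        boolean finished = false;
        while (!finished) {
            finished = step(budget); // the application would run between steps
            steps++;
        }
        return steps;
    }

    Set<String> marked() {
        return marked;
    }

    /** Hypothetical graph: root -> a, b; a -> c; d is unreachable from root. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("root", Arrays.asList("a", "b"));
        g.put("a", Arrays.asList("c"));
        g.put("d", Arrays.asList("c"));
        return g;
    }

    public static void main(String[] args) {
        IncrementalMarkSketch marker = new IncrementalMarkSketch(sampleGraph(), Arrays.asList("root"));
        int steps = marker.runToCompletion(2);
        System.out.println("steps=" + steps + " marked=" + marker.marked());
    }
}
```

A smaller budget means shorter individual pauses but more of them, which is the pause-time versus efficiency trade-off described above.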
