Java garbage collector and memory issues
I have a very strange problem with a Java application.
Essentially it is a website built on Magnolia (a CMS). There are four instances running in the production environment. From time to time the CPU goes to 100% in a Java process.
So the first thing I did was take a thread dump and look at the offending threads, which I found odd:

    "GC task thread#0 (ParallelGC)" prio=10 tid=0x000000000ce37800 nid=0x7dcb runnable
    "GC task thread#1 (ParallelGC)" prio=10 tid=0x000000000ce39000 nid=0x7dcc runnable
OK, that is strange. I have never had a problem with the garbage collector like this, so the next thing we did was enable JMX and inspect the machine with jvisualvm: heap memory usage was really high (95%).
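(For reference, the same heap-usage figure that jvisualvm shows can also be read from inside the JVM through the standard MemoryMXBean. A minimal sketch, with the class name being mine:)

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.lang.management.MemoryUsage;

    public class HeapUsageCheck {
        public static void main(String[] args) {
            MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
            MemoryUsage heap = memoryBean.getHeapMemoryUsage();
            // getMax() may be -1 if no limit is defined, so fall back to the committed size.
            long max = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
            System.out.printf("Heap: %d MB used of %d MB (%.1f%%)%n",
                    heap.getUsed() >> 20, max >> 20, 100.0 * heap.getUsed() / max);
        }
    }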
Naive approach: increase the memory so the problem takes longer to appear. Result: the restarted server with increased memory (6 GB!) hit the problem 20 hours after the restart, while other servers with less memory (4 GB!) that had been running for 10 days took a few more days before the problem reappeared. I also tried taking the Apache access log from the server that failed and replaying the requests against a local server with JMeter in an attempt to reproduce the error... that did not work either.
Then I dug through more of the logs and found these errors:
    info.magnolia.module.data.importer.ImportException: Error while importing with handler [brightcoveplaylist]: GC overhead limit exceeded
        at info.magnolia.module.data.importer.ImportHandler.execute(ImportHandler.java:464)
        at info.magnolia.module.data.commands.ImportCommand.execute(ImportCommand.java:83)
        at info.magnolia.commands.MgnlCommand.executePooledOrSynchronized(MgnlCommand.java:174)
        at info.magnolia.commands.MgnlCommand.execute(MgnlCommand.java:161)
        at info.magnolia.module.scheduler.CommandJob.execute(CommandJob.java:91)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
    Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
Another example:
    Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOf(Arrays.java:2894)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:117)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:407)
        at java.lang.StringBuilder.append(StringBuilder.java:136)
        at java.lang.StackTraceElement.toString(StackTraceElement.java:175)
        at java.lang.String.valueOf(String.java:2838)
        at java.lang.StringBuilder.append(StringBuilder.java:132)
        at java.lang.Throwable.printStackTrace(Throwable.java:529)
        at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:60)
        at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
        at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
        at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:162)
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.log(Category.java:856)
        at org.slf4j.impl.Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:576)
        at info.magnolia.module.templatingkit.functions.STKTemplatingFunctions.getReferencedContent(STKTemplatingFunctions.java:417)
        at info.magnolia.module.templatingkit.templates.components.InternalLinkModel.getLinkNode(InternalLinkModel.java:90)
        at info.magnolia.module.templatingkit.templates.components.InternalLinkModel.getLink(InternalLinkModel.java:66)
        at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at freemarker.ext.beans.BeansWrapper.invokeMethod(BeansWrapper.java:866)
        at freemarker.ext.beans.BeanModel.invokeThroughDescriptor(BeanModel.java:277)
        at freemarker.ext.beans.BeanModel.get(BeanModel.java:184)
        at freemarker.core.Dot._getAsTemplateModel(Dot.java:76)
        at freemarker.core.Expression.getAsTemplateModel(Expression.java:89)
        at freemarker.core.BuiltIn$existsBI._getAsTemplateModel(BuiltIn.java:709)
        at freemarker.core.BuiltIn$existsBI.isTrue(BuiltIn.java:720)
        at freemarker.core.OrExpression.isTrue(OrExpression.java:68)
Then I found out that this kind of problem is caused by the garbage collector using a ton of CPU while not being able to free much memory.
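(That behaviour is also visible through the standard GarbageCollectorMXBeans, which report cumulative collection counts and times. A small sketch, assuming it runs inside the affected JVM:)

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcStats {
        public static void main(String[] args) {
            // One bean per collector; with ParallelGC these are typically
            // "PS Scavenge" (young generation) and "PS MarkSweep" (old generation).
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%s: %d collections, %d ms total%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
        }
    }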
OK, so this is a memory problem that shows up as a CPU problem; if the memory usage problem is solved, the CPU should be fine too. So I took a heap dump, but unfortunately it was too big to open (the file was 10 GB). Instead I ran the server locally, exercised it a little and took a heap dump there. After opening it, I found some interesting things:
There are a TON of instances of:

    AbstractReferenceMap$WeakRef        ==> takes 21.6% of the memory, 9 million instances
    AbstractReferenceMap$ReferenceEntry ==> takes 9.6% of the memory, 3 million instances
In addition, I found a map that is apparently used as a "cache" (horrible but true). The problem is that this map is not synchronized, yet it is shared (static) across threads. The problem may be not only concurrent writes but also the fact that, without synchronization, there is no guarantee that thread A will see the changes thread B made to the map. However, I cannot figure out how to link this suspicious map using the Eclipse Memory Analyzer, because it does not use AbstractReferenceMap, it is just a plain HashMap.
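To illustrate the kind of pattern I mean (the class and field names here are made up, not the actual code):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class TemplateCache {
        // The suspicious pattern: a static, unsynchronized HashMap shared by all threads.
        // Concurrent writes can corrupt its internal structure, and without synchronization
        // there is no guarantee that thread A ever sees the entries that thread B added.
        private static final Map<String, Object> CACHE = new HashMap<String, Object>();

        // A minimal thread-safe alternative. Note that it still grows without bound,
        // so it can still leak memory if entries are never evicted.
        private static final Map<String, Object> SAFE_CACHE = new ConcurrentHashMap<String, Object>();
    }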
Unfortunately, we do not use those classes (AbstractReferenceMap and friends) directly; obviously some of the code uses them indirectly, so I seem to have hit a dead end.
My problems are:
> I cannot reproduce the error
> I cannot figure out where the memory is leaking (if it is leaking at all)
Any ideas?
Solution
The "no-op" finalize() methods should definitely be removed, as they are likely to make any GC performance problems worse. But I suspect that you have other memory leak issues as well.
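For illustration (a hypothetical class, not the actual code), a "no-op" finalizer looks like this:

    public class ImportedItem {
        // This buys nothing, but it still makes every instance finalizable: the object
        // has to be queued for the finalizer thread and survives at least one extra GC
        // cycle before its memory can actually be reclaimed, which adds GC overhead.
        @Override
        protected void finalize() throws Throwable {
            super.finalize(); // does nothing useful
        }
    }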
Recommendations:
> First, get rid of the useless finalize() methods.
> If you have other finalize() methods, consider getting rid of them too (relying on finalizers is generally a bad idea...).
> Use a memory profiler to try to identify the objects that are being leaked and what is causing the leak. There are lots of SO questions... and other resources on finding memory leaks in Java code. For example:
> How to find a Java memory leak
> The Java SE 6 Troubleshooting Guide for HotSpot VM, Chapter 3.
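As an aside, if capturing a heap dump at the right moment from the command line is awkward, it can also be triggered from inside the application through the HotSpot-specific HotSpotDiagnosticMXBean (a sketch; the helper class name is mine):

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.io.IOException;
    import java.lang.management.ManagementFactory;

    public class HeapDumper {
        public static void dump(String outputFile) throws IOException {
            HotSpotDiagnosticMXBean diagnostic = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            // 'true' = dump only live (reachable) objects, which keeps the file smaller.
            diagnostic.dumpHeap(outputFile, true);
        }
    }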
Now to your particular symptoms.
First, the places where the OutOfMemoryErrors were thrown are probably irrelevant.
However, the fact that you have huge numbers of AbstractReferenceMap$WeakRef and AbstractReferenceMap$ReferenceEntry objects is a strong indication that something in your application, or in the libraries it uses, is doing a large amount of caching, and that this caching is implicated in the problem. (The AbstractReferenceMap class is part of the Apache Commons Collections library; it is the superclass of ReferenceMap and ReferenceIdentityMap.)
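For orientation, this is roughly how such a cache is usually constructed with Commons Collections 3.x (purely an illustration of that API, not the application's actual code):

    import java.util.Map;
    import org.apache.commons.collections.map.ReferenceMap;

    public class WeakValueCache {
        // Hard keys, weak values: an entry becomes collectable once nothing else holds
        // the value strongly. Each entry shows up in a heap dump as an
        // AbstractReferenceMap$ReferenceEntry wrapping an AbstractReferenceMap$WeakRef.
        @SuppressWarnings("unchecked")
        private final Map<String, Object> cache =
                new ReferenceMap(ReferenceMap.HARD, ReferenceMap.WEAK);
    }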
You need to track down which map object (or objects) the WeakRef and ReferenceEntry objects belong to, as well as the (target) objects they refer to. Then you need to figure out what is creating it/them, and find out why the entries are not being cleared in response to the high memory demand.
> Do you have strong references to the target objects elsewhere (which would stop the WeakRefs from being broken)? See the sketch after this list.
> Is/are the map(s) being used incorrectly so as to cause a leak? (Read the javadocs carefully...)
> Are the maps being used by multiple threads without external synchronization? That could result in corruption, which could potentially manifest as a massive storage leak.
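To make the first point concrete, here is a small, self-contained demonstration using the JDK's WeakHashMap (the same principle applies to the WeakRefs inside a ReferenceMap):

    import java.util.Map;
    import java.util.WeakHashMap;

    public class StrongReferenceDemo {
        public static void main(String[] args) throws InterruptedException {
            Map<Object, String> cache = new WeakHashMap<Object, String>();

            Object pinnedKey = new Object();        // strong reference held elsewhere
            cache.put(pinnedKey, "never collected");
            cache.put(new Object(), "collectable"); // only weakly reachable

            System.gc();                            // only a hint, not guaranteed
            Thread.sleep(100);

            // Typically prints 1: the entry whose key is still strongly referenced
            // survives; the other is cleared once its key is only weakly reachable.
            System.out.println("entries left: " + cache.size());
            System.out.println(pinnedKey);          // keeps pinnedKey reachable until here
        }
    }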
Unfortunately, these are only theories, and there could be other things going on. Indeed, it is conceivable that this is not a memory leak at all.
Finally, there is your observation that the problem is worse when the heap is bigger. To me, that is still consistent with a Reference / cache related issue:
> Reference objects are more work for the GC than regular references.
> Each time the GC needs to "break" a Reference, that creates more work, for example processing the Reference queues.
> Even when that happens, the resulting unreachable objects still cannot be collected until the next GC cycle at the earliest.
So I can see how a 6 GB heap full of References would significantly increase the percentage of time spent in the GC compared with a 4 GB heap, and that could cause the "GC overhead limit" mechanism to kick in earlier.
But I reckon that this is an incidental symptom rather than the root cause.
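For completeness, the "GC overhead limit" mechanism is the HotSpot safeguard that turns "almost all of the time is spent in GC while almost nothing is reclaimed" into an OutOfMemoryError. A toy program that will usually trip it when run with a small heap (for example -Xmx64m):

    import java.util.HashMap;
    import java.util.Map;

    public class GcOverheadDemo {
        public static void main(String[] args) {
            Map<Integer, String> retained = new HashMap<Integer, String>();
            int i = 0;
            // Every entry stays strongly reachable, so each GC cycle reclaims almost
            // nothing; as the heap fills up the JVM spends most of its time collecting
            // and eventually throws OutOfMemoryError ("GC overhead limit exceeded" or,
            // depending on timing, "Java heap space").
            while (true) {
                retained.put(i, "value-" + i);
                i++;
            }
        }
    }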