Java heap analysis and OQL: counting unique strings
I'm doing memory analysis on an existing Java application. Is there an equivalent of SQL's 'group by' in OQL, so that I can see the count of objects that have the same value but are different instances? Something like:
select count(*) from java.lang.String s group by s.toString()
I'd like to get a list of duplicated strings along with the number of duplicates. The purpose is to find the cases with large counts so that they can be optimized using String.intern().
Example:
"foo" 100
"bar" 99
"lazy fox" 50
etc.
Solution
The following is based on the answer by Peter Dolberg and can be used in the VisualVM OQL console:
var counts = {};
var alreadyReturned = {};

filter(
  sort(
    map(heap.objects("java.lang.String"), function (heapString) {
      // Keep a running tally per string value.
      var key = heapString.toString();
      if (!counts[key]) {
        counts[key] = 1;
      } else {
        counts[key] = counts[key] + 1;
      }
      return { string: key, count: counts[key] };
    }),
    // Order the entries so that the largest counts come first.
    'lhs.count < rhs.count'),
  function (countObject) {
    // Keep only the first (highest-count) entry for each string.
    if (!alreadyReturned[countObject.string]) {
      alreadyReturned[countObject.string] = true;
      return true;
    } else {
      return false;
    }
  }
);
It starts by calling map() over all String instances and, for each String, creating or updating an entry in the counts object. Each object returned by map() has a string field and a count field.
The resulting array contains one entry per String instance, and each entry's count is one larger than that of the previous entry for the same String. The result is then sorted on the count field and looks something like this:
{ count = 1028.0, string = *null* }
{ count = 1027.0, string = *null* }
{ count = 1026.0, string = *null* }
...
(The String "*null*" was the most common one in my test.)
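To see why the sorted output has that shape, the map and sort steps can be tried outside VisualVM on a plain array. This is only a sketch: the sample array is a made-up stand-in for heap.objects("java.lang.String"), and it uses the standard JavaScript Array.prototype.map and Array.prototype.sort with a numeric comparator rather than the OQL built-ins.

```javascript
// Hypothetical sample data standing in for the String instances on the heap.
var sample = ["foo", "bar", "foo", "lazy fox", "foo", "bar"];

// Step 1: emit one entry per instance, carrying the running count so far.
var counts = {};
var running = sample.map(function (s) {
  counts[s] = (counts[s] || 0) + 1;
  return { string: s, count: counts[s] };
});

// Step 2: sort descending by count, so each string's final total
// appears before its smaller running counts.
var sorted = running.slice().sort(function (lhs, rhs) {
  return rhs.count - lhs.count;
});

sorted.forEach(function (entry) {
  console.log(entry.count + " " + entry.string);
});
```

Every instance still has its own entry at this point, which is why a string with N instances shows up N times, with counts N, N-1, and so on down to 1.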
The last step is to filter the sorted result using a function that returns true only for the first occurrence of each String. It uses the alreadyReturned object to keep track of which Strings have already been included.
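For comparison, the same report can be built by tallying first and sorting afterwards, which avoids the deduplicating filter. This is a different technique from the script above, shown as a plain JavaScript sketch: countDuplicates is a hypothetical helper name, the input array is made-up sample data, and hooking it up to the heap inside VisualVM is left out.

```javascript
// Hypothetical helper: returns one { string, count } entry per value that
// occurs more than once, ordered from most to least frequent.
function countDuplicates(strings) {
  // A prototype-less object, so values like "constructor" are safe as keys.
  var counts = Object.create(null);
  strings.forEach(function (s) {
    counts[s] = (counts[s] || 0) + 1;
  });

  return Object.keys(counts)
    .map(function (s) { return { string: s, count: counts[s] }; })
    .filter(function (entry) { return entry.count > 1; })
    .sort(function (lhs, rhs) { return rhs.count - lhs.count; });
}

// Usage with made-up sample data.
var report = countDuplicates(["foo", "bar", "foo", "lazy fox", "foo", "bar"]);
report.forEach(function (entry) {
  console.log('"' + entry.string + '" ' + entry.count);
});
```

Because only distinct values are sorted here, the sort works on far fewer entries than one per String instance, which may matter on large heaps.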