Huge string table in Java
I have a question about storing a large number of strings in application memory. I need to load about 5 million lines from a file. Each line is a URL of at most 255 characters, though most are around 50. Occasionally I need to search for one of them. Is it possible to run this application with about 1 GB of RAM?
Will this work?
ArrayList<String> list = new ArrayList<String>();
As far as I know, a String in Java is encoded in UTF-8, which leads to huge memory use. Can an ANSI-encoded string be used to build such an array?
This is a console application that is started with these parameters:
java -Xmx1024M -Xms1024M -jar "PServer.jar" nogui
Solution
The latest JVMs support -XX:+UseCompressedStrings by default, which stores strings that contain only ASCII characters internally as a byte[].
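The same one-byte-per-character idea can also be applied by hand, without relying on a JVM flag. The following is only a minimal sketch, assuming every URL is pure ASCII; the class name AsciiUrlStore and its methods are made up for illustration and are not part of any library.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps each URL as an ASCII-encoded byte[]
// (one byte per character) and decodes it back to a String on demand.
// Assumption: all URLs are pure ASCII; any other character is stored as '?'.
class AsciiUrlStore {
    private final List<byte[]> urls = new ArrayList<>();

    void add(String url) {
        urls.add(url.getBytes(StandardCharsets.US_ASCII));
    }

    String get(int index) {
        return new String(urls.get(index), StandardCharsets.US_ASCII);
    }

    int size() {
        return urls.size();
    }
}
```

Each byte[] still has its own object header, so this mainly saves on the character data itself, and every get() allocates a new String.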
Holding a few gigabytes of text in a List is not a problem, but loading it from disk can take a while (many seconds).
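As a rough illustration of loading the file into a list and searching it afterwards, here is a sketch that sorts the list once and then uses binary search. The class name UrlTable, its methods, and the expectedLines parameter are assumptions made for this example. Sorting plus binary search is one possible choice among several; it avoids the extra per-entry memory a HashSet would need.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: loads one URL per line, sorts once, then answers
// lookups with binary search.
class UrlTable {
    private final List<String> urls;

    private UrlTable(List<String> sortedUrls) {
        this.urls = sortedUrls;
    }

    // expectedLines only pre-sizes the list so it is not regrown repeatedly.
    static UrlTable load(Path file, int expectedLines) throws IOException {
        ArrayList<String> list = new ArrayList<>(expectedLines);
        // ISO-8859-1 maps every byte to one character, so decoding never fails.
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                list.add(line);
            }
        }
        Collections.sort(list);
        return new UrlTable(list);
    }

    boolean contains(String url) {
        return Collections.binarySearch(urls, url) >= 0;
    }

    int size() {
        return urls.size();
    }
}
```

With about 5 million entries, each lookup needs roughly 23 string comparisons, so occasional searches should be cheap compared with the initial load.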
If the average URL is 50 characters, all of them ASCII, and each String has an overhead of 32 bytes, 5 million entries would use about 400 MB, which is not much for a modern PC or server.
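The estimate can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch using the figures from the answer (50 ASCII characters stored as one byte each, 32 bytes of overhead per String); the class and method names are made up, and the list's own array of references is ignored.

```java
// Back-of-the-envelope estimate, not a measurement of a real JVM.
class UrlMemoryEstimate {
    // Assumes one byte per character (ASCII stored compactly).
    static long estimateBytes(long entries, int avgChars, int overheadPerString) {
        return entries * (avgChars + overheadPerString);
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(5_000_000L, 50, 32);
        System.out.println(bytes / 1_000_000 + " MB"); // prints "410 MB"
    }
}
```

That is 5,000,000 × (50 + 32) = 410,000,000 bytes, in line with the "about 400 MB" figure and well within a 1 GB heap.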