Java – parsing a 20 GB input file into an ArrayList
I need to sort a 20 GB file of random numbers in ascending order, but I don't know which technology to use. I tried using an ArrayList in my Java program, but it ran out of memory. Increasing the heap size didn't help either; I guess 20 GB is just too big. Can anyone guide me? What should I do?
Solution
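You should use an external sort algorithm instead of trying to fit everything in memory. An external sort reads the input in chunks that do fit in memory, sorts each chunk and writes it to disk, and then merges the sorted chunks into the final output.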
http://en.wikipedia.org/wiki/External_sorting
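For illustration, here is a minimal sketch of an external merge sort in Java. It is not from the linked article; it assumes the input holds one integer per line and that every value fits in a long (the question does not say how the numbers are stored). The class name ExternalSort and the chunkSize parameter are made up for this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper: assumes one integer per line, each fitting in a long.
public final class ExternalSort {

    /** One open run file plus the number most recently read from it. */
    private static final class Run implements Comparable<Run> {
        final BufferedReader reader;
        long current;

        Run(BufferedReader reader) { this.reader = reader; }

        /** Loads the next number; returns false when the run is exhausted. */
        boolean advance() throws IOException {
            String line = reader.readLine();
            if (line == null) return false;
            current = Long.parseLong(line);
            return true;
        }

        @Override
        public int compareTo(Run other) { return Long.compare(current, other.current); }
    }

    /** Sorts input into output while holding at most chunkSize numbers in memory. */
    public static void sort(Path input, Path output, int chunkSize) throws IOException {
        List<Path> runs = new ArrayList<>();

        // Phase 1: read chunks, sort each one in memory, write it out as a sorted run.
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            long[] chunk = new long[chunkSize];
            int n = 0;
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) continue;
                chunk[n++] = Long.parseLong(line);
                if (n == chunkSize) { runs.add(writeRun(chunk, n)); n = 0; }
            }
            if (n > 0) runs.add(writeRun(chunk, n));
        }

        // Phase 2: k-way merge; the heap always yields the run with the smallest head.
        PriorityQueue<Run> heap = new PriorityQueue<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path p : runs) {
                Run r = new Run(Files.newBufferedReader(p, StandardCharsets.UTF_8));
                if (r.advance()) heap.add(r); else r.reader.close();
            }
            while (!heap.isEmpty()) {
                Run r = heap.poll();
                out.write(Long.toString(r.current));
                out.newLine();
                if (r.advance()) heap.add(r); else r.reader.close();
            }
        } finally {
            for (Run r : heap) r.reader.close();          // only non-empty after a failure
            for (Path p : runs) Files.deleteIfExists(p);  // clean up temporary runs
        }
    }

    /** Sorts the first n entries of chunk and writes them to a new temporary file. */
    private static Path writeRun(long[] chunk, int n) throws IOException {
        Arrays.sort(chunk, 0, n);
        Path run = Files.createTempFile("run-", ".txt");
        try (BufferedWriter w = Files.newBufferedWriter(run, StandardCharsets.UTF_8)) {
            for (int i = 0; i < n; i++) { w.write(Long.toString(chunk[i])); w.newLine(); }
        }
        return run;
    }
}
```

You would call it as ExternalSort.sort(inputPath, outputPath, chunkSize), picking chunkSize so that a long[] of that length fits comfortably in your heap. Note that the merge keeps every run file open at once, so a very small chunkSize on a very large input could run into the operating system's open-file limit.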
If you think it's too complicated, try the following:
1. Include the H2 database in your project.
2. Create a new on-disk database (it is created automatically on the first connection).
3. Create a simple table for storing the numbers.
4. Read the numbers one by one and insert them into the database (don't forget to commit about every 1000 numbers).
5. Select the numbers with an ORDER BY clause :)
6. Iterate over the JDBC ResultSet and write the results straight to the output file.
H2 is very simple and works well with Java. It can be embedded in your JAR (no installation or setup is required).
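The H2 steps above could look roughly like this. Treat it as a sketch under assumptions: the H2 JAR is on the classpath, the input holds one integer per line and each value fits in a long, and the class name H2Sort, the table name numbers, and the column name n are made up for this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: assumes one integer per line, each fitting in a long.
public final class H2Sort {
    private static final int BATCH = 1000; // commit interval suggested in the answer

    public static void sort(String jdbcUrl, Path input, Path output)
            throws IOException, SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            conn.setAutoCommit(false);

            // A single-column table is enough to hold the numbers.
            try (Statement st = conn.createStatement()) {
                st.execute("DROP TABLE IF EXISTS numbers");
                st.execute("CREATE TABLE numbers (n BIGINT NOT NULL)");
            }

            // Insert line by line, committing after every BATCH rows.
            try (BufferedReader in = Files.newBufferedReader(input);
                 PreparedStatement ins =
                         conn.prepareStatement("INSERT INTO numbers (n) VALUES (?)")) {
                int pending = 0;
                String line;
                while ((line = in.readLine()) != null) {
                    line = line.trim();
                    if (line.isEmpty()) continue;
                    ins.setLong(1, Long.parseLong(line));
                    ins.addBatch();
                    if (++pending == BATCH) {
                        ins.executeBatch();
                        conn.commit();
                        pending = 0;
                    }
                }
                if (pending > 0) ins.executeBatch();
                conn.commit();
            }

            // Let the database do the sorting and write each row as it is read.
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT n FROM numbers ORDER BY n");
                 BufferedWriter out = Files.newBufferedWriter(output)) {
                while (rs.next()) {
                    out.write(Long.toString(rs.getLong(1)));
                    out.newLine();
                }
            }
        }
    }
}
```

A call such as H2Sort.sort("jdbc:h2:./numbers", inputPath, outputPath) would create the database files in the current directory. I have not measured how this performs on 20 GB of input, so try it on a smaller sample first.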