Java – slow initialization of a large array of small objects
I came across this case today, and I want to know the reason behind this huge difference.
The first version initializes a 5K x 5K raw int array:
public void initializeRaw() {
    int size = 5000;
    int[][] a = new int[size][size];
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            a[i][j] = -1;
}
On my machine this takes about 300 milliseconds. The second version, on the other hand, initializes the same array with a simple two-int structure:
public class Struct {
    public int x;
    public int y;
}

public void initializeStruct() {
    int size = 5000;
    Struct[][] a = new Struct[size][size];
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            a[i][j] = new Struct();
}
This takes more than 15,000 ms.
I expected it to be somewhat slower. After all, there is more memory to allocate (10 bytes instead of 4, if I'm not mistaken), but I don't understand why it takes 50 times as long.
Can anyone explain? Or perhaps there is simply a better way to do this kind of initialization in Java?
Edit: for comparison, the same code using Integer instead of int/Struct takes 700 ms, which is only about twice as slow.
Solution
When you create an array of 5000 ints, all the space needed for those ints is allocated at once, as a single contiguous block. When you then assign an int to each array element, nothing is allocated. Contrast this with an array of 5000 Struct instances: you traverse the array and allocate a new Struct instance for each of the 5000 elements (25 million separate allocations for the full 5000 × 5000 case). Allocating an object takes longer than simply writing an int value to a variable.
The fact that you have a two-dimensional array doesn't make much difference here, because it just means that you allocate 5000 array objects in both cases.
When you time an array of Integer objects and set each element to -1, no separate Integer object is allocated each time. Instead, you are using autoboxing, which means the compiler implicitly calls Integer.valueOf(-1), and that method returns the same object from the cache every time.
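A small sketch to make the caching visible. The class and method names (BoxingDemo, fillBoxed) are made up for illustration; the only library call used is Integer.valueOf, which is guaranteed to cache values from -128 to 127.

```java
public class BoxingDemo {
    // Fills an Integer[] with -1 through autoboxing. No Integer objects are
    // allocated in the loop, because -1 lies in the guaranteed cache range.
    static Integer[] fillBoxed(int size) {
        Integer[] a = new Integer[size];
        for (int i = 0; i < size; i++) {
            a[i] = -1; // compiled as a[i] = Integer.valueOf(-1)
        }
        return a;
    }

    public static void main(String[] args) {
        Integer[] a = fillBoxed(1000);
        // Reference comparison on purpose: both slots hold the same object.
        System.out.println(a[0] == a[999]);              // prints "true"
        System.out.println(a[0] == Integer.valueOf(-1)); // prints "true"
    }
}
```

So the Integer version only writes 25 million references to one shared object, which is why it lands much closer to the int version than to the Struct version.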
Update: coming back to your actual problem. If I understand correctly, you need 5000 × 5000 Struct instances held in a 2D array, and you are disappointed that creating this array takes so much longer than using primitives. To improve performance, you could create two primitive arrays, one for each field of Struct, but this will reduce the clarity of the code.
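A minimal sketch of the two-array idea, assuming the fields are only ever read and written by index. StructGrid and its accessors are hypothetical names, not something from your code; wrapping the arrays in a small class like this recovers some of the lost clarity.

```java
public class StructGrid {
    // One primitive array per field of the original Struct.
    private final int[][] x;
    private final int[][] y;

    public StructGrid(int size) {
        // Only 2 * size row arrays are allocated, no per-element objects.
        x = new int[size][size];
        y = new int[size][size];
    }

    public int getX(int i, int j) { return x[i][j]; }
    public int getY(int i, int j) { return y[i][j]; }

    public void set(int i, int j, int newX, int newY) {
        x[i][j] = newX;
        y[i][j] = newY;
    }

    public static void main(String[] args) {
        // A smaller size than 5000 keeps the demo light on memory.
        StructGrid grid = new StructGrid(1000);
        grid.set(10, 20, 3, -7);
        System.out.println(grid.getX(10, 20) + ", " + grid.getY(10, 20)); // prints "3, -7"
    }
}
```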
You could also create a single array of long (each long is twice the width of an int) and use the & and >> operators to extract the original ints. Again, this reduces code clarity, but you only have one array.
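A sketch of the packing, assuming x goes in the upper 32 bits and y in the lower 32 bits (the layout is my choice, either order works). PackedInts and its method names are hypothetical.

```java
public class PackedInts {
    private static final long LOW_MASK = 0xFFFFFFFFL;

    // Packs two ints into one long: x in the high half, y in the low half.
    // The mask stops sign extension of a negative y from overwriting x.
    static long pack(int x, int y) {
        return ((long) x << 32) | (y & LOW_MASK);
    }

    // Arithmetic shift brings the high half down; the cast keeps 32 bits.
    static int unpackX(long packed) {
        return (int) (packed >> 32);
    }

    // Masking isolates the low half; the cast restores the sign of y.
    static int unpackY(long packed) {
        return (int) (packed & LOW_MASK);
    }

    public static void main(String[] args) {
        long[][] a = new long[100][100];
        a[10][20] = pack(3, -7);
        System.out.println(unpackX(a[10][20]) + ", " + unpackY(a[10][20])); // prints "3, -7"
    }
}
```

Note that a freshly created long[][] is all zeros, which unpacks to x = 0 and y = 0, the same defaults a new Struct() would have.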
However, you seem to be focusing on a single part of the code, the creation of the array. You may find that the processing you perform on each element takes far more time than creating the array does. Profile the entire application to see whether array creation actually matters.