Java – what is the optimal initial capacity of a StringBuffer for input with a very variable length?

Good afternoon, everyone. I'm using a java.lang.StringBuilder to store some characters. I don't know in advance how many characters I am going to store, except that:

> 60% of the time, it is just (exactly) 7 characters
> 39% of the time, it is (about) 3500 characters
> 1% of the time, it is about 20K characters

How do we calculate the optimal initial buffer length that should be used?

I am currently using new java.lang.StringBuilder(4000), but only because I was too lazy to think about it before.

Solution

There are two factors: time and memory consumption. Time is dominated by the number of calls to java.lang.AbstractStringBuilder.expandCapacity(). Of course, the cost of each call is linear in the current size of the buffer, but I am simplifying here and just counting the calls:

Number of expandCapacity() calls (time)

Default configuration (16 character capacity)

> In 60% of cases, the StringBuilder will expand 0 times
> In 39% of cases, the StringBuilder will expand 8 times
> In 1% of cases, the StringBuilder will expand 11 times

The expected number of expandCapacity() calls is 3.23 per input.

The initial capacity is 4096 characters

> In 99% of cases, the StringBuilder will expand 0 times
> In 1% of cases, the StringBuilder will expand 3 times

The expected number of expandCapacity() calls is 0.03 per input.

As you can see, the second scenario seems to be much faster, because it rarely needs to expand the StringBuilder (three times per 100 inputs). Note, however, that the first few expansions matter less, since they copy only small amounts of memory. Also, if you append strings to the builder in large chunks, it will grow more eagerly and reach the required size in fewer iterations.
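The counts above can be reproduced with a small simulation. This is only a sketch: it assumes the buffer grows by the rule newCapacity = (capacity + 1) * 2 and that characters are appended in small pieces, so every expansion is triggered by that rule. The class and method names are made up for the example.

```java
// Sketch: count buffer expansions under the assumed growth rule
// newCapacity = (capacity + 1) * 2. Names are hypothetical.
class ExpansionCount {

    /** Number of times a buffer starting at initialCapacity must grow to hold finalLength chars. */
    static int expansions(int initialCapacity, int finalLength) {
        int capacity = initialCapacity;
        int count = 0;
        while (capacity < finalLength) {
            capacity = (capacity + 1) * 2; // assumed growth rule
            count++;
        }
        return count;
    }

    /** Expected expansions per input for the length distribution given in the question. */
    static double expected(int initialCapacity) {
        return 0.60 * expansions(initialCapacity, 7)
             + 0.39 * expansions(initialCapacity, 3_500)
             + 0.01 * expansions(initialCapacity, 20_000);
    }

    public static void main(String[] args) {
        System.out.printf("initial capacity 16:   %.2f expansions per input%n", expected(16));
        System.out.printf("initial capacity 4096: %.2f expansions per input%n", expected(4096));
    }
}
```

Changing the argument of expected() makes it easy to try other candidate capacities against the same distribution.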

On the other hand, memory consumption grows:

Memory consumption

Default configuration (16 character capacity)

> In 60% of cases, the StringBuilder will occupy 16 characters
> In 39% of cases, the StringBuilder will occupy 4K characters
> In 1% of cases, the StringBuilder will occupy 32K characters

The expected average memory consumption is 1935 characters

The initial capacity is 4096 characters

> In 99% of cases, the StringBuilder will occupy 4K characters
> In 1% of cases, the StringBuilder will occupy 32K characters

The expected average memory consumption is 4383 characters
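The two averages are plain weighted sums, which the following sketch spells out. It uses the rounded capacities from the lists above (16, 4K = 4096 and 32K = 32768 characters) rather than exact buffer sizes, and the class and method names are invented for illustration.

```java
// Sketch: expected final buffer size (in chars) as a weighted sum over the
// three input classes from the question. Names are hypothetical.
class ExpectedMemory {

    /** Weighted average of the final capacities for the 60% / 39% / 1% input classes. */
    static double expectedChars(int shortCase, int mediumCase, int longCase) {
        return 0.60 * shortCase + 0.39 * mediumCase + 0.01 * longCase;
    }

    public static void main(String[] args) {
        double defaultConfig = expectedChars(16, 4096, 32768);   // initial capacity 16
        double largeConfig = expectedChars(4096, 4096, 32768);   // initial capacity 4096
        System.out.println("default: " + Math.round(defaultConfig) + " chars");
        System.out.println("4096:    " + Math.round(largeConfig) + " chars");
        System.out.println("ratio:   " + largeConfig / defaultConfig);
    }
}
```

The ratio comes out a little above 2, which is where the "more than double" estimate comes from.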

TL;DR

This leads me to believe that enlarging the initial buffer to 4K will more than double the memory consumption, while cutting the number of buffer expansions (my simplified measure of time) by about two orders of magnitude.

The bottom line is: try it! It is not difficult to write a benchmark that processes millions of strings of different lengths with different initial capacities. But I believe that a larger buffer may be a good choice.
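A very rough benchmark along these lines might look like the sketch below. It is a naive System.nanoTime() loop, not a rigorous measurement: JIT warm-up and garbage collection can distort the numbers, so a harness such as JMH is the safer choice for real decisions. The class name, the iteration count, the seeds and the char-by-char append pattern are all assumptions made for the example.

```java
import java.util.Random;

// Sketch of a naive benchmark; names and parameters are hypothetical.
class CapacityBenchmark {

    /** Draws a length following the 60% / 39% / 1% distribution from the question. */
    static int randomLength(Random rnd) {
        int p = rnd.nextInt(100);
        if (p < 60) return 7;
        if (p < 99) return 3_500;
        return 20_000;
    }

    /** Builds `iterations` strings one char at a time and returns the elapsed nanoseconds. */
    static long run(int initialCapacity, int iterations, long seed) {
        Random rnd = new Random(seed);
        long checksum = 0; // consumed below so the loop is not trivially dead code
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            int length = randomLength(rnd);
            StringBuilder sb = new StringBuilder(initialCapacity);
            for (int j = 0; j < length; j++) {
                sb.append('x');
            }
            checksum += sb.length();
        }
        long elapsed = System.nanoTime() - start;
        if (checksum == 0) throw new IllegalStateException("nothing was built");
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 200_000;
        for (int capacity : new int[] {16, 4096}) {
            run(capacity, iterations / 10, 1L);          // crude warm-up pass
            long nanos = run(capacity, iterations, 42L); // same seed for both capacities
            System.out.printf("capacity %5d: %d ms%n", capacity, nanos / 1_000_000);
        }
    }
}
```

If your real code appends whole strings rather than single characters, replace the inner loop with that pattern, since chunked appends change how often the buffer grows.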
