JDK 8 String class knowledge summary

1、 Overview

Java's String class is probably the class we use most in day-to-day work, but most of the time we only concatenate strings or call its API. Today I decided to take a closer look at the String class.

To get to know a class, nothing is more direct than the official Javadoc:

According to the documentation, we focus on three questions about the String class:

1、 Immutability of String objects

1. Why is String immutable

The documentation mentions:

For this passage, we look at it together with the source code:

public final class String
    implements java.io.Serializable,Comparable<String>,CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0
}

We can see that a String is essentially a wrapper around a char array. The backing field value is declared private final, and the class exposes no method that modifies the contents of the array, so the value of a String object cannot change after it is created. The wrapper classes (Integer, Long and so on) are immutable in the same way.

Our common way of writing:

String s = "AAA";
s = "BBB";

In fact, two String objects are involved. The assignment only repoints the reference s from the "AAA" object to the "BBB" object; the "AAA" object itself is not modified.

Let's take another look at the familiar substring() method:

public String substring(int beginIndex,int endIndex) {
    ... ...
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
        : new String(value,beginIndex,subLen);
}

It can be seen that, unless the requested range covers the whole string, a new String object is returned at the end. Similarly, toLowerCase(), trim() and other methods that return strings return a new object whenever the result differs from the original.
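The behaviour described above can be checked with a small sketch. The class name ImmutabilityDemo and the helper isNewObject are made up for illustration; the substring result relies on the "return this" shortcut visible in the JDK 8 source quoted above.

```java
public class ImmutabilityDemo {
    // Hypothetical helper: true when 'after' is a different object than 'before'.
    static boolean isNewObject(String before, String after) {
        return before != after;
    }

    public static void main(String[] args) {
        String original = " Hello ";
        String trimmed = original.trim();
        String lowered = original.toLowerCase();
        String whole = original.substring(0, original.length());

        System.out.println(isNewObject(original, trimmed)); // true: whitespace was removed
        System.out.println(isNewObject(original, lowered)); // true: 'H' became 'h'
        System.out.println(isNewObject(original, whole));   // false: the full range returns this
        System.out.println("[" + original + "]");           // [ Hello ] (the original is unchanged)
    }
}
```

None of the calls changes the object that original refers to; each call either hands back the same object or a new one.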

2. Why String needs to be immutable

String is designed to be immutable for efficiency and security:

2、 String constant pool

1. Function

It is mentioned in the document:

The string constant pool is a special area that records string constants (for details, please refer to my article on JVM memory structure). In JDK 6 and earlier, the pool lived in the permanent generation, HotSpot's implementation of the method area; since JDK 7 it has been located in the heap. The "sharing" mentioned in the documentation relies on the string constant pool.

We know that String is an object whose value is immutable, so everyday use of strings would create new String objects very frequently. To improve performance and reduce memory overhead, when the JVM evaluates a literal such as String s = "AAA", it first checks whether an equal string already exists in the constant pool. If so, it returns the address of that instance directly; otherwise, it creates a String object in the pool and then returns its address.

For instance:

String s1 = "aaa";  
String s2 = "aaa"; 
System.out.print(s1 == s2);   // true

We know that when == compares objects, it compares whether their memory addresses are equal. When s1 is assigned, an "aaa" String object is created and put into the pool, and s1 points to that object. When s2 is assigned, the JVM finds the String object with the value "aaa" in the constant pool, so it skips creation and assigns the address of the same object to s2.

2. The pooling method (intern)

Here we should mention intern(), the method for manually adding a String object to the pool.

The Javadoc comment of this method is as follows:

For example:

String s1 = "aabb";
String s2 = new String("aabb");
System.out.println(s1 == s2); //false
System.out.println(s1 == s2.intern()); //true

First, s1 causes the "aabb" object A to be created and added to the string constant pool. Then s2 creates a new "aabb" object B, which lives in the heap outside the constant pool. s1 now points to A in the constant pool and s2 points to B outside it, so == returns false.

When we call intern() on s2, the string constant pool already contains object A with a value equal to "aabb", so the address of A is returned. s1 and the result of s2.intern() therefore both refer to A, and == returns true. Note that s2 itself still points to B.
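As a minimal sketch of the same idea, the following example builds the string at run time instead of using new String(...). The class name InternDemo and the helper pooledSame are illustrative only.

```java
public class InternDemo {
    // Hypothetical helper: true when both strings map to the same pooled instance.
    static boolean pooledSame(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String literal = "aabb"; // resolved from the string constant pool
        String built = new StringBuilder("aa").append("bb").toString(); // heap object

        System.out.println(literal == built);          // false: two different objects
        System.out.println(literal == built.intern()); // true: intern() returns the pooled object
        System.out.println(built == built.intern());   // false: built itself is not the pooled object
        System.out.println(pooledSame(literal, built)); // true
    }
}
```

intern() does not move or change the receiver; it only returns the reference to the pooled instance with the same content.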

3、 How String objects are created

From the above, we know that the creation of a String object is closely related to the string constant pool, and that there are two ways to create a new String object:

1. Create in literal form

When a String object is created from a literal, whether a new object is actually created depends on whether the string already exists in the string constant pool.

When we use code like String s = "a" to create a string constant, the JVM first checks whether the string "a" is in the constant pool:

2. Use the new keyword to create

When you use the new keyword to create a String object, a new instance is created regardless of whether an object with the same value exists in the string constant pool.

Look at the comments of the constructor called by new:

When we create a String object from a literal with the new keyword, as in new String("a"), the literal itself is still resolved against the string constant pool, just as in the literal form:

In other words, the literal form creates at most one object, whereas new may create one or two: one object in the heap only (when the literal is already pooled), or one in the heap plus one in the constant pool.

Let's take an example:

String s1 = "aabb";
String s2 = new String("aabb");
String s3 = "aa" + new String("bb");
String s4 = new String("aa") + new String("bb");
System.out.println(s1 == s2); //false
System.out.println(s1 == s3); //false
System.out.println(s1 == s4); //false

System.out.println(s2 == s3); //false
System.out.println(s2 == s4); //false

System.out.println(s3 == s4); //false

We can see that the four string objects are independent of each other.
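A small sketch can make the independence of the four objects concrete: they differ by identity, but they are equal by content and all map to the same pooled instance. The class name CreationDemo and the helper equalButDistinct are made up for this illustration.

```java
public class CreationDemo {
    // Hypothetical helper: same characters, but two separate objects.
    static boolean equalButDistinct(String a, String b) {
        return a != b && a.equals(b);
    }

    public static void main(String[] args) {
        String s1 = "aabb";
        String s2 = new String("aabb");
        String s3 = "aa" + new String("bb");
        String s4 = new String("aa") + new String("bb");

        System.out.println(equalButDistinct(s1, s2)); // true
        System.out.println(equalButDistinct(s1, s3)); // true
        System.out.println(equalButDistinct(s3, s4)); // true

        // After interning, every variant resolves to the pooled "aabb" object.
        System.out.println(s1 == s2.intern()); // true
        System.out.println(s1 == s3.intern()); // true
        System.out.println(s1 == s4.intern()); // true
    }
}
```

This is why content comparisons should use equals() rather than ==.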

In fact, the objects in memory after execution are as follows:

3. Summary

4、 String concatenation

As we know, strings are concatenated all the time, and concatenation relies on the StringBuilder class. In fact, the string-related classes include not only String but also StringBuilder and StringBuffer.

In short, StringBuilder and StringBuffer differ from String in that they are mutable character sequences. Each change is applied to the object itself, rather than creating a new object and repointing the reference as with String.
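The difference can be sketched in a few lines. The class name MutabilityDemo is illustrative only.

```java
public class MutabilityDemo {
    public static void main(String[] args) {
        // String: concat() leaves the receiver alone and returns another object.
        String s = "ab";
        String t = s.concat("cd");
        System.out.println(s); // ab
        System.out.println(t); // abcd

        // StringBuilder: append() changes the object itself and returns the same object.
        StringBuilder sb = new StringBuilder("ab");
        StringBuilder same = sb.append("cd");
        System.out.println(sb);         // abcd
        System.out.println(sb == same); // true
    }
}
```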

1. StringBuilder

Let's first look at how its Javadoc introduces it:

We know that the main function of this class is to dynamically extend (append()) and modify (insert()) a character sequence.

Let's compare String and StringBuilder:

// String
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {}

// StringBuilder
public final class StringBuilder extends AbstractStringBuilder
    implements java.io.Serializable, CharSequence {}

It is not difficult to see that the difference between the two is that String implements the Comparable interface, while StringBuilder extends the abstract class AbstractStringBuilder. The extensibility of the latter comes from AbstractStringBuilder.

AbstractStringBuilder, like String, uses a char array to store the characters, but this array is not declared final, so it can be replaced by a larger one.

The char array has an initial capacity, much like a collection container. When an append would exceed the capacity of the current array, the array is expanded: a larger array is allocated and the current contents are copied into it. The capacity is never reduced automatically; it shrinks only when trimToSize() is called explicitly.

Generally, the new array length is old array length * 2 + 2; if that is still smaller than the length required (current length + appended length), the required length is used instead. (This is a simplification. For the details, see the newCapacity() method in the AbstractStringBuilder source code.)
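The growth rule can be sketched as follows. The class name CapacityDemo and the helper grow are hypothetical; grow is a simplified restatement of the rule, not the JDK method, and it ignores the overflow handling in the real newCapacity().

```java
public class CapacityDemo {
    // Simplified growth rule (assumption: no overflow handling):
    // double the old capacity and add 2, but never return less than what is required.
    static int grow(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity * 2 + 2;
        return newCapacity < minCapacity ? minCapacity : newCapacity;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        System.out.println(sb.capacity()); // 16, the documented default capacity

        sb.append("12345678901234567");    // 17 characters, one more than the capacity
        System.out.println(sb.capacity()); // the grown capacity chosen by the JDK in use

        System.out.println(grow(16, 17));  // 34: doubling plus 2 is enough
        System.out.println(grow(16, 100)); // 100: doubling plus 2 is too small
    }
}
```

When the final size is known in advance, passing it to the StringBuilder(int capacity) constructor avoids the intermediate copies.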

2. Plus-sign concatenation and append concatenation

We usually concatenate strings directly with the plus sign. In fact, this still relies on the append() method of StringBuilder.

For instance:

String s = "";
for(int i = 0; i < 10; i++) {
    s += "a";
}

After compilation by a JDK 8 compiler, this code becomes roughly equivalent to:

String s = "";
for (int i = 0; i < 10; i++) {
    s = (new StringBuilder(String.valueOf(s))).append("a").toString();
}

We can see that each iteration creates a new StringBuilder object (and a new String through toString()), which is very inefficient. This is why many articles advise against using the plus sign to concatenate strings in a loop and recommend StringBuilder instead. If we write it ourselves, we can write:

StringBuilder s = new StringBuilder();
for (int i = 0; i < 10; i++) {
    s.append("a");
}

It is obviously more efficient than the compiler-generated version, because only one StringBuilder is created for the whole loop.
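The two styles can be put side by side in a small sketch. The class and method names are made up for illustration, and times is assumed to be non-negative. Both methods return the same text; they differ only in how many temporary objects are created along the way.

```java
public class ConcatLoopDemo {
    // Plus-sign style: every iteration produces a new intermediate String.
    static String repeatWithPlus(String piece, int times) {
        String s = "";
        for (int i = 0; i < times; i++) {
            s += piece;
        }
        return s;
    }

    // Builder style: one StringBuilder, sized up front, reused for every iteration.
    static String repeatWithBuilder(String piece, int times) {
        StringBuilder sb = new StringBuilder(piece.length() * times);
        for (int i = 0; i < times; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeatWithPlus("a", 10));    // aaaaaaaaaa
        System.out.println(repeatWithBuilder("a", 10)); // aaaaaaaaaa
    }
}
```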

Now that we understand how plus-sign concatenation works, we can see why a String produced with the plus sign compares false under ==:

String s1 = "abcd";
String s2 = "ab";
String s3 = "cd";
String s4 = s2 + s3;
String s5 = "ab" + s3;
System.out.println(s1 == s4); //false
System.out.println(s1 == s5); //false

Analyze the above process. Both s2 + s3 and "ab" + s3 actually use StringBuilder to create a new object outside the string constant pool, so the == comparison returns false.

It is worth mentioning that when a final String variable initialized with a literal (a compile-time constant) is concatenated with a literal, the compiler folds the whole expression into a single literal:

String s1 = "abcd";
final String s3 = "cd";
String s5 = "ab" + s3;
System.out.println(s1 == s5); //true
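The cases can be compared side by side in a short sketch. The class and method names are illustrative; the point is only which expressions the compiler can fold into the pooled literal "abcd".

```java
public class ConstantFoldingDemo {
    // Literal + literal: folded at compile time into "abcd".
    static boolean literalPlusLiteral() {
        return "abcd" == "ab" + "cd";
    }

    // Literal + final variable initialized with a literal: also folded.
    static boolean literalPlusFinal() {
        final String constant = "cd";
        return "abcd" == "ab" + constant;
    }

    // Literal + ordinary variable: concatenated at run time into a new object.
    static boolean literalPlusVariable() {
        String plain = "cd";
        return "abcd" == "ab" + plain;
    }

    public static void main(String[] args) {
        System.out.println(literalPlusLiteral());  // true
        System.out.println(literalPlusFinal());    // true
        System.out.println(literalPlusVariable()); // false
    }
}
```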

To sum up:

Let's also look at the Javadoc of StringBuffer, which is basically the same as that of StringBuilder, so we skip it:

As you can see, StringBuilder, which was added in JDK 5, and StringBuffer are "equivalent classes". The two offer basically the same functionality; the main difference is that StringBuffer is thread-safe.

Looking at the source code, we can see that StringBuffer achieves thread safety by adding the synchronized keyword to its member methods, such as append():

public synchronized StringBuffer append(Object obj) {
    toStringCache = null;
    super.append(String.valueOf(obj));
    return this;
}

In fact, almost all methods of StringBuffer are synchronized. This is why StringBuffer is generally slower than StringBuilder: every call has to acquire the object's lock.
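A minimal sketch of what the synchronization buys: several threads append to one shared StringBuffer and no append is lost. The class name BufferThreadSafetyDemo and the helper fillConcurrently are made up for illustration. Running the same loop against a shared StringBuilder gives no such guarantee; it may lose characters or throw an exception.

```java
public class BufferThreadSafetyDemo {
    // Hypothetical helper: 'threads' workers each append 'perThread' characters
    // to the same StringBuffer; returns the final length.
    static int fillConcurrently(StringBuffer target, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    target.append('a'); // synchronized, so appends do not interleave
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return target.length();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(new StringBuffer(), 4, 10_000)); // 40000
    }
}
```

In single-threaded code this protection is unnecessary, which is why StringBuilder is the usual choice there.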

The content of this article comes from the network collection of netizens. It is used as a learning reference. The copyright belongs to the original author.