An analysis of the underlying principles of Java serialization and deserialization

1、 Basic concepts

1. What are serialization and deserialization

(1) Java serialization refers to the process of converting Java objects into byte sequences, while Java deserialization refers to the process of restoring byte sequences into Java objects;

(2) Serialization: the most important use of object serialization is to preserve the integrity and transferability of objects when they are passed around or saved. Serialization converts an object into an ordered byte stream so that it can be transmitted over the network or saved to a local file. The serialized byte stream stores the state of the Java object together with related description information. The core function of the serialization mechanism is the preservation and reconstruction of object state.

(3) Deserialization: after the client obtains the serialized byte stream from a file or the network, it reconstructs the object through deserialization, using the object state and description information saved in the byte stream.

(4) In essence, serialization writes the state of an object into an ordered byte stream according to a certain format, and deserialization reconstructs the object from that byte stream and restores its state.

2. Why serialization and deserialization are needed

We know that when two processes communicate remotely, they can send various types of data to each other, including text, pictures, audio, video, etc., and these data will be transmitted on the network in the form of binary sequence.

So when two Java processes communicate, can object transfer between processes be realized? The answer is yes! How? This requires Java serialization and deserialization!

In other words, on the one hand, the sender needs to convert the Java object into a byte sequence and then transmit it on the network; On the other hand, the receiver needs to recover the Java object from the byte sequence.

Once it is clear why Java serialization and deserialization are needed, the benefits follow naturally. The first is data persistence: through serialization, data can be saved permanently to disk (usually in a file). The second is remote communication: the byte sequence of an object can be transmitted over the network.

In general, it can be summarized as follows:

(1) Permanently save the object, and save the byte sequence of the object to the local file or database;

(2) The object is transmitted and received in the network in the form of byte stream through serialization;

(3) Passing objects between processes through serialization;

3. The serialization algorithm generally performs the following steps:

(1) Export the class metadata related to the object instance.

(2) Recursively output the superclass description of the class until there are no more superclasses.

(3) After the class metadata is completed, start outputting the actual data value of the object instance from the top-level superclass.

(4) Recursively output the instance data from top to bottom, from the top-level superclass down to the class of the instance itself.
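The top-down order in steps (3) and (4) can be observed with a small sketch. Parent, Child, and ORDER are hypothetical names introduced only for this illustration; each class records when its private writeObject hook is called during serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class WriteOrderDemo {
    // Records the order in which the writeObject hooks are invoked.
    static final List<String> ORDER = new ArrayList<>();

    // Hypothetical superclass.
    static class Parent implements Serializable {
        int parentValue = 1;

        private void writeObject(ObjectOutputStream out) throws IOException {
            ORDER.add("Parent");
            out.defaultWriteObject();
        }
    }

    // Hypothetical subclass; serializable because Parent is.
    static class Child extends Parent {
        int childValue = 2;

        private void writeObject(ObjectOutputStream out) throws IOException {
            ORDER.add("Child");
            out.defaultWriteObject();
        }
    }

    // Serializes a Child into memory and returns the recorded hook order.
    static List<String> run() throws IOException {
        ORDER.clear();
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(new Child());
        }
        return new ArrayList<>(ORDER);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run()); // prints [Parent, Child]
    }
}
```

The superclass data is written before the subclass data, which matches the order described in the steps above.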

2、 How does Java implement serialization and deserialization

1. Serialization and deserialization API in JDK class library

(1) java.io.ObjectOutputStream: represents the object output stream;

Its writeObject(Object obj) method serializes the obj object passed as the parameter and writes the resulting byte sequence to a target output stream;

(2) java.io.ObjectInputStream: represents the object input stream;

Its readObject() method reads a byte sequence from the source input stream, deserializes it into an object, and returns that object;

2. Requirements for serialization

Only objects of classes that implement the Serializable or Externalizable interface can be serialized; otherwise a NotSerializableException is thrown!
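A minimal sketch of this requirement follows. PlainUser is a hypothetical class that deliberately does not implement Serializable, so writing it to an ObjectOutputStream fails, while a String (which is serializable) succeeds.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

class NotSerializableDemo {
    // Hypothetical class that does NOT implement Serializable.
    static class PlainUser {
        String name = "xuliugen";
    }

    // Returns true if serializing obj fails with NotSerializableException.
    static boolean failsToSerialize(Object obj) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return false;
        } catch (NotSerializableException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(failsToSerialize(new PlainUser())); // prints true
        System.out.println(failsToSerialize("a String"));      // prints false
    }
}
```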

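3. Ways to implement Java object serialization and deserialization

Suppose there is a User class whose objects need to be serialized. There are three ways:

(1) If the User class only implements the Serializable interface, the default mechanism is used: ObjectOutputStream serializes the non-transient instance variables of the User object, and ObjectInputStream deserializes them in the default way.

(2) If the User class implements the Serializable interface and also defines private readObject(ObjectInputStream in) and writeObject(ObjectOutputStream out) methods, ObjectOutputStream calls the User object's writeObject(ObjectOutputStream out) method to serialize it, and ObjectInputStream calls its readObject(ObjectInputStream in) method to deserialize it.

(3) If the User class implements the Externalizable interface, it must implement the readExternal(ObjectInput in) and writeExternal(ObjectOutput out) methods. ObjectOutputStream calls writeExternal(ObjectOutput out) to serialize the object, and ObjectInputStream calls readExternal(ObjectInput in) to deserialize it.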
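Ways (2) and (3) can be sketched as follows. HookedUser and ExternalUser are hypothetical stand-ins for the User class, and reversing the password is only a toy transformation chosen for illustration, not encryption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CustomSerializationDemo {
    // Way (2): Serializable plus private writeObject/readObject hooks.
    public static class HookedUser implements Serializable {
        String userName;
        transient String password; // skipped by default serialization

        HookedUser(String userName, String password) {
            this.userName = userName;
            this.password = password;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes userName
            // Toy transformation (assumption for illustration only).
            out.writeObject(new StringBuilder(password).reverse().toString());
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject(); // restores userName
            password = new StringBuilder((String) in.readObject()).reverse().toString();
        }
    }

    // Way (3): Externalizable; the class controls the whole format.
    public static class ExternalUser implements Externalizable {
        String userName;
        String sex;

        public ExternalUser() { } // public no-arg constructor is required

        ExternalUser(String userName, String sex) {
            this.userName = userName;
            this.sex = sex;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(userName);
            out.writeUTF(sex);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            userName = in.readUTF(); // must read in the order written
            sex = in.readUTF();
        }
    }

    // Serializes obj into memory and deserializes it again.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return ois.readObject();
        }
    }

    static String demo() throws IOException, ClassNotFoundException {
        HookedUser h = (HookedUser) roundTrip(new HookedUser("xuliugen", "123456"));
        ExternalUser e = (ExternalUser) roundTrip(new ExternalUser("xuliugen", "male"));
        return h.userName + " " + h.password + " | " + e.userName + " " + e.sex;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints xuliugen 123456 | xuliugen male
    }
}
```

With the hooks, the class decides how a field that default serialization would skip is written. With Externalizable, nothing is written unless writeExternal writes it.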

4. Steps of serialization in JDK class library

Step 1: create an object output stream, which can wrap a target output stream of other types, such as file output stream:

ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("object.out"));

Step 2: write the object through the writeObject() method of the object output stream:

oos.writeObject(new User("xuliugen","123456","male"));

5. Steps of deserialization in JDK class library

Step 1: create an object input stream, which can wrap other types of input streams, such as file input stream:

ObjectInputStream ois = new ObjectInputStream(new FileInputStream("object.out"));

Step 2: read the object through the readObject() method of the object input stream:

User user = (User) ois.readObject();

Note: in order to correctly read data and complete deserialization, it is necessary to ensure that the order of writing objects to the object output stream is consistent with the order of reading objects from the object input stream.
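The ordering rule in the note can be illustrated with a short sketch that writes three values and reads them back in the same order. The values themselves (including the number 18) are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Date;

class ReadOrderDemo {
    // Writes a String, an int, and a Date, then reads them in the same order.
    static String roundTrip() throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject("xuliugen");   // 1st: a String
            oos.writeInt(18);              // 2nd: a primitive int
            oos.writeObject(new Date(0L)); // 3rd: a Date
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            String name = (String) ois.readObject(); // 1st
            int number = ois.readInt();              // 2nd
            Date date = (Date) ois.readObject();     // 3rd
            return name + " " + number + " " + date.getTime();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip()); // prints xuliugen 18 0
    }
}
```

Reading in a different order, for example calling readInt() first, would fail because the next item in the stream is not the type the reader expects.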

6. Examples of serialization and deserialization

To better understand Java serialization and deserialization, consider the following simple example:

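import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerialDemo {

  public static void main(String[] args) throws IOException, ClassNotFoundException {
    // Serialization: write the User object to the file object.out
    User user1 = new User("xuliugen", "123456", "male");
    try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("object.out"))) {
      oos.writeObject(user1);
    }
    // Deserialization: read the User object back from object.out
    try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("object.out"))) {
      User user2 = (User) ois.readObject();
      System.out.println(user2.getUserName() + " " +
        user2.getPassword() + " " + user2.getSex());
      // Output after deserialization: xuliugen 123456 male
    }
  }
}

// Declared without "public" so that both classes fit in SerialDemo.java
class User implements Serializable {
  private String userName;
  private String password;
  private String sex;

  // All-args constructor
  User(String userName, String password, String sex) {
    this.userName = userName;
    this.password = password;
    this.sex = sex;
  }

  // Getters (setters omitted)
  public String getUserName() { return userName; }
  public String getPassword() { return password; }
  public String getSex() { return sex; }
}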

The object.out file looks as follows (opened with UltraEdit):

Note: in the figure above, 0000000h - 0000000c0h indicates the row offset; 0-F indicates the column; the text at the end of each row is the interpretation of that row's hexadecimal bytes. If you are interested in what these bytes express, you can look up the meaning of each one in the relevant materials. It is not discussed here!

This is similar to the .class file produced by compiling Java code: each byte has a specific meaning. Serialization and deserialization are the processes of generating and parsing these bytes!
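Without the figure, the start of such a file can still be inspected in code. The sketch below (class and method names are hypothetical) serializes a String into memory and prints the first four bytes as hexadecimal; a Java serialization stream begins with the magic number 0xACED followed by the stream version 0x0005.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StreamHeaderDemo {
    // Serializes obj into a byte array held in memory.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Formats the first count bytes as space-separated uppercase hex pairs.
    static String toHex(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count && i < data.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize("xuliugen");
        System.out.println(toHex(data, 4)); // prints AC ED 00 05
    }
}
```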

Serialization diagram:

Deserialization diagram:

3、 Relevant precautions

1. Serialization saves only the state of the object, not its methods;

2. When a parent class implements Serializable, its child classes are automatically serializable without explicitly implementing the Serializable interface;

3. When the instance variable of an object references other objects, the reference object is also serialized when the object is serialized;

4. Not all objects can be serialized. There are many reasons why not, such as:

Security: serialization writes all non-transient fields of an object, including private ones, to the byte stream. Once the object is written to a file or transmitted over RMI, access modifiers such as private no longer protect those fields;

Resource allocation: objects of classes such as Socket and Thread are bound to resources of the running process. Even if they could be serialized, transmitted, or saved, those resources could not be reallocated on deserialization, so there is no need to make them serializable;

5. Fields declared static or transient are not serialized, because static fields represent the state of the class rather than of the object, and transient marks temporary data of the object.

6. The serialization runtime associates a version number, called serialVersionUID, with each serializable class. During deserialization this number is used to verify that the sender and receiver of a serialized object have loaded classes for that object that are serialization-compatible; if the numbers differ, deserialization fails with an InvalidClassException. It is best to give serialVersionUID an explicit value. Explicitly defining it serves two purposes:

In some cases you want different versions of a class to be serialization-compatible, so you need to ensure that the different versions have the same serialVersionUID;

In other cases you do not want different versions of a class to be serialization-compatible, so you need to ensure that the different versions have different serialVersionUID values.

7. Many basic classes in Java already implement the Serializable interface, such as String and Vector. However, some classes do not implement it;

8. If a member variable of an object is itself an object, the data members of that object are saved as well! This is an important reason why serialization can be used to implement deep copy;
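Points 3, 5, and 8 can be sketched together. Account, Address, and deepCopy are hypothetical names introduced for this illustration, and "Beijing" is a made-up value. The copy produced by a serialization round trip has its own Address instance, and the transient password field is not carried over.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DeepCopyDemo {
    // Hypothetical referenced class; it must be serializable too.
    static class Address implements Serializable {
        String city;

        Address(String city) {
            this.city = city;
        }
    }

    // Hypothetical class holding a reference and a transient field.
    static class Account implements Serializable {
        String userName;
        transient String password; // not written to the stream
        Address address;           // serialized together with Account

        Account(String userName, String password, Address address) {
            this.userName = userName;
            this.password = password;
            this.address = address;
        }
    }

    // Deep-copies obj by serializing it to memory and reading it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account original = new Account("xuliugen", "123456", new Address("Beijing"));
        Account copy = deepCopy(original);
        System.out.println(copy.address != original.address); // prints true
        System.out.println(copy.address.city);                // prints Beijing
        System.out.println(copy.password);                    // prints null
    }
}
```

Because the transient field comes back as null, a deep copy made this way is only complete for the fields that take part in serialization.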

4、 Deserialization vulnerability

Deserialization vulnerabilities are covered in other materials and are not introduced here.


The content of this article was collected from the web by users and is provided as a learning reference. The copyright belongs to the original author.