Serialization and deserialization in Java
1、 Serialization and deserialization concepts
Serialization is the process of converting an object into a sequence of bytes; deserialization is the process of reconstructing an object from those bytes. In practice, serialization often means saving the objects of a program to a file, and deserialization means reading the bytes in that file back into objects.
2、 The necessity of serialization and deserialization
When two processes communicate remotely, they can send each other various kinds of data, including text, pictures, audio, and video, and all of this data is transmitted over the network as a binary sequence. Java is an object-oriented language in which almost everything is an object, so transmitting Java objects over the network requires serialization and deserialization. The sender serializes the Java object into a byte sequence and transmits it over the network; the receiver, after receiving the byte sequence, uses deserialization to recover the Java object from it.
Once we understand why Java serialization and deserialization are needed, it is natural to consider the benefits. The first is data persistence: through serialization, data can be saved permanently to disk (usually in a file). The second is remote communication: the byte sequence of an object can be transmitted over the network.
In conclusion, data transmitted over the network must be in serialized form. Besides Java's built-in serialization, other serialization formats such as JSON and XML can also be used.
3、 Implementation of serialization and deserialization
1) Serialization API provided by JDK class library
2) Requirements for serialization
Only objects of classes that implement the Serializable or Externalizable interface can be serialized; otherwise a NotSerializableException is thrown.
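The requirement above can be checked with a small experiment. The following is a minimal sketch, not part of the original article: the class names PlainPoint, SerializablePoint and SerializableCheckDemo are hypothetical, and the data is written to an in-memory buffer instead of a file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class that does NOT implement Serializable.
class PlainPoint {
    int x = 1;
}

// Hypothetical class that does implement Serializable.
class SerializablePoint implements Serializable {
    private static final long serialVersionUID = 1L;
    int x = 1;
}

class SerializableCheckDemo {
    // Returns true if the object can be written to an ObjectOutputStream.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            // Thrown when obj (or an object it references) is not serializable.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize(new SerializablePoint())); // true
        System.out.println(canSerialize(new PlainPoint()));        // false
    }
}
```

NotSerializableException is a subclass of IOException, so code that only catches IOException handles it as well.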
3) Method for realizing Java object serialization and deserialization
Suppose a student class whose objects need to be serialized. There are three methods:
4) Steps of serialization in JDK class library
Step 1: create an object output stream, which can wrap a target output stream of another type, such as a file output stream:
ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("D:\\objectfile.obj"));
Step 2: write objects through the writeObject() method of the object output stream:
out.writeObject("Hello");
out.writeObject(new Date());
5) Steps of deserialization in JDK class library
Step 1: create an object input stream, which can wrap an input stream of another type, such as a file input stream:
ObjectInputStream in = new ObjectInputStream(new FileInputStream("D:\\objectfile.obj"));
Step 2: read objects through the readObject() method of the object input stream:
String obj1 = (String) in.readObject();
Date obj2 = (Date) in.readObject();
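The write and read steps can be combined into one round trip. The sketch below is an illustration under assumptions: the class name RoundTripDemo and its helper methods are hypothetical, and an in-memory byte array stands in for the file D:\objectfile.obj so that the example runs anywhere.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Date;

class RoundTripDemo {
    // Writes a String and then a Date, and returns the resulting bytes.
    static byte[] write(String text, Date date) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(text);
            out.writeObject(date);
        }
        return buffer.toByteArray();
    }

    // Reads the two objects back in the same order in which they were written.
    static Object[] read(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            String text = (String) in.readObject();
            Date date = (Date) in.readObject();
            return new Object[] { text, date };
        }
    }

    public static void main(String[] args) throws Exception {
        Date now = new Date();
        Object[] restored = read(write("Hello", now));
        System.out.println(restored[0]);             // Hello
        System.out.println(now.equals(restored[1])); // true
    }
}
```

The try-with-resources statements close the streams even when an exception is thrown, which the step-by-step snippets above leave out.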
Note: to read the data correctly during deserialization, objects must be read from the object input stream in the same order in which they were written to the object output stream. To make Java serialization and deserialization easier to understand, the following example implements method 1. The Student class is defined as follows:
import java.io.Serializable;

/**
 * Student class that implements the Serializable interface
 */
public class Student implements Serializable {
    private String name;
    private char sex;
    private int year;
    private double gpa;

    public Student() {
    }

    public Student(String name, char sex, int year, double gpa) {
        this.name = name;
        this.sex = sex;
        this.year = year;
        this.gpa = gpa;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setSex(char sex) {
        this.sex = sex;
    }

    public void setYear(int year) {
        this.year = year;
    }

    public void setGpa(double gpa) {
        this.gpa = gpa;
    }

    public String getName() {
        return this.name;
    }

    public char getSex() {
        return this.sex;
    }

    public int getYear() {
        return this.year;
    }

    public double getGpa() {
        return this.gpa;
    }
}
Serialize an object of the Student class to the file /Users/sschen/Documents/student.txt, then deserialize it from the file and print the results to the console. The code is as follows:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class UserStudent {
    public static void main(String[] args) {
        Student st = new Student("Tom", 'M', 20, 3.6);
        File file = new File("/Users/sschen/Documents/student.txt");
        try {
            file.createNewFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            // Serialize the Student object
            FileOutputStream fos = new FileOutputStream(file);
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(st);
            oos.flush();
            oos.close();
            fos.close();
            // Deserialize the Student object
            FileInputStream fis = new FileInputStream(file);
            ObjectInputStream ois = new ObjectInputStream(fis);
            Student st1 = (Student) ois.readObject();
            System.out.println("name = " + st1.getName());
            System.out.println("sex = " + st1.getSex());
            System.out.println("year = " + st1.getYear());
            System.out.println("gpa = " + st1.getGpa());
            ois.close();
            fis.close();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Viewing the file /Users/sschen/Documents/student.txt directly shows that its content is binary and not easy to read. In hexadecimal it looks like this:
aced 0005 7372 001f 636f 6d2e 7373 6368
656e 2e53 6572 6961 6c69 7a61 626c 652e
5374 7564 656e 74f1 5dbd a4a0 3472 4d02
0004 4400 0367 7061 4300 0373 6578 4900
0479 6561 724c 0004 6e61 6d65 7400 124c
6a61 7661 2f6c 616e 672f 5374 7269 6e67
3b78 7040 0ccc cccc cccc cd00 4d00 0000
1474 0003 546f 6d
4、 Necessary conditions for serialization
1. The class must have the same package and the same name on both sides. 2. The serialVersionUID must be consistent. When the fields of the two classes differ slightly, serialization and deserialization can still succeed if serialVersionUID is hard-coded to the same fixed value in both classes.
5、 Advanced serialization: scenario analysis
1. Serialization id problem
Situation: two clients, A and B, try to transfer object data over the network. A serializes object C into binary data and sends it to B, and B deserializes the data to obtain C.
Problem: the fully qualified class name of C is assumed to be com.inout.Test. Both A and B have this class file, the functional code is exactly the same, and both implement the Serializable interface, yet deserialization always fails.
Solution: whether the virtual machine allows deserialization depends not only on whether the class path and functional code are consistent. Another important point is whether the serialization IDs of the two classes are consistent (that is, private static final long serialVersionUID = 1L). In the code below, although the functional code of the two classes is identical, the serialization IDs differ, so they cannot serialize and deserialize each other.
In short, Java's serialization mechanism verifies version consistency by checking the class's serialVersionUID at run time. During deserialization, the JVM compares the serialVersionUID in the incoming byte stream with the serialVersionUID of the corresponding local class. If they are the same, the versions are considered consistent and deserialization can proceed; otherwise an exception reporting inconsistent serialization versions is thrown. When a class that implements the java.io.Serializable interface does not explicitly define a field named serialVersionUID of type long, the Java serialization mechanism automatically generates a serialVersionUID from the compiled class for version comparison. In this case the generated value depends on details of the compiled class, so it is only guaranteed to match for classes produced by the same compilation.
If we do not want compilation to force a division between software versions, that is, if classes implementing the serialization interface should remain compatible with earlier versions, we need to explicitly define a field named serialVersionUID of type long. Classes that leave the value of this field unchanged can serialize and deserialize each other. Two classes with the same functional code but different serialization IDs are compared below:
import java.io.Serializable;

// Version of the class with serialVersionUID = 1L
public class SerialVersionIDA implements Serializable {
    private static final long serialVersionUID = 1L;
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public SerialVersionIDA() {
    }

    public SerialVersionIDA(String name) {
        this.name = name;
    }
}

// Version of the same class with serialVersionUID = 2L
public class SerialVersionIDA implements Serializable {
    private static final long serialVersionUID = 2L;
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public SerialVersionIDA() {
    }

    public SerialVersionIDA(String name) {
        this.name = name;
    }
}
If an object is serialized with the class whose serialVersionUID is 2L and then deserialized with the class whose serialVersionUID is 1L, an exception is thrown. The exception content is:
java.io.InvalidClassException: com.sschen.Serializable.SerialVersionIDA; local class incompatible: stream classdesc serialVersionUID = 2, local class serialVersionUID = 1
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1781)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at com.sschen.Serializable.SerialVersionTest.main(SerialVersionTest.java:30)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
Eclipse offers two strategies for generating the serialization ID: a fixed 1L, or a generated long value that does not repeat (in fact, it is generated using JDK tools). A suggestion: if there are no special requirements, use the default 1L, which ensures that deserialization succeeds whenever the code is consistent. What, then, is the use of the generated serialization ID? Sometimes changing the serialization ID can be used to restrict the use of some users.
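To see which serialVersionUID the runtime actually uses for a class, the standard java.io.ObjectStreamClass API can be queried. This is a small sketch under assumptions: the classes ExplicitId, ImplicitId and SerialVersionUidDemo are hypothetical and exist only for illustration.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class with an explicit serialVersionUID.
class ExplicitId implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
}

// Hypothetical class without one; the runtime computes a default value.
class ImplicitId implements Serializable {
    String name;
}

class SerialVersionUidDemo {
    // Returns the serialVersionUID that the serialization runtime uses for a class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        // The declared value.
        System.out.println(uidOf(ExplicitId.class)); // 1
        // A computed value that depends on the details of the class.
        System.out.println(uidOf(ImplicitId.class));
    }
}
```

Note that ObjectStreamClass.lookup returns null for a class that is not serializable, so real code should check the result before calling getSerialVersionUID.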
Feature use case: readers have probably heard of the Façade pattern, which provides a unified access interface for an application. The client in the case program uses this pattern. The case program structure is shown in the figure below.
2. Static variable serialization
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerialStaticTest implements Serializable {
    private static final long serialVersionUID = 1L;
    public static int staticVar = 5;

    public static void main(String[] args) {
        try {
            File file = new File("/Users/sschen/Documents/student.txt");
            try {
                file.createNewFile();
            } catch (IOException e) {
                e.printStackTrace();
            }
            // staticVar is initially 5
            ObjectOutputStream out = new ObjectOutputStream(
                    new FileOutputStream(file));
            out.writeObject(new SerialStaticTest());
            out.close();
            // Change it to 10 after serialization
            SerialStaticTest.staticVar = 10;
            ObjectInputStream oin = new ObjectInputStream(new FileInputStream(file));
            SerialStaticTest t = (SerialStaticTest) oin.readObject();
            oin.close();
            // Read the object back, then print the value through t.staticVar
            System.out.println(t.staticVar);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}
The main method in the code above serializes an object to a file, modifies the value of the static variable, reads the serialized object back, and then prints the value of the static variable through the object that was read. Does the System.out.println(t.staticVar) statement output 10 or 5? The final output is 10. Some readers may find this puzzling: the printed staticVar is obtained from the object that was read, so it should have the saved state. The reason 10 is printed is that static variables are not saved during serialization. Serialization saves the state of an object, whereas static variables belong to the state of the class, so serialization does not save them.
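Another way to confirm that static variables are excluded is to ask the runtime which fields default serialization writes for a class. The sketch below uses the standard ObjectStreamClass.getFields() method; the classes Counter and StaticFieldDemo are hypothetical names introduced for illustration.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class with one static field and one instance field.
class Counter implements Serializable {
    private static final long serialVersionUID = 1L;
    static int staticVar = 5;
    int instanceVar = 7;
}

class StaticFieldDemo {
    // Lists the names of the fields that default serialization writes for a class.
    static List<String> serializedFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (ObjectStreamField field : ObjectStreamClass.lookup(type).getFields()) {
            names.add(field.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Static fields (including serialVersionUID itself) are not listed.
        System.out.println(serializedFieldNames(Counter.class)); // [instanceVar]
    }
}
```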
3. Serialization of parent class and transient keyword
Situation: a subclass implements the Serializable interface and its parent class does not. A subclass object is serialized, and after deserialization the value of a variable defined in the parent class is printed. The value differs from the value at serialization time.
Solution: for the parent class's fields to be serialized, the parent class must implement the Serializable interface as well. If it does not, it must have a default parameterless constructor. When the parent class does not implement Serializable, the virtual machine does not serialize the parent object. Constructing a Java object requires the parent object to exist before the child object, and deserialization is no exception. Therefore, during deserialization, the only way to construct the parent object is to call the parameterless constructor of the parent class. As a result, when we read a variable of the parent object, its value is whatever it is after the parameterless constructor of the parent class has run. If you rely on this kind of serialization, initialize the variables in the parameterless constructor of the parent class. Otherwise the parent class variables have their default values, for example 0 for int and null for String.
The transient keyword controls the serialization of variables. Adding it before a variable declaration prevents the variable from being serialized to the file. After deserialization, a transient variable has its initial value, such as 0 for int and null for object types.
Feature use case: we are familiar with using the transient keyword to prevent fields from being serialized. Is there any other way?
According to the rules of parent class serialization, we can extract the fields that do not need to be serialized and put them into a parent class. The child class implements the Serializable interface, but the parent class does not. By the parent class serialization rules, the field data of the parent class is not serialized. The resulting class diagram is shown in Figure 2.
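The parent class rule and the transient keyword can be observed together in a short experiment. This is a minimal sketch under assumptions: the classes Base, Child and ParentAndTransientDemo are hypothetical, and an in-memory buffer is used instead of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical parent that is NOT Serializable; it needs an accessible no-arg constructor.
class Base {
    int baseValue;

    Base() {
        this.baseValue = -1; // this constructor runs again during deserialization
    }

    Base(int baseValue) {
        this.baseValue = baseValue;
    }
}

// Hypothetical Serializable child with one normal and one transient field.
class Child extends Base implements Serializable {
    private static final long serialVersionUID = 1L;
    int childValue;
    transient int cache; // skipped by serialization

    Child(int baseValue, int childValue, int cache) {
        super(baseValue);
        this.childValue = childValue;
        this.cache = cache;
    }
}

class ParentAndTransientDemo {
    // Serializes the object to memory and reads a copy back.
    static Child roundTrip(Child original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Child) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Child copy = roundTrip(new Child(10, 20, 30));
        System.out.println(copy.baseValue);  // -1, set by Base()
        System.out.println(copy.childValue); // 20, restored from the stream
        System.out.println(copy.cache);      // 0, transient default
    }
}
```

The Child constructor is not called during deserialization; only the no-arg constructor of the first non-serializable superclass runs.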
4. Encrypt sensitive fields
Situation: the server sends serialized object data to the client. Some of the data in the object is sensitive, such as a password string, and we want to encrypt the password field during serialization. Only a client that holds the decryption key can then read the password during deserialization, which protects the data of the serialized object to a certain extent.
Solution: during serialization, the virtual machine tries to call the writeObject and readObject methods of the object's class to perform user-defined serialization and deserialization. If these methods do not exist, the defaults are used: the defaultWriteObject method of ObjectOutputStream and the defaultReadObject method of ObjectInputStream. User-defined writeObject and readObject methods let the user control the serialization process, for example by changing the serialized values dynamically. Based on this principle, they can be used in practice to encrypt sensitive fields. The following code shows this process.
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerialPwdTest implements Serializable {
    private static final long serialVersionUID = 1L;
    private String password = "pass";

    public String getPassword() {
        return password;
    }

    public void setPassword(String password) {
        this.password = password;
    }

    private void writeObject(ObjectOutputStream out) {
        try {
            ObjectOutputStream.PutField putFields = out.putFields();
            System.out.println("Original password: " + password);
            password = "encryption"; // simulated encryption
            putFields.put("password", password);
            System.out.println("Encrypted password: " + password);
            out.writeFields();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void readObject(ObjectInputStream in) {
        try {
            ObjectInputStream.GetField readFields = in.readFields();
            Object object = readFields.get("password", "");
            System.out.println("String to decrypt: " + object.toString());
            password = "pass"; // simulated decryption; requires the local key
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        File file = new File("/Users/sschen/Documents/student.txt");
        try {
            file.createNewFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file));
            out.writeObject(new SerialPwdTest());
            out.close();
            ObjectInputStream oin = new ObjectInputStream(new FileInputStream(file));
            SerialPwdTest t = (SerialPwdTest) oin.readObject();
            System.out.println("Decrypted string: " + t.getPassword());
            oin.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}
In the writeObject method of SerialPwdTest, the password is encrypted and then serialized to a file. In readObject, the password is decrypted after it is read. Only a client that holds the key can parse the password correctly, which protects the data. The execution result of the above code is:
Original password: pass
Encrypted password: encryption
String to decrypt: encryption
Decrypted string: pass
Feature use case: RMI is built entirely on Java serialization. The parameter objects required by server-side interface calls come from the client and are transmitted over the network, which raises the question of secure transmission in RMI. We want to encrypt sensitive fields such as the user name and password (the password must be transmitted when the user logs in). Using the method described in this section, the password can be encrypted on the client and decrypted on the server to protect the data in transit.
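A variation on the approach in this section keeps the sensitive field transient and writes a transformed copy by hand, so the object in memory is never modified during serialization. The sketch below is an assumption-laden illustration, not the article's method: the classes Account and SensitiveFieldDemo are hypothetical, and Base64 encoding is only a placeholder for a real cipher and provides no confidentiality by itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical class. Base64 stands in for real encryption and is NOT secure.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    private String user;
    private transient String password; // skipped by default serialization

    Account(String user, String password) {
        this.user = user;
        this.password = password;
    }

    String getPassword() {
        return password;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes the non-transient field: user
        byte[] raw = password.getBytes(StandardCharsets.UTF_8);
        out.writeObject(Base64.getEncoder().encodeToString(raw)); // placeholder for encryption
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores user
        String encoded = (String) in.readObject();
        // Placeholder for decryption.
        password = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }
}

class SensitiveFieldDemo {
    static byte[] serialize(Account account) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(account);
        }
        return buffer.toByteArray();
    }

    static Account deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Account) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serialize(new Account("tom", "secret"));
        String rawText = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println(rawText.contains("secret"));       // false
        System.out.println(deserialize(bytes).getPassword()); // secret
    }
}
```

In a real system the placeholder calls would be replaced by a vetted encryption library with proper key management.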
5. Serialization storage rules
Situation: the problem code is shown in Listing 4.
Listing 4. Storage rule problem code
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerialSaveTest implements Serializable {
    public static void main(String[] args) {
        File file = new File("/Users/sschen/Documents/student.txt");
        try {
            file.createNewFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file));
            SerialSaveTest test = new SerialSaveTest();
            // Try to write the same object to the file twice
            out.writeObject(test);
            out.flush();
            System.out.println(file.length());
            out.writeObject(test);
            out.close();
            System.out.println(file.length());
            ObjectInputStream oin = new ObjectInputStream(new FileInputStream(file));
            // Read two objects from the file in turn
            SerialSaveTest t1 = (SerialSaveTest) oin.readObject();
            SerialSaveTest t2 = (SerialSaveTest) oin.readObject();
            oin.close();
            // Check whether the two references point to the same object
            System.out.println(t1 == t2);
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}
In Listing 4, the same object is written to the file twice. The program prints the file size after the first write and after the second write, then deserializes two objects from the file and checks whether they are the same object. The usual expectation is that writing the object twice doubles the file size, and that deserialization produces two objects because the file is read twice, so the comparison should print false. The actual output is as follows:
59
64
true
We can see that the second write increases the file size by only 5 bytes and that the two objects are equal. Why? Answer: to save disk space, the Java serialization mechanism has specific storage rules. When the same object is written to the file again, the content of the object is not stored again; only a reference to the earlier copy is stored. The additional 5 bytes are the space for the new reference and some control information. During deserialization the reference relationship is restored, so t1 and t2 in Listing 4 point to the same object; they are equal and the output is true. This storage rule saves a great deal of storage space.
Feature case study: look at the code in Listing 5.
Listing 5. Case code
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerialSaveTest implements Serializable {
    private int id;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public static void main(String[] args) {
        File file = new File("/Users/sschen/Documents/student.txt");
        try {
            file.createNewFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file));
            SerialSaveTest test = new SerialSaveTest();
            test.setId(1);
            // Try to write the same object to the file twice
            out.writeObject(test);
            out.flush();
            System.out.println(file.length());
            test.setId(5);
            out.writeObject(test);
            out.close();
            System.out.println(file.length());
            ObjectInputStream oin = new ObjectInputStream(new FileInputStream(file));
            // Read two objects from the file in turn
            SerialSaveTest t1 = (SerialSaveTest) oin.readObject();
            SerialSaveTest t2 = (SerialSaveTest) oin.readObject();
            oin.close();
            // Check whether the two references point to the same object
            System.out.println(t1 == t2);
            System.out.println(t1.getId());
            System.out.println(t2.getId());
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}
The purpose of Listing 5 is to save the test object to the file /Users/sschen/Documents/student.txt twice, modifying the object's attribute value after the first write, and then to read the two objects back from /Users/sschen/Documents/student.txt and output their id attribute values. The intent of the case code was to transfer the state of the object before and after modification in one pass. However, both outputs are 1. The reason is that when the object is written for the second time, the virtual machine knows from the reference relationship that the same object has already been written to the file, so only a reference is saved for the second write. When reading, therefore, both objects are the one saved by the first write. Readers should pay special attention to this problem when writing an object to a file multiple times.
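One possible way to get both states into the stream is to call ObjectOutputStream.reset() between the two writes, which makes the stream forget the objects it has already written. The sketch below is an illustration rather than the article's own solution: the classes Item and ResetDemo are hypothetical, and an in-memory buffer replaces the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical serializable class with a single mutable field.
class Item implements Serializable {
    private static final long serialVersionUID = 1L;
    int id;
}

class ResetDemo {
    // Writes the same object twice, changing id in between,
    // and returns the two ids that are read back.
    static int[] writeTwice(boolean resetBetweenWrites) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        Item item = new Item();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            item.id = 1;
            out.writeObject(item);
            if (resetBetweenWrites) {
                out.reset(); // discard the record of objects already written
            }
            item.id = 5;
            out.writeObject(item);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            Item first = (Item) in.readObject();
            Item second = (Item) in.readObject();
            return new int[] { first.id, second.id };
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Arrays.toString(writeTwice(false))); // [1, 1]
        System.out.println(Arrays.toString(writeTwice(true)));  // [1, 5]
    }
}
```

The trade-off is that after reset() the second write stores the full object again, so the space saving described above is lost and the two objects read back are no longer the same instance.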