Garbled text in Java: summary and solutions
Garbled text in Java
In recent projects we have often run into garbled text (mojibake) in Java, so we took some time to sort out when it occurs and how to deal with it. Here is a summary and analysis.
Encoding and decoding
Encoding converts characters into bytes, and decoding converts bytes back into characters. Both directions depend on a charset (such as UTF-8 or GBK) that defines the mapping.
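As a minimal sketch of the two directions, the class below encodes a string to bytes and decodes it again with the same charset. The class and method names are made up for illustration, and the text is written with Unicode escapes so the result does not depend on how the source file itself is saved.

```java
import java.nio.charset.StandardCharsets;

final class EncodeDecodeDemo {
    // Encoding: characters -> bytes, with an explicitly named charset.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decoding: bytes -> characters, with the same charset.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters
        byte[] bytes = encode(text);
        System.out.println(bytes.length);               // 6: three bytes per character in UTF-8
        System.out.println(decode(bytes).equals(text)); // true: same charset both ways
    }
}
```

The round trip only works because both calls name the same charset.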
Byte stream and character stream
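Files are stored as bytes, so reading and writing them is ultimately done through byte streams. Java also provides character streams, but these are wrappers: underneath they still read and write a byte stream, converting between bytes and characters with a charset.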
The garbled text problem
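Most text handling in Java works with characters. When we read a file into memory and display it on the console (byte stream --> character stream), the bytes have to be decoded. If the file is encoded in UTF-8 but we decode it as GBK, the non-ASCII characters come out garbled. This easily happens by accident, because Java falls back to the platform default charset when no encoding is specified. On JDK 18 and later that default is UTF-8 unless overridden; on earlier versions it depends on the operating system and locale. When we write files, it is likewise best to specify the encoding explicitly (for example UTF-8).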
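The mismatch can be reproduced without any file, as in the following sketch. The class and method names are hypothetical. GBK ships with the standard JDK but is not guaranteed on every runtime, so the example checks for it before using it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MismatchDemo {
    // Decodes the given bytes with whatever charset the caller names.
    static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // Matching charset: the text survives.
        System.out.println(decodeAs(utf8Bytes, StandardCharsets.UTF_8).equals(original)); // true

        // Mismatched charset: the same bytes are interpreted as different characters.
        if (Charset.isSupported("GBK")) {
            String garbled = decodeAs(utf8Bytes, Charset.forName("GBK"));
            System.out.println(garbled.equals(original)); // false
        }
    }
}
```

The bytes are never damaged here. Only the interpretation is wrong, which is why decoding again with the right charset fixes the output.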
Solution
Example 1
When converting a byte stream to a character stream, specify the encoding explicitly. Our sample file is encoded in GB2312, so it has to be decoded as GB2312 as well.
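The original listing for this example is not included here, so the following is only a sketch of the approach described: wrap the byte stream in an InputStreamReader and pass the charset explicitly. The class name, the method name and the command-line handling are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

final class ReadWithCharset {
    // Reads the whole file as text, decoding the bytes with the given charset.
    // Line terminators are normalized to '\n'.
    static String readText(String filename, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(filename), charset))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: ReadWithCharset <file encoded in GB2312>");
            return;
        }
        // Assumption: the file named on the command line was saved as GB2312.
        System.out.print(readText(args[0], Charset.forName("GB2312")));
    }
}
```

The charset passed to InputStreamReader has to match the encoding the file was saved with. Naming it explicitly only removes the dependence on the platform default.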
Example 2
Read the raw bytes directly through a byte stream, and specify the encoding when converting them to characters with String.
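Again the original listing is missing, so this is a sketch under the same assumptions (hypothetical class and method names). It collects all bytes first and decodes them once with `new String(bytes, charset)`.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

final class ReadBytesThenDecode {
    // Reads the file as raw bytes, then decodes them with the given charset.
    static String readText(String filename, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = new FileInputStream(filename)) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        // Decode once, after all bytes are collected, so a multi-byte
        // character split across two chunks is not corrupted.
        return new String(buffer.toByteArray(), charset);
    }
}
```

Decoding each chunk separately would be a subtle bug, because a character that spans a chunk boundary would be cut in half. That is why the sketch buffers everything before building the String.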
Pitfall
The I/O library includes a FileReader class that hides the conversion from byte stream to character stream. We can use it this way: BufferedReader in = new BufferedReader(new FileReader(filename)); This gives us a character stream directly. Notice, however, that we never set an encoding: used like this, FileReader decodes with the default charset (constructors that accept a Charset were only added in Java 11). This is dangerous. If the default encoding differs from the encoding of our file, any non-ASCII data we read will be garbled. It is therefore better to convert the stream with an explicit encoding, as in the examples.
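One possible replacement for the FileReader line is `Files.newBufferedReader`, which takes the charset as an argument. This is an alternative to the InputStreamReader approach, not something the examples in this article prescribe, and the class and method names below are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

final class ExplicitCharsetReader {
    // Opens a character stream whose decoding does not depend on the platform default.
    static BufferedReader open(String filename, Charset charset) throws IOException {
        return Files.newBufferedReader(Paths.get(filename), charset);
    }

    // Small usage helper: returns the first line of the file, or null if it is empty.
    static String firstLine(String filename, Charset charset) throws IOException {
        try (BufferedReader in = open(filename, charset)) {
            return in.readLine();
        }
    }
}
```

One behavioral difference is that a reader obtained this way throws an IOException when it meets bytes that are not valid in the named charset, instead of silently substituting replacement characters. That turns a wrong encoding into a visible error rather than garbled output.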