Java – read and write files that contain UTF-8 (different language) characters

I have a file that contains the following characters: “Joh 1:1 ஆதியிலே வார்த்தை இருந்தது, அந்த வார்த்தை தேவனிடத்திலிருந்தது, அந்த வார்த்தை தேவனாயிருந்தது.”

www.unicode.org/charts/PDF/U0B80.pdf

When I use the following code:

bufferedWriter = new BufferedWriter (new OutputStreamWriter(System.out,"UTF8"));

The output shows boxes and other strange characters, like this:

“P = O ֛;< A Y ՠ;”

Can anyone help?

This is the complete code:

File f = new File("E:\\bible.docx");
Reader decoded = new InputStreamReader(new FileInputStream(f), StandardCharsets.UTF_8);
bufferedWriter = new BufferedWriter(new OutputStreamWriter(System.out, StandardCharsets.UTF_8));
char[] buffer = new char[1024];
int n;
StringBuilder build = new StringBuilder();
while (true) {
    n = decoded.read(buffer);
    if (n < 0) { break; }
    // Use only the n characters read in this pass; the rest of the buffer is stale.
    build.append(buffer, 0, n);
    bufferedWriter.write(buffer, 0, n);
}
// Push any buffered characters out to System.out.
bufferedWriter.flush();

The StringBuilder value shows the correct Unicode characters, but when the text is displayed in a window, it appears as boxes.

Found the answer to the question! The encoding is correct (i.e. UTF-8): Java reads the file as UTF-8 and the resulting String holds the right characters. The problem is that the output panel of NetBeans has no font that can display them. After changing the font of the output panel (NetBeans -> Tools -> Options -> Miscellaneous -> Output tab), I got the expected results. The same applies when the text is displayed in a JTextArea (the font needs to be changed). But we can't change the font of the Windows CMD prompt.
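One way to confirm that the decoded text is correct, whatever font the console happens to use, is to print its code points instead of its glyphs. The following is only a sketch: the class name CodePointDump and the method describe are made up for illustration and are not part of the original code. Tamil characters should show up in the range U+0B80 to U+0BFF.

```java
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original question's code.
final class CodePointDump {

    // Formats every code point of the text as U+XXXX, separated by spaces.
    static String describe(String text) {
        return text.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // The first three characters of the verse, written as escapes so
        // that the encoding of this source file does not matter.
        String sample = "\u0B86\u0BA4\u0BBF";
        System.out.println(describe(sample)); // prints: U+0B86 U+0BA4 U+0BBF
    }
}
```

If the dump shows the expected Tamil code points but the window still shows boxes, the data is fine and only the font is missing. If it shows U+FFFD, the damage happened while reading.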

Solution

Since your output is encoded in UTF-8 but still contains replacement characters (U+FFFD, �), I believe the problem occurs when you read the data.
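To check whether the input bytes really are well-formed UTF-8, a strict decoder can be used in place of the default lenient one, which silently substitutes U+FFFD. This is a minimal sketch using the standard CharsetDecoder API; the class name StrictUtf8 and its methods are hypothetical names chosen for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, shown only to illustrate strict decoding.
final class StrictUtf8 {

    // Decodes the bytes as UTF-8 and throws on malformed input
    // instead of inserting U+FFFD.
    static String decode(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // Returns true if the bytes are well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            decode(bytes);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] good = "\u0B86\u0BA4\u0BBF".getBytes(StandardCharsets.UTF_8);
        byte[] bad = {(byte) 0xE0, (byte) 0xAE}; // truncated three-byte sequence

        System.out.println(isValidUtf8(good)); // prints: true
        System.out.println(isValidUtf8(bad));  // prints: false

        // The lenient String constructor replaces the bad bytes with U+FFFD.
        String lenient = new String(bad, StandardCharsets.UTF_8);
        System.out.println(lenient.contains("\uFFFD")); // prints: true
    }
}
```

If isValidUtf8 returns false for the file's bytes, the file is not UTF-8 text (or not plain text at all), and no output setting will fix the result.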

Make sure you know which encoding your input stream uses, and set the encoding of the InputStreamReader accordingly. If that's Tamil, I would guess it is probably UTF-8. I don't know whether Java supports TACE-16. The code would look something like this:

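// Needs java.io.InputStream, java.io.InputStreamReader, java.io.Reader
// and java.nio.charset.StandardCharsets.
StringBuilder text = new StringBuilder();
try (InputStream encoded = ...) {
  Reader decoded = new InputStreamReader(encoded, StandardCharsets.UTF_8);
  char[] buffer = new char[1024];
  while (true) {
    int n = decoded.read(buffer);
    if (n < 0)
      break;
    // Append only the n characters actually read in this pass.
    text.append(buffer, 0, n);
  }
}
String verse = text.toString();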
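For a complete round trip, the java.nio.file.Files helpers can be used for both writing and reading with an explicit charset. The sketch below is an illustration under assumptions, not the original poster's code: the class name Utf8RoundTrip, its methods and the temporary file are all made up for the example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class for a UTF-8 write/read round trip.
final class Utf8RoundTrip {

    // Writes the text to the file, encoded as UTF-8.
    static void write(Path file, String text) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            out.write(text);
        }
    }

    // Reads the whole file, decoding it as UTF-8.
    static String read(Path file) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = in.read(buffer)) >= 0) {
                text.append(buffer, 0, n); // only the characters just read
            }
        }
        return text.toString();
    }

    // Writes the text to a temporary file and reads it back.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("utf8-demo", ".txt");
            try {
                write(file, text);
                return read(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String verse = "Joh 1:1 \u0B86\u0BA4\u0BBF\u0BAF\u0BBF\u0BB2\u0BC7";
        System.out.println(verse.equals(roundTrip(verse))); // prints: true
    }
}
```

Unlike InputStreamReader, the reader returned by Files.newBufferedReader throws an exception on malformed input rather than inserting replacement characters, so a wrong encoding shows up immediately.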