Java: is there a big difference between UTF-8 and UTF-16?

I called a web service that returned a response XML with UTF-8 encoding. I checked this in Java using the getAllHeaders() method.

In my Java code, I accept the response and do some processing. The result is then passed on to other services.

I Googled and found that, by default, Java encodes strings as UTF-16.

In my response XML, one of the elements contained a character that got mangled in the post-processing requests I make to other services.

Instead of sending the character, it sent gibberish. Now I want to know: are these two encodings really that different? And if I want to convert from UTF-8 to UTF-16, what should I do?

Thank you.

Solution

Both UTF-8 and UTF-16 are variable-length encodings. However, in UTF-8 a character occupies a minimum of 8 bits, while in UTF-16 character length starts at 16 bits.
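To make the size difference concrete, here is a minimal sketch that counts the bytes each encoding needs for a few sample characters. The class name, helper names and sample characters are chosen for illustration only; they do not come from the question.

```java
import java.nio.charset.StandardCharsets;

class EncodedLengthDemo {
    // Number of bytes needed to encode the text in UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed in UTF-16 (big-endian, so no byte order mark is added).
    static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",             // ASCII letter: 1 byte vs 2 bytes
            "\u00C9",        // E with acute accent: 2 bytes vs 2 bytes
            "\u4E2D",        // a CJK ideograph: 3 bytes vs 2 bytes
            "\uD83D\uDE00"   // an emoji outside the 16-bit range: 4 bytes vs 4 bytes
        };
        for (String s : samples) {
            System.out.println(utf8Length(s) + " byte(s) in UTF-8, "
                    + utf16Length(s) + " byte(s) in UTF-16");
        }
    }
}
```

UTF-8 ranges from 1 to 4 bytes per character, while UTF-16 uses either 2 or 4.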

Main UTF-8 advantages:

>Basic ASCII characters, such as digits and unaccented Latin letters, occupy one byte that is identical to their US-ASCII representation. This way all US-ASCII strings are valid UTF-8, which provides decent backward compatibility in many cases.
>There are no null bytes, which allows the use of null-terminated strings. This also provides a great deal of backward compatibility.
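The ASCII compatibility point can be checked directly. The following is a small sketch, with a hypothetical class and method name, that compares the bytes produced by the two encoders.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AsciiCompatibilityDemo {
    // True when the text encodes to exactly the same bytes in US-ASCII and UTF-8.
    static boolean sameBytesInAsciiAndUtf8(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        // Pure ASCII text: both encoders produce identical bytes.
        System.out.println(sameBytesInAsciiAndUtf8("Hello 123"));
        // Non-ASCII text: US-ASCII cannot represent it, so the bytes differ.
        System.out.println(sameBytesInAsciiAndUtf8("\u00C9"));
    }
}
```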

Main UTF-8 disadvantages:

>Many common characters have different lengths, which makes indexing and computing string length considerably slower
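Indexing is slow because the byte offset of the n-th character cannot be computed; the bytes have to be scanned from the start. A rough sketch of such a scan, assuming well-formed UTF-8 input and using a hypothetical helper name:

```java
import java.nio.charset.StandardCharsets;

class Utf8IndexDemo {
    // Returns the byte offset where the code point with the given index starts,
    // or -1 if the text has fewer code points. Assumes well-formed UTF-8.
    static int byteOffsetOfCodePoint(byte[] utf8, int index) {
        int seen = 0;
        for (int i = 0; i < utf8.length; i++) {
            // Continuation bytes look like 10xxxxxx; every other byte starts a character.
            boolean startsCharacter = (utf8[i] & 0xC0) != 0x80;
            if (startsCharacter) {
                if (seen == index) {
                    return i;
                }
                seen++;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Encoded as 61 C3 89 62: the accented letter takes two bytes.
        byte[] bytes = "a\u00C9b".getBytes(StandardCharsets.UTF_8);
        System.out.println(byteOffsetOfCodePoint(bytes, 2)); // prints 3, not 2
    }
}
```

The cost grows with the position being looked up, whereas a fixed-length encoding would allow a direct jump.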

Main UTF-16 advantages:

>Most reasonable characters, such as Latin, Cyrillic, Chinese and Japanese, can be represented in 2 bytes. Unless really exotic characters are needed, this means the 16-bit subset of UTF-16 can be used as a fixed-length encoding, which speeds up indexing
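The fixed-length assumption holds only while every character fits in 16 bits. In Java this shows up as the difference between String.length(), which counts 16-bit code units, and String.codePointCount(), which counts characters. A short sketch with illustrative names and sample values:

```java
class CodePointDemo {
    // Number of 16-bit UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String text) {
        return text.length();
    }

    // Number of Unicode code points (characters as Unicode defines them).
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String accented = "\u00C9";      // fits in one 16-bit unit
        String emoji = "\uD83D\uDE00";   // needs a surrogate pair
        System.out.println(codeUnits(accented) + " unit(s), " + codePoints(accented) + " code point(s)");
        System.out.println(codeUnits(emoji) + " unit(s), " + codePoints(emoji) + " code point(s)");
    }
}
```

For the emoji the two counts disagree (2 units, 1 code point), so charAt(i) is only a reliable character index when no surrogate pairs are present.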

Main UTF-16 disadvantages:

>US-ASCII strings contain many null bytes when encoded, which rules out null-terminated strings and wastes a lot of memory
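The null bytes are easy to see by counting zero-valued bytes in the encoded output. This is only an illustration; the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;

class NullByteDemo {
    // Counts how many bytes in the array are zero.
    static int countZeroBytes(byte[] data) {
        int zeros = 0;
        for (byte b : data) {
            if (b == 0) {
                zeros++;
            }
        }
        return zeros;
    }

    public static void main(String[] args) {
        String text = "Hi";
        // UTF-16BE encodes "Hi" as 00 48 00 69, so half of the bytes are zero.
        System.out.println(countZeroBytes(text.getBytes(StandardCharsets.UTF_16BE)));
        // UTF-8 encodes "Hi" as 48 69, with no zero bytes.
        System.out.println(countZeroBytes(text.getBytes(StandardCharsets.UTF_8)));
    }
}
```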

In general, UTF-16 is usually better suited for in-memory representation, while UTF-8 is very well suited for text files and network protocols.
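As for the conversion itself: in Java there is normally no separate UTF-8 to UTF-16 step. Decoding the response bytes into a String is the conversion, and encoding the String again when calling the next service converts it back. Mangled characters usually mean one of those two steps used a different charset, for example the platform default on Java versions before 18; whether that is what happened in the question is an assumption. A minimal sketch, where the class name, method names and the sample character are illustrative and not taken from the question:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTripDemo {
    // Decodes UTF-8 bytes (for example a response body) into a Java String.
    static String decodeUtf8(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    // Encodes a Java String back to UTF-8 bytes before sending it on.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample payload: the UTF-8 bytes of an E with acute accent.
        byte[] body = {(byte) 0xC3, (byte) 0x89};

        String text = decodeUtf8(body);
        // Decoding the same bytes with the wrong charset yields two wrong characters.
        String mangled = new String(body, StandardCharsets.ISO_8859_1);

        System.out.println(text.length());     // 1 character
        System.out.println(mangled.length());  // 2 characters
        System.out.println(Arrays.equals(body, encodeUtf8(text))); // round trip is lossless
    }
}
```

Naming the charset explicitly at every byte-to-String and String-to-byte boundary avoids depending on the platform default.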
