Converting ANSI characters to UTF-8 in Java
Is there any way to convert ANSI strings to UTF-8 using Java?
I have a custom serializer that uses the readUTF and writeUTF methods of DataInputStream and DataOutputStream to deserialize and serialize strings. If I receive an ANSI-encoded string that is too long, about 100000 characters, I get an error.
However, in my JUnit test I was able to create a string of 120000 'a' characters, and that worked perfectly.
I have checked the following posts, but I still get the error:
> Converting UTF-8 to ISO-8859-1 in Java – how to keep it as single byte
> How do I replace accented Latin characters in Ruby?
Solution
This error is not caused by the character encoding. It means that the length of the UTF data is wrong.
Edit: just realized that this is a write error, not a read error.
The UTF length field is only 2 bytes long, so it can only describe up to 64K (65535) UTF-8 bytes. You're trying to write 100K, so it won't work.
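The 2-byte length prefix can be seen by inspecting the bytes that writeUTF produces for a short string. This is only a small sketch; the class name `UtfPrefixDemo` and the helper `encode` are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class UtfPrefixDemo {

    // Hypothetical helper: returns the raw bytes that writeUTF emits for s.
    public static byte[] encode(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = encode("abc");
        // The first two bytes are an unsigned 16-bit big-endian count of the
        // encoded bytes that follow.
        int prefix = ((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF);
        System.out.println("total bytes = " + bytes.length + ", length prefix = " + prefix);
    }
}
```

For "abc" this reports 5 bytes in total with a prefix of 3: two bytes of length followed by three bytes of data. Since the prefix is an unsigned 16-bit value, it cannot go beyond 65535.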
This limit is hard-coded in writeUTF and there is no way to get around it:
    if (utflen > 65535)
        throw new UTFDataFormatException(
            "encoded string too long: " + utflen + " bytes");
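Since the limit sits inside writeUTF itself, one possible way around it is to avoid writeUTF for long strings and write the bytes yourself with a wider length field. The sketch below is an assumption, not something from the question's serializer: the class `LongStringIO` and the methods `writeLongString` and `readLongString` are hypothetical names, and the 4-byte int prefix is just one reasonable choice. It also uses standard UTF-8 rather than the modified UTF-8 of writeUTF, so data written this way cannot be read back with readUTF; both sides have to use the same pair of helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LongStringIO {

    // Hypothetical helper: 4-byte length prefix followed by standard UTF-8 bytes.
    public static void writeLongString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Hypothetical helper: reads back what writeLongString wrote.
    public static String readLongString(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("negative string length: " + length);
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Build a 100000-character string, similar in size to the one in the question.
        char[] chars = new char[100_000];
        Arrays.fill(chars, 'a');
        String big = new String(chars);

        // writeUTF rejects it because the encoded form exceeds 65535 bytes.
        try {
            new DataOutputStream(new ByteArrayOutputStream()).writeUTF(big);
        } catch (UTFDataFormatException e) {
            System.out.println("writeUTF failed: " + e.getMessage());
        }

        // The int-prefixed variant handles the same string.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        writeLongString(new DataOutputStream(buf), big);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println("round trip ok: " + big.equals(readLongString(in)));
    }
}
```

If the incoming data really is in an ANSI code page, it would need to be decoded into a Java String with the right charset before it reaches this point; once it is a String, the helpers above do not care where it came from.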