How to avoid memory waste when storing UTF-8 characters (8 bits) in Java chars (16 bits)? Two in one?
I'm afraid I have a question about a detail of a rather oversaturated topic. I searched a lot, but I couldn't find a clear answer to this particular, obvious, (IMHO) important question:
When converting byte[] to String using UTF-8, each byte (8 bits) becomes an 8-bit character encoded by UTF-8, but each UTF-8 character is saved as a 16-bit char in Java. Is that right? If so, does this mean that each stupid Java char uses only the first 8 bits and consumes twice as much memory? Is that right too? I wonder how this wasteful behaviour is accepted.
Aren't there some tricks to have a pseudo String that is 8-bit? Would that actually reduce memory consumption? Or is there a way to store >two< 8-bit characters in one Java 16-bit char, to avoid this kind of memory waste? Thanks for any deconfusing answers...
EDIT: Hi, thank you for your answers. I was aware of the variable-length property of UTF-8. However, since my source is bytes, which are 8 bits, I understood (apparently wrongly) that it needs only 8-bit words of UTF-8. Does the UTF-8 conversion actually preserve the strange symbols you see when you run "cat somebinary" on the CLI? I thought UTF-8 was just used to map each possible 8-bit word of a byte to one particular 8-bit word of UTF-8. Wrong? I thought about using Base64, but it's bad because it only uses 7 bits.
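To make the misunderstanding in the edit concrete, here is a minimal sketch (the class and method names are made up for illustration). It shows two things: UTF-8 is not a byte-for-byte mapping, so arbitrary binary data generally does not survive a decode/encode round trip, and real text takes between 1 and 4 bytes per character depending on the character.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTrip {
    /** Returns true if decoding as UTF-8 and re-encoding gives back the same bytes. */
    static boolean survivesUtf8(byte[] data) {
        String decoded = new String(data, StandardCharsets.UTF_8);
        return Arrays.equals(data, decoded.getBytes(StandardCharsets.UTF_8));
    }

    /** Number of bytes UTF-8 needs to encode the given text. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE never occur in well-formed UTF-8, so the decoder
        // substitutes U+FFFD for them and the original bytes are lost.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x41};
        System.out.println(survivesUtf8(binary));      // false

        // Real text: the encoded size depends on the character.
        System.out.println(utf8Length("A"));           // 1
        System.out.println(utf8Length("\u00e9"));      // 2 (e with acute accent)
        System.out.println(utf8Length("\u20ac"));      // 3 (euro sign)
    }
}
```

So the output of "cat somebinary" is not something a UTF-8 String can be relied on to preserve; only bytes that already form valid UTF-8 come back unchanged.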
Reformulated question: is there a smarter way to convert bytes into something String-like? My favourite would have been to convert byte[] to char[], but then I still have 16-bit words.
Additional use case info:
I am adapting Jedis (the Java client for the NoSQL store Redis) as the "primitive storage layer" for HyperGraphDB. So Jedis is a database for another "database". My problem is that I must always feed Jedis with byte[] data, but internally >Redis< (the actual server) only deals with "binary safe" strings. Since Redis is written in C, a char there is 8 bits long, AFAIK, unlike ASCII, which is 7 bits. In Jedis, however, in the Java world, each character is 16 bits long internally. I don't understand the code (yet), but I suppose Jedis then converts these Java 16-bit Strings into Redis-conforming 8-bit strings ([here][3]). It says it extends FilterOutputStream. My hope is to bypass the byte[] <-> String conversion altogether and use that FilterOutputStream...?
Now I wonder: if I have to convert between byte[] and String all the time, with data sizes ranging from very small to potentially very large, isn't passing each 8-bit character around as 16 bits within Java a huge waste of memory?
Solution
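Yes, there is such a trick: make sure you are running an up-to-date version of Java. Since Java 9, compact strings (JEP 254) let the JVM store a String whose characters all fit in Latin-1 using one byte per character, and fall back to two bytes per character only when the content requires it.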
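If a String really is needed for arbitrary bytes, one option is to decode with ISO-8859-1 instead of UTF-8, because that charset maps every byte value to exactly one char and back. The sketch below only demonstrates that the round trip is lossless; the class and method names are hypothetical. Whether the resulting String is actually held at one byte per character is an internal detail of the JVM in use (an assumption about Java 9 or later with default settings) and cannot be observed through the String API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    /** Maps each byte 0x00..0xFF to the char with the same value (U+0000..U+00FF). */
    static String toLatin1String(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    /** Inverse of toLatin1String for strings that contain only chars up to U+00FF. */
    static byte[] fromLatin1String(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Build an array holding every possible byte value once.
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }

        String s = toLatin1String(all);
        System.out.println(s.length());                              // 256
        System.out.println(Arrays.equals(all, fromLatin1String(s))); // true
    }
}
```

When both ends already speak byte[], as the question says Jedis does, keeping the data as byte[] and skipping the String conversion avoids the issue entirely.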