How to substring a UTF8 string in Java?

Suppose I have the following string: Rückruf ins Ausland. I need to insert it into a database column with a maximum size of 10 bytes.

I need to write a function that performs this substring in Java, but because ü takes 2 bytes in UTF-8, the substring returned in this case should be Rückruf i (9 characters). Any suggestions?
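
To make the arithmetic concrete, here is a minimal sketch (the class name ByteLengthDemo and the helper utf8Length are made up for illustration) showing that a plain 10-character substring of this string needs 11 bytes in UTF-8. The ü is written as the escape \u00FC so the example does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

class ByteLengthDemo {

    // Hypothetical helper: number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // \u00FC is the u-umlaut; the naive substring keeps the first 10 chars.
        String naive = "R\u00FCckruf ins Ausland".substring(0, 10);
        System.out.println(naive.length());     // 10 characters
        System.out.println(utf8Length(naive));  // 11 bytes, one more than the column allows
    }
}
```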

Solution

If you want to trim the data in Java, you need a function that trims the string according to the character set the database uses, as in this test case:

package test;

import java.io.UnsupportedEncodingException;

public class TrimField {

    public static void main(String[] args) {
        // UTF-8 is the db charset; 10 is the column size in bytes
        System.out.println(trim("Rückruf ins Ausland",10,"UTF-8"));
        System.out.println(trim("Rüückruf ins Ausland",10,"UTF-8"));
    }

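    // Removes characters from the end of value until its encoding
    // in the given charset fits into numBytes bytes.
    public static String trim(String value,int numBytes,String charset) {
        do {
            byte[] valueInBytes = null;
            try {
                valueInBytes = value.getBytes(charset);
            } catch (UnsupportedEncodingException e) {
                throw new RuntimeException(e.getMessage(),e);
            }
            if (valueInBytes.length > numBytes) {
                // Drop the last char. If it is the second half of a surrogate pair,
                // drop the first half too, since a dangling surrogate cannot be encoded properly.
                int end = value.length() - 1;
                if (end > 0 && Character.isLowSurrogate(value.charAt(end))
                        && Character.isHighSurrogate(value.charAt(end - 1))) {
                    end--;
                }
                value = value.substring(0,end);
            } else {
                return value;
            }
        } while (value.length() > 0);
        return "";
    }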

}
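
The loop above re-encodes the whole string on every pass, which can get slow for long values. As a possible alternative, here is a minimal sketch that encodes only once, using java.nio.charset.CharsetEncoder with an output buffer of exactly the allowed size. The class name EncoderTrim and the method trimToBytes are made up for illustration and are not part of the answer above. The sketch assumes the input is well-formed and encodable in the target charset; if the encoder hits a character it cannot encode, the result is cut off at that character.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class EncoderTrim {

    // Hypothetical helper: longest prefix of value whose encoding fits in maxBytes.
    static String trimToBytes(String value, int maxBytes, Charset charset) {
        if (maxBytes <= 0) {
            return "";
        }
        CharsetEncoder encoder = charset.newEncoder();
        ByteBuffer out = ByteBuffer.allocate(maxBytes);
        CharBuffer in = CharBuffer.wrap(value);
        // Encoding stops when the next character no longer fits into 'out';
        // the encoder does not write part of a character.
        encoder.encode(in, out, true);
        // in.position() is the number of chars that were consumed by the encoder.
        return value.substring(0, in.position());
    }

    public static void main(String[] args) {
        Charset utf8 = StandardCharsets.UTF_8;
        // \u00FC is the u-umlaut (2 bytes in UTF-8).
        String a = trimToBytes("R\u00FCckruf ins Ausland", 10, utf8);
        String b = trimToBytes("R\u00FC\u00FCckruf ins Ausland", 10, utf8);
        System.out.println(a.length()); // 9 chars:  "R\u00FCckruf i"
        System.out.println(b.length()); // 8 chars:  "R\u00FC\u00FCckruf"
    }
}
```

Because the encoder works on whole characters, a surrogate pair (4 bytes in UTF-8) is either kept completely or dropped completely.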