Java: string length when displaying a square root using a Unicode overline?

In Java, I build a string from Unicode characters and combining overlines because I want to display the square root of a number. I need to know the string's length to sort out some formatting problems. When combining characters are used, the usual way of finding the string length seems to fail, as the following example shows. Can anyone help me find the length of the second string for arbitrary numbers under the square root, or suggest a better way to display a square root?

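String s = "\u221A"+"12";                         // U+221A SQUARE ROOT followed by "12"
String t = "\u221A"+"1"+"\u0305"+"2"+"\u0305";    // each digit followed by U+0305 COMBINING OVERLINE
System.out.println(s);
System.out.println(t);
System.out.println(s.length());                   // prints 3
System.out.println(t.length());                   // prints 5: each overline counts as its own char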

Thank you for your help. I couldn't find anything relevant on Google.

Solution

The usual methods do not fail: the string length they report is the number of Unicode characters [*]. If you need some other behavior, you have to define clearly what you mean by "string length".

When you care about the length of a string for display purposes, you are usually interested in counting pixels (or some other logical/physical unit), and that is the responsibility of the display layer (for a start, different characters can have different widths if the font is not monospaced).

However, if you just want to count the number of graphemes ("a minimally distinctive unit of writing in the context of a particular writing system"), here is a good guide, with code and examples. Copying, trimming, and pasting the relevant code from there, we get something like this:

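import java.text.BreakIterator;

// Counts graphemes (user-perceived characters) using the default locale.
public static int getGraphemeCount(String text) {
    int graphemeCount = 0;
    BreakIterator graphemeCounter = BreakIterator.getCharacterInstance();
    graphemeCounter.setText(text);
    // Each call to next() advances to the following grapheme boundary.
    while (graphemeCounter.next() != BreakIterator.DONE) {
        graphemeCount++;
    }
    return graphemeCount;
}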

Remember: the above uses the default locale. A more flexible and robust method would, for example, take an explicit locale as an argument and call BreakIterator.getCharacterInstance(locale) instead.
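As a rough sketch of that locale-taking variant, applied to the two strings from the question: the class name GraphemeLength and the method name graphemeCount are made up for illustration, and Locale.ROOT is just one possible choice of locale, not something the original answer prescribes.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper class; the names here are illustrative only.
final class GraphemeLength {

    // Counts graphemes using the character-boundary rules for the given locale.
    static int graphemeCount(String text, Locale locale) {
        BreakIterator boundaries = BreakIterator.getCharacterInstance(locale);
        boundaries.setText(text);
        int count = 0;
        // next() returns DONE once the end of the text has been passed.
        while (boundaries.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "\u221A" + "12";
        String t = "\u221A" + "1" + "\u0305" + "2" + "\u0305";
        // Assumption: Locale.ROOT is used to stay independent of the machine's default locale.
        System.out.println(s.length() + " chars, " + graphemeCount(s, Locale.ROOT) + " graphemes");
        System.out.println(t.length() + " chars, " + graphemeCount(t, Locale.ROOT) + " graphemes");
    }
}
```

With this, both strings should come out as 3 graphemes, even though length() reports 3 for the first and 5 for the second, because each combining overline is grouped with the digit it follows.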

[*] To be precise, as pointed out in the comments, String.length() counts Java chars, which are really code units in the UTF-16 encoding. This is equivalent to counting Unicode characters only as long as we stay within the BMP.
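To make the footnote concrete, here is a small sketch comparing code units with code points. The class and method names are hypothetical, and U+1D54F is picked only as an arbitrary example of a code point outside the BMP.

```java
// Hypothetical demo class; the names here are illustrative only.
final class CodeUnitsVsCodePoints {

    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String text) {
        return text.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String insideBmp = "\u221A";                                  // U+221A, a single char
        String outsideBmp = new String(Character.toChars(0x1D54F));  // stored as a surrogate pair
        System.out.println(codeUnits(insideBmp) + " / " + codePoints(insideBmp));
        System.out.println(codeUnits(outsideBmp) + " / " + codePoints(outsideBmp));
    }
}
```

Note that counting code points does not help with the question's second string: a combining overline is a code point of its own inside the BMP, so both counts are 5 there. Only grapheme counting groups each overline with its digit.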
