Java CollationKey sorting error
I'm having trouble comparing strings. I want to compare two French strings such as "éd" and "ef":
Collator localeSpecificCollator = Collator.getInstance(Locale.FRANCE);
CollationKey a = localeSpecificCollator.getCollationKey("éd");
CollationKey b = localeSpecificCollator.getCollationKey("ef");
System.out.println(a.compareTo(b));
This prints -1, but e comes before é in the French alphabet. And when we compare only é and e:
Collator localeSpecificCollator = Collator.getInstance(Locale.FRANCE);
CollationKey a = localeSpecificCollator.getCollationKey("é");
CollationKey b = localeSpecificCollator.getCollationKey("e");
System.out.println(a.compareTo(b));
The result is 1. Can you tell me what's wrong with the first snippet?
Solution
This seems to be the expected behavior, and it also seems to be the correct way to sort alphabetically in French.
The Android Javadoc gives a hint as to why it behaves this way. I think the implementation details on Android are similar to the standard JDK, if not identical.
In other words, because your two strings can already be ordered by their primary differences (the base letters, ignoring accents), the collator never gets as far as checking the accent difference. Here "d" sorts before "f", so "éd" sorts before "ef" regardless of the accent. When you compare only "é" and "e", the base letters are equal, so the accent decides and "é" sorts after "e".
This appears to conform to the Unicode Collation Algorithm (UCA).
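To make the levels visible, here is a minimal sketch that repeats both comparisons and then lowers the collator strength to PRIMARY. The class name FrenchCollationDemo and the helper compare are made up for illustration; Collator.setStrength, Collator.PRIMARY and Collator.TERTIARY are standard java.text API. The é is written as the escape \u00e9 only to avoid source-encoding problems.

```java
import java.text.Collator;
import java.util.Locale;

public class FrenchCollationDemo {
    // Hypothetical helper: compare two strings with a French collator
    // at the given strength (Collator.PRIMARY, SECONDARY or TERTIARY).
    static int compare(String a, String b, int strength) {
        Collator collator = Collator.getInstance(Locale.FRANCE);
        collator.setStrength(strength);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // the letter é

        // Base letters differ ("d" vs "f"), so the accent is never consulted:
        // the result is negative.
        System.out.println(compare(eAcute + "d", "ef", Collator.TERTIARY));

        // Base letters are equal, so the accent breaks the tie:
        // the result is positive.
        System.out.println(compare(eAcute, "e", Collator.TERTIARY));

        // At PRIMARY strength accents are ignored entirely: the result is 0.
        System.out.println(compare(eAcute, "e", Collator.PRIMARY));
    }
}
```

The same signs should come out of CollationKey.compareTo, since collation keys are meant to order strings the same way as the collator that produced them.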
According to the Wikipedia article on "ordre alphabétique", this also seems to be the correct way to sort alphabetically in French:
In English: the ordering initially ignores accents and case. Only if two words cannot be ordered that way are accents and case taken into account.
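As a rough illustration of that rule, the sketch below sorts a small word list with the French collator. The class name FrenchSortDemo, the helper sortFrench and the word list are assumptions chosen for the example, not part of the question. If the rule holds, the accented words should interleave with the unaccented ones (e, é, ed, éd, ef) rather than being grouped after them.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FrenchSortDemo {
    // Hypothetical helper: return a copy of the list sorted with the
    // French collator. Collator implements Comparator<Object>, so it can
    // be passed directly to List.sort.
    static List<String> sortFrench(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(Collator.getInstance(Locale.FRANCE));
        return sorted;
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter é, escaped to avoid source-encoding issues.
        List<String> words = Arrays.asList("ef", "\u00e9d", "ed", "\u00e9", "e");
        System.out.println(sortFrench(words));
    }
}
```

A plain Collections.sort on the same list would compare code points instead, which places every word starting with é after "ef".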