Java – HTML ASCII case insensitive ICU collator

I need to create a Collator that corresponds to https://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive/, that is, one that ignores the case difference between the ASCII characters A-Z and a-z during comparison.

I tried this with the following ICU4J RuleBasedCollator:

final RuleBasedCollator collator =
        new RuleBasedCollator("&a=A,b=B,c=C,d=D,e=E,f=F,g=G,h=H,"
                + "i=I,j=J,k=K,l=L,m=M,n=N,o=O,p=P,q=Q,r=R,s=S,t=T,"
                + "u=U,v=V,u=U,w=W,x=X,y=Y,z=Z").freeze();

However, the following search fails, whereas I expected it to succeed (i.e. return true):

final SearchIterator searchIterator = new StringSearch(
        "pu",new StringCharacterIterator("iNPut"),collator);
return searchIterator.first() >= 0;

What is missing from my rules?

Solution

> This W3C "collation" doesn't look like a collator in the usual sense. It is an ASCII case-insensitive matcher that defines no sort order. I suspect that it is usually implemented in low-level code. It matches ASCII letters case-insensitively, while all other characters must match exactly. See https://www.w3.org/TR/xpath-functions-31/#html-ascii-case-insensitive-collation

> http://userguide.icu-project.org/collation/customization
> http://demo.icu-project.org/icu-bin/collation.html
> http://www.unicode.org/reports/tr35/tr35-collation.html#Rules
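
The matching behavior described in the answer (ASCII letters compared case-insensitively, every other character compared exactly) can be sketched in plain Java without ICU. This is only a minimal illustration, not the answerer's code and not an ICU API: the class name AsciiCaseInsensitiveMatcher and its methods foldAscii, equalsIgnoreAsciiCase and indexOf are hypothetical names chosen for this example, and the search is a naive scan over UTF-16 code units.

```java
// Hypothetical helper, not part of ICU4J or the JDK.
final class AsciiCaseInsensitiveMatcher {

    private AsciiCaseInsensitiveMatcher() {}

    // Folds only ASCII 'A'..'Z' to lower case; every other char is returned unchanged.
    static char foldAscii(char c) {
        return (c >= 'A' && c <= 'Z') ? (char) (c + ('a' - 'A')) : c;
    }

    // True if both sequences have the same length and are equal after ASCII-only folding.
    static boolean equalsIgnoreAsciiCase(CharSequence a, CharSequence b) {
        if (a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            if (foldAscii(a.charAt(i)) != foldAscii(b.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Returns the index of the first match of pattern in text, or -1 if there is none.
    static int indexOf(CharSequence text, CharSequence pattern) {
        int n = text.length();
        int m = pattern.length();
        for (int start = 0; start + m <= n; start++) {
            int i = 0;
            while (i < m && foldAscii(text.charAt(start + i)) == foldAscii(pattern.charAt(i))) {
                i++;
            }
            if (i == m) {
                return start;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The search from the question: "pu" inside "iNPut".
        System.out.println(indexOf("iNPut", "pu") >= 0);
        // Non-ASCII characters must match exactly: KELVIN SIGN (U+212A) is not 'k' here.
        System.out.println(equalsIgnoreAsciiCase("\u212A", "k"));
    }
}
```

With this sketch, indexOf("iNPut", "pu") finds the match at index 2, so the search from the question succeeds. Unlike String.equalsIgnoreCase, only the 26 ASCII letters are folded, which is why the Kelvin sign does not match "k".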
