Java – regular expressions are slow. How can I quickly check whether a string contains only word characters?

I have a function that checks whether a string contains only word characters (most of the strings consist of a single CJK character). It is called many times, so the cost of the regular expression is unacceptable, but I don't know how to optimize it. Any suggestions?

/* \w is equivalent to the character class [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}].
 For more details see Unicode TR-18, and bear in mind that the set of characters
 in each class can vary between Unicode releases. */
private static final Pattern sOnlyWordChars = Pattern.compile("\\w+");

private boolean isOnlyWordChars(String s) {
    return sOnlyWordChars.matcher(s).matches();
}

When s is "3G", "go_url", or "hao123", isOnlyWordChars(s) should return true.
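
One caveat about the comment in the code above: on the standard JDK, \w matches only [a-zA-Z_0-9] unless the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS, so a single CJK character would not match "\\w+" there. The Unicode definition quoted in the comment appears to describe a Unicode-aware regex engine; which runtime the question targets is an assumption. A minimal sketch of the difference (the class and method names are hypothetical):

```java
import java.util.regex.Pattern;

public class WordCharRegexDemo {
    // Default JDK behavior: \w is ASCII-only ([a-zA-Z_0-9]).
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");
    // With this flag, \w follows the Unicode definition of a word character.
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    public static boolean matchesAscii(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    public static boolean matchesUnicode(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Print how each pattern treats the inputs from the question plus one CJK character.
        for (String s : new String[] {"3G", "go_url", "hao123", "中"}) {
            System.out.println(s + " ascii=" + matchesAscii(s)
                    + " unicode=" + matchesUnicode(s));
        }
    }
}
```

Both patterns are compiled once and reused, as in the question, so the flag changes what matches without adding per-call compilation cost.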

Solution

// Note: Character.isLetter accepts letters only, so digits and '_' make this return false.
private boolean isOnlyWordChars(String s) {
    char[] chars = s.toCharArray();
    for (char c : chars) {
        if (!Character.isLetter(c)) {
            return false;
        }
    }
    return true;
}

Better implementation

public static boolean isAlpha(String str) {
    if (str == null) {
        return false;
    }
    int sz = str.length();
    for (int i = 0; i < sz; i++) {
        if (Character.isLetter(str.charAt(i)) == false) {
            return false;
        }
    }
    return true;
}

Or, if you are using Apache Commons, use StringUtils.isAlpha(). The second implementation in this answer actually comes from the source code of isAlpha.
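
Note that Character.isLetter rejects digits and underscores, so both loops above return false for "3G", "go_url", and "hao123", which the question expects to be accepted. A minimal sketch of a loop that also accepts digits and '_' follows. The class and method names are hypothetical, and treating "letter, digit, or underscore" as the meaning of \w is an assumption that only approximates the regex:

```java
public class WordChars {
    // Hypothetical helper: true if s is non-empty and every code point is a
    // letter, a digit, or '_' (an approximation of the question's \w+).
    public static boolean isWordCharsOnly(String s) {
        if (s == null || s.isEmpty()) {
            return false; // "\\w+" needs at least one character
        }
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp != '_' && !Character.isLetterOrDigit(cp)) {
                return false;
            }
            i += Character.charCount(cp); // step over surrogate pairs
        }
        return true;
    }

    public static void main(String[] args) {
        // Inputs from the question, one CJK character, and one string with a space.
        for (String s : new String[] {"3G", "go_url", "hao123", "中", "a b"}) {
            System.out.println(s + " -> " + isWordCharsOnly(s));
        }
    }
}
```

Iterating by code point rather than by char keeps CJK characters outside the Basic Multilingual Plane from being misread as two separate surrogates.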

UPDATE

Sorry for the late reply. I wasn't sure about the speed, although I have read in several places that loops are faster than regular expressions. To make sure, I ran the following code on Ideone; the results are as follows:

5000000 iterations

Your code: 4.99 seconds (run-time error after that, so it doesn't work for big data)

My first code: 2.71 seconds

My second code: 2.52 seconds

500000 iterations

Your code: 1.07 seconds

My first code: 0.36 seconds

My second code: 0.33 seconds

Here is the sample code I used:

Note: there may be minor errors. You can use it to test different scenarios. Regarding Jan's comments, I think things like using private or public are minor; the condition check is a good idea.
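
The benchmark listing itself is not reproduced here. As a rough stand-in, a timing harness along these lines could be used to compare the three approaches. This is a hypothetical sketch, not the code that produced the numbers above: the class name, method names, and test input are all assumptions, and a naive System.nanoTime loop without JIT warm-up gives only indicative figures (a harness such as JMH would be more reliable).

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

public class WordCheckBenchmark {
    private static final Pattern ONLY_WORD_CHARS = Pattern.compile("\\w+");

    // The regex approach from the question.
    public static boolean viaRegex(String s) {
        return ONLY_WORD_CHARS.matcher(s).matches();
    }

    // First answer: copy to a char[] and loop.
    public static boolean viaCharArray(String s) {
        for (char c : s.toCharArray()) {
            if (!Character.isLetter(c)) {
                return false;
            }
        }
        return true;
    }

    // Second answer: index with charAt, no array copy.
    public static boolean viaCharAt(String s) {
        for (int i = 0, n = s.length(); i < n; i++) {
            if (!Character.isLetter(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Runs `check` on `input` `iterations` times and returns elapsed milliseconds.
    public static long time(Predicate<String> check, String input, int iterations) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            if (check.test(input)) {
                hits++;
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        if (hits < 0) {
            throw new IllegalStateException(); // uses `hits` so the loop result is not dead
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        String input = "hao";      // assumed sample input
        int iterations = 500_000;  // smaller of the two counts quoted above
        System.out.println("regex:     " + time(WordCheckBenchmark::viaRegex, input, iterations) + " ms");
        System.out.println("char[]:    " + time(WordCheckBenchmark::viaCharArray, input, iterations) + " ms");
        System.out.println("charAt:    " + time(WordCheckBenchmark::viaCharAt, input, iterations) + " ms");
    }
}
```

Absolute timings will vary by machine and JVM, so only the relative ordering of the three results is worth comparing.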
