Java – efficiency of regular-expression-based replacement

2020-08-02 • Java

Which of the following is more efficient and useful?

value.replaceAll("['‘’`]","")

value.replaceAll("['‘’`]+","")

My guess is that for strings with no characters to replace, or at least no runs of them, the two perform the same, or the first is better because it is less complex.

But what if the string does contain runs of characters to replace? Would the second be better?

"abababababababab".replaceAll("ab","")

vs.

"abababababababab".replaceAll("(ab)+","")

In case it matters for this problem, I am using Java.

Solution

Based on the analysis below, I would say that the first option is faster than the second. That said, the difference is not easy to measure unless you have a huge string as input (or a complex regular expression).

Let's call this regex1:

"abababababababab".replaceAll("ab","")

And this regex2:

"abababababababab".replaceAll("(ab)+","")

We know from the Java API that replaceAll treats both patterns as regular expressions and runs them through the regular expression engine before replacing the matches.
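To make the point about replaceAll concrete, here is a minimal sketch contrasting it with String.replace, which takes literal text. The class and method names are made up for illustration.

```java
public class ReplaceAllIsRegex {
    // replaceAll interprets its first argument as a regular expression
    static String viaReplaceAll(String input, String pattern) {
        return input.replaceAll(pattern, "");
    }

    // replace treats its first argument as literal text
    static String viaReplace(String input, String literal) {
        return input.replace(literal, "");
    }

    public static void main(String[] args) {
        // "." is a regex metacharacter here and matches every character, so nothing is left
        System.out.println(viaReplaceAll("a.b", "."));
        // only the literal dot is removed, leaving "ab"
        System.out.println(viaReplace("a.b", "."));
    }
}
```

So even a plain character sequence such as "ab" goes through the regex engine when passed to replaceAll.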

We can see that regex1 consists only of a character sequence, while regex2 has a group, a character sequence and a quantifier metacharacter that must be interpreted accordingly (more information here). Therefore, regex2 needs more processing than regex1.
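One way to see what each pattern does on the example input is to count how many separate matches the engine finds. The sketch below uses a hypothetical helper, countMatches. It only shows the number of matches and does not by itself say which pattern is faster, since each match of regex2 involves the group and quantifier handling described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchCount {
    // Counts how many non-overlapping matches of regex occur in input
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "abababababababab";
        System.out.println(countMatches("ab", input));    // one match per "ab": 8
        System.out.println(countMatches("(ab)+", input)); // the whole run is a single match: 1
    }
}
```

Both patterns leave an empty string after replacement on this input; they differ only in how the work is split into matches.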

In general, both options are very fast for most purposes. For a more detailed view of the process, see the article "Regular Expression Matching Can Be Simple And Fast".

Nevertheless, for more complex regular expressions, using Pattern and Matcher directly is the faster option... (more information here)

In addition, another read I recommend in this scenario is: Optimizing regular expressions in Java
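As a rough illustration of the Pattern and Matcher approach, the sketch below compiles the quote-stripping pattern from the question once and reuses it. The String.replaceAll documentation describes the call as equivalent to Pattern.compile(regex).matcher(str).replaceAll(repl), so the pattern is compiled on every call; a precompiled Pattern avoids that repeated step. The class name QuoteStripper and the method stripQuotes are assumptions made for this example, and how much time this saves depends on the workload.

```java
import java.util.regex.Pattern;

public class QuoteStripper {
    // Compiled once and reused across calls.
    // U+2018 and U+2019 are the curly single quotes from the question, written as escapes
    // so the source does not depend on the file encoding.
    private static final Pattern QUOTES = Pattern.compile("['\u2018\u2019`]+");

    // Removes straight quotes, curly single quotes and backticks from value
    static String stripQuotes(String value) {
        return QUOTES.matcher(value).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripQuotes("it's a `test`")); // its a test
    }
}
```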
