Java – regular expressions and escaped and unescaped separators

Problem

I have a string

a\;b\\;c;d

As a Java string literal, it looks like this:

String s = "a\\;b\\\\;c;d";
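
As a quick sanity check, the following minimal sketch prints the literal to show the raw text the program actually sees: each `\\` in the source becomes a single backslash at run time. The class name `LiteralCheck` is only illustrative and is not part of the original question.

```java
// Illustrative class name; not part of the original question.
final class LiteralCheck {
    // The Java literal from the question: every "\\" is one backslash at run time.
    static final String S = "a\\;b\\\\;c;d";

    public static void main(String[] args) {
        System.out.println(S);          // prints a\;b\\;c;d
        System.out.println(S.length()); // prints 10
    }
}
```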

I need to split it on semicolons according to the following rules:

> If a semicolon is preceded by a backslash, it is escaped and should not be treated as a separator (as between a and b).
> If the backslash is itself escaped, it no longer escapes the semicolon, so the semicolon should be treated as a separator (as between b and c).

In other words, a semicolon should be treated as a separator when it is preceded by zero or any other even number of backslashes.
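
The even-count rule can also be written without a regular expression. The sketch below is one possible hand-written scanner, not the approach from the answer: it tracks the length of the backslash run immediately before each character and splits only when that length is even. The names `EscapedSplitter` and `splitOnUnescaped` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the original post.
final class EscapedSplitter {
    // Splits on ';' unless it is preceded by an odd number of backslashes.
    // Escape sequences are kept as-is in the returned fields.
    static List<String> splitOnUnescaped(String input) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int backslashes = 0; // length of the backslash run just before position i
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == ';' && backslashes % 2 == 0) {
                // Zero or an even number of backslashes: this is a separator.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(ch);
            }
            backslashes = (ch == '\\') ? backslashes + 1 : 0;
        }
        fields.add(current.toString()); // last field (may be empty)
        return fields;
    }

    public static void main(String[] args) {
        for (String field : splitOnUnescaped("a\\;b\\\\;c;d")) {
            System.out.println(field);
        }
    }
}
```

Unlike a match-based approach, this scanner keeps empty fields, so `a;;b` yields three fields with an empty one in the middle.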

So for the example above, I want to end up with the following strings (shown as raw text; in Java source each backslash would be doubled):

a\;b\\
c
d

Solution

You can use the regular expression

(?:\\.|[^;\\]++)*

to match all the text between unescaped semicolons:

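import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString is the input to split, e.g. "a\\;b\\\\;c;d"
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile("(?:\\\\.|[^;\\\\]++)*");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    // The pattern also matches the empty string at each separator; skip those (this drops empty fields too).
    if (!regexMatcher.group().isEmpty()) {
        matchList.add(regexMatcher.group());
    }
}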

Explanation:

(?:        # Match either...
 \\.       # any escaped character
|          # or...
 [^;\\]++  # any character(s) except semicolon or backslash; possessive match
)*         # Repeat any number of times.
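
Java can compile this commented layout directly: with the `Pattern.COMMENTS` flag, whitespace in the pattern is ignored and `#` starts a comment that runs to the end of the line. The following is a minimal sketch of that idea. The class name `CommentedPattern` and the method `fields` are illustrative, and skipping empty matches is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative class name; the pattern itself is the one explained above.
final class CommentedPattern {
    // Pattern.COMMENTS ignores whitespace and treats '#' to end of line as a comment.
    static final Pattern FIELD = Pattern.compile(
          "(?:          # Match either...\n"
        + " \\\\.       # any escaped character\n"
        + "|            # or...\n"
        + " [^;\\\\]++  # any run of characters except semicolon or backslash\n"
        + ")*           # Repeat any number of times.\n",
        Pattern.COMMENTS);

    // Collects the non-empty matches. The pattern also matches the empty string
    // at each separator and at the end of the input; those matches are skipped.
    static List<String> fields(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = FIELD.matcher(input);
        while (m.find()) {
            if (!m.group().isEmpty()) {
                result.add(m.group());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fields("a\\;b\\\\;c;d")); // prints [a\;b\\, c, d]
    }
}
```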

Because the quantifiers are nested (a repeated character class inside a repeated group), the possessive quantifier (++) is very important: it prevents catastrophic backtracking.
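
To illustrate what "possessive" means, the sketch below compares a greedy and a possessive quantifier on a deliberately simple pattern (not the one from the answer). The greedy `+` gives characters back so that the trailing `c` can match, while the possessive `++` never gives anything back. In the answer's pattern, that same behavior means a run of ordinary characters is consumed once and is never re-divided among iterations of the outer `*`. The class and method names are illustrative.

```java
import java.util.regex.Pattern;

// Illustrative class name; the patterns are simplified for demonstration.
final class PossessiveDemo {
    // Greedy: "[^;\\]+" first takes "abc", then backtracks one character so "c" can match.
    static boolean greedyMatches(String s) {
        return Pattern.matches("[^;\\\\]+c", s);
    }

    // Possessive: "[^;\\]++" takes "abc" and refuses to give any of it back.
    static boolean possessiveMatches(String s) {
        return Pattern.matches("[^;\\\\]++c", s);
    }

    public static void main(String[] args) {
        System.out.println(greedyMatches("abc"));     // prints true
        System.out.println(possessiveMatches("abc")); // prints false
    }
}
```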
