Java – a regular expression used to split a German address into its parts

Good evening,

I am trying to split a German address string into its parts using Java. Does anyone know a regular expression or a library that does this? The split should look like this:

Namederstraße 25a 88489 Teststadt to Namederstraße | 25a | 88489 | Teststadt

or

Teststr. 3 88489 Beispielort (Großer Kreis) to Teststr. | 3 | 88489 | Beispielort (Großer Kreis)

It would be perfect if the system / regular expression still worked when parts such as the zip code or the city are missing.

Is there a regular expression or library with which I can achieve this?

Edit: German address rules: Street: characters, numbers and spaces. House number: a number followed by any characters (or spaces), up to a series of digits (the zip code), at least in these examples. Zip code: 5 digits. Place or city: the rest, which may also contain spaces, commas or parentheses.

Solution

I ran into a similar problem and slightly adjusted the solution provided here. The result works as well, and (IMO) it is a little easier to understand and extend:

/^([a-zäöüß\s\d.,-]+?)\s*([\d\s]+(?:\s?[-|+/]\s?\d+)?\s*[a-z]?)?\s*(\d{5})\s*(.+)?$/i

Here are some example matches

It can also handle a missing street number, and it can easily be extended by adding special characters to the character classes:

[a-zäöüß\s\d,.-]+?                         # Street name (lazy)
([\d\s]+(?:\s?[-|+/]\s?\d+)?\s*[a-z]?)?    # Street number (optional)

After that, there must be a zip code. It is mandatory because it is the only constant part. Everything after the zip code is treated as the city name.
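Since the question asks about Java, here is a minimal sketch of how the pattern above might be used with java.util.regex. The class name GermanAddressSplitter and the method split are hypothetical names chosen for illustration. The sketch makes two assumptions that are not part of the original answer: it adds Pattern.UNICODE_CASE so that uppercase umlauts also match (in Java, CASE_INSENSITIVE alone only covers ASCII letters), and it trims each group, because the street number group can capture the space before the zip code. The umlauts and ß are written as Unicode escapes only to keep the source independent of file encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class GermanAddressSplitter {

    // The pattern from the answer, written as a Java string literal.
    // \u00e4 \u00f6 \u00fc \u00df are the three lowercase umlauts and the sharp s.
    private static final Pattern ADDRESS = Pattern.compile(
            "^([a-z\u00e4\u00f6\u00fc\u00df\\s\\d.,-]+?)\\s*"           // 1: street name (lazy)
          + "([\\d\\s]+(?:\\s?[-|+/]\\s?\\d+)?\\s*[a-z]?)?\\s*"         // 2: street number (optional)
          + "(\\d{5})\\s*"                                              // 3: zip code (mandatory)
          + "(.+)?$",                                                   // 4: city (optional)
            // UNICODE_CASE is an addition here so that uppercase umlauts match too.
            Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    /**
     * Returns {street, number, zip, city}, or null if the input has no 5-digit zip code.
     * Optional parts that are missing come back as empty strings.
     */
    static String[] split(String address) {
        Matcher m = ADDRESS.matcher(address.trim());
        if (!m.matches()) {
            return null;
        }
        String[] parts = new String[4];
        for (int i = 0; i < 4; i++) {
            String group = m.group(i + 1);
            // Group 2 may include trailing whitespace, so every group is trimmed.
            parts[i] = (group == null) ? "" : group.trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] samples = {
            "Namederstra\u00dfe 25a 88489 Teststadt",
            "Teststr. 3 88489 Beispielort (Gro\u00dfer Kreis)",
            "Teststr. 88489 Beispielort"
        };
        for (String sample : samples) {
            String[] parts = split(sample);
            System.out.println(parts == null ? "no match" : String.join(" | ", parts));
        }
    }
}
```

An input without a 5-digit zip code does not match at all, so split returns null in that case. If addresses without a zip code have to be supported, the pattern itself would need to change.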
