Java Chinese word segmentation: forward maximum matching example code

Preface

This article covers the dictionary-based forward maximum matching algorithm (the longest word is matched first). The implementation sets the maximum match length automatically from the dictionary file. The quality of the segmentation depends entirely on the dictionary.
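To make the "maximum length from the dictionary" idea concrete, here is a minimal sketch of one way it could be done. The class name `WordDictionary`, the method `fromLines`, and the one-word-per-line layout are assumptions for illustration, not the article's original code. The maximum length is simply the length of the longest word seen while loading.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: holds the dictionary words and the length of the longest one.
class WordDictionary {
    final Set<String> words = new HashSet<>();
    int maxLength = 0; // length (in chars) of the longest word loaded so far

    // Builds the dictionary from lines of text, assuming one word per line.
    static WordDictionary fromLines(Iterable<String> lines) {
        WordDictionary dict = new WordDictionary();
        for (String line : lines) {
            String word = line.trim();
            if (word.isEmpty()) {
                continue; // skip blank lines
            }
            dict.words.add(word);
            dict.maxLength = Math.max(dict.maxLength, word.length());
        }
        return dict;
    }

    public static void main(String[] args) {
        // An in-memory list stands in for the dictionary file to keep the sketch self-contained.
        WordDictionary dict = fromLines(
                Arrays.asList("北京", "北京大学", "大学生", "生前", "应聘"));
        // Prints: 5 words, max length 4
        System.out.println(dict.words.size() + " words, max length " + dict.maxLength);
    }
}
```

For a real dictionary file, the lines could come from `Files.readAllLines(path, StandardCharsets.UTF_8)`. Adding a longer word to the file raises `maxLength` on the next load, with no change to the code.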

Forward maximum matching works as follows: cut a substring of bounded length from the front of the text and look it up in the dictionary. If it matches, output it as a word and start the next round from the character that follows it. Otherwise, remove one character from the end of the substring and try again. This is repeated until the whole string has been processed.
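The steps above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the article's original listing: the class name `ForwardMaxMatch`, the method `segment`, the sample sentence, and the small dictionary are all made up for the example. It also assumes that a character with no dictionary match is output on its own, which the description above does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ForwardMaxMatch {
    // Splits text by always taking the longest dictionary match at the current position.
    static List<String> segment(String text, Set<String> dict, int maxLength) {
        List<String> result = new ArrayList<>();
        int window = Math.max(1, maxLength); // guard against an empty dictionary
        int start = 0;
        while (start < text.length()) {
            // Cut a candidate of at most `window` characters.
            int end = Math.min(start + window, text.length());
            // Drop one character from the end until the candidate is a dictionary
            // word or only a single character is left.
            while (end > start + 1 && !dict.contains(text.substring(start, end))) {
                end--;
            }
            // Assumption: an unmatched single character is emitted as-is.
            result.add(text.substring(start, end));
            start = end;
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical dictionary and input, chosen for illustration only.
        Set<String> dict = new HashSet<>(
                Arrays.asList("北京", "北京大学", "大学生", "生前", "前来", "应聘"));
        // Prints: 北京大学/生前/来/应聘
        System.out.println(String.join("/", segment("北京大学生前来应聘", dict, 4)));
    }
}
```

The example also shows the method's known weakness: because the longest match at the front always wins, 北京大学 is taken first, so 大学生 and 前来 are never considered even though they are in the dictionary. Note that `substring` counts UTF-16 code units, which is fine for common Chinese characters but would need code-point handling for characters outside the Basic Multilingual Plane.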

The algorithm flow chart is as follows:

This article focuses on a simple implementation of the algorithm for Chinese word segmentation. Without further ado, let's start with the code.

Sample code

Running the code gives the following result: Peking University / before death / come / apply for a job/

The core of the algorithm is to scan the text from front to back and, at each position, take the longest word that appears in the dictionary.

Summary

That is all for this article. I hope it is a useful reference for your study or work. If you have any questions, feel free to leave a message. Thank you for your support for programming tips.
