Java – how to determine whether a string is an English sentence or code?
Consider the following two strings. The first is code and the second is an English sentence (a phrase, to be precise). How can I detect that the first is code and the second is not?
1. for (int i = 0; i < b.size(); i++) {
2. do something in English (not necessary to be a sentence).
I am considering counting special characters (e.g. "=", ";", etc.) and checking whether the count reaches some threshold. Is there a better way to do this? Are there any Java libraries for it?
Note that the code may not be parsable, because it is not necessarily a complete method, statement, or expression.
My assumption is that English sentences are very regular. They are likely to contain only characters such as ",", "_", "(", ")" and so on. They do not contain something like this: write("a lot of text as a whole");
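For reference, the threshold idea described above could be sketched roughly as follows. This is only an illustration: the class name SpecialCharHeuristic, the set of "code-like" characters, and the 10% threshold are all assumptions, not values taken from any library, and they would need tuning on real data.

```java
/** Hypothetical helper illustrating the special-character threshold heuristic. */
final class SpecialCharHeuristic {
    // Characters treated as code-like. This set is an assumption; adjust as needed.
    private static final String CODE_CHARS = "=;{}<>[]+*&|";

    // Minimum fraction of non-whitespace characters that must be code-like. Assumed value.
    private static final double THRESHOLD = 0.10;

    /** Returns the fraction of non-whitespace characters that are in CODE_CHARS. */
    static double specialCharRatio(String s) {
        int special = 0;
        int total = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                continue; // whitespace says nothing about code vs. English
            }
            total++;
            if (CODE_CHARS.indexOf(c) >= 0) {
                special++;
            }
        }
        return total == 0 ? 0.0 : (double) special / total;
    }

    /** Guesses "code" when the ratio reaches the assumed threshold. */
    static boolean looksLikeCode(String s) {
        return specialCharRatio(s) >= THRESHOLD;
    }
}
```

With these assumed settings, looksLikeCode returns true for the for-loop line and false for the English phrase. A call such as write("a lot of text as a whole"); would probably slip through, though, since most of its characters are ordinary letters.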
Solution
The basic idea is to convert the string into a sequence of tokens (tags). For example, the code line above may become "key, separator, ID, assign, number,...". Then we can use simple rules to separate code from English.
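A minimal sketch of this tokenize-then-apply-rules idea might look like the following. It is not the answerer's actual code. The class name TokenRuleSketch, the extra tags (STRING, OPERATOR, OTHER), the partial keyword list, and both rules are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: turn a string into tags, then apply simple rules. */
final class TokenRuleSketch {
    enum Tag { KEY, ID, NUMBER, STRING, ASSIGN, SEPARATOR, OPERATOR, OTHER }

    // Partial list of Java keywords and literals, for illustration only.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "for", "while", "do", "if", "else", "int", "long", "double", "boolean",
            "char", "void", "return", "new", "class", "public", "private", "static",
            "final", "import", "this", "null", "true", "false"));

    // A token is a double-quoted string, a word, a run of digits, or one other character.
    private static final Pattern TOKEN =
            Pattern.compile("\"[^\"]*\"|[A-Za-z_][A-Za-z0-9_]*|\\d+|\\S");

    /** Converts the input into a list of tags, one per token. */
    static List<Tag> tokenize(String s) {
        List<Tag> tags = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        while (m.find()) {
            tags.add(classify(m.group()));
        }
        return tags;
    }

    private static Tag classify(String t) {
        char c = t.charAt(0);
        if (c == '"' && t.length() >= 2) return Tag.STRING;
        if (Character.isLetter(c) || c == '_') return KEYWORDS.contains(t) ? Tag.KEY : Tag.ID;
        if (Character.isDigit(c)) return Tag.NUMBER;
        if (c == '=') return Tag.ASSIGN;
        if ("(){}[];,.".indexOf(c) >= 0) return Tag.SEPARATOR;
        if ("+*<>&|%^~".indexOf(c) >= 0) return Tag.OPERATOR;
        return Tag.OTHER; // apostrophes, hyphens, question marks, and so on
    }

    /**
     * Assumed rules: any ASSIGN or OPERATOR tag means code;
     * otherwise it is code only if separators outnumber plain identifiers.
     */
    static boolean looksLikeCode(String s) {
        int ids = 0;
        int seps = 0;
        for (Tag tag : tokenize(s)) {
            if (tag == Tag.ASSIGN || tag == Tag.OPERATOR) return true;
            if (tag == Tag.ID) ids++;
            if (tag == Tag.SEPARATOR) seps++;
        }
        return seps > ids;
    }
}
```

Under these assumed rules, the for-loop line and write("a lot of text as a whole"); are classified as code, while the English phrase is not. Short inputs are easy to misclassify (a one-word sentence followed by a period ties at one separator and one identifier), so real rules would likely need to look at tag order as well as counts.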