Java – fast string search, such as startswith() is not equal to ()
I have a sequence list (a dictionary – 100k words) and many words are often searched in this list Therefore, performance is a problem I know HashSet Contains (theword) or collections Binary search (sorted list, the word) is very fast But actually I'm not looking for the whole word
What I want is let's search for "Se" and start all words with "Se" Is there a ready-made solution in Java or any library?
A better example: in a sorted list, provide a quick solution for the following operations
List. Sublist (string beginindex, string endindex) / / return interval
myWordList. subList(“ab”,“bc”);
Note: This is a very similar question, but the accepted answer is not satisfactory Overriding HashSet’s Contains Method
Solution
What you are looking for here is a data structure called 'trie':
http://en.wikipedia.org/wiki/Trie
It stores strings in a tree indexed by a prefix, where the first level of the tree contains the first character of the string, the second level contains the second character, and so on The result is that it allows you to extract a subset of a very large set of strings very quickly through prefixes