How full-text retrieval based on Solr works (a detailed discussion)
Solr is a standalone enterprise search server that exposes an API similar to a web service. Users can submit XML documents in a prescribed format to the server through HTTP requests to build an index, and can issue search requests through HTTP GET and receive the results in XML or JSON format. Solr was developed with Java 5 and is built on Lucene.
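The two kinds of request described above can be sketched in a few lines of Python. This is only an illustration, not a full client: it builds the XML update message and the search URL but does not send anything. The host, port, and core name in SOLR_CORE_URL are assumptions, as are the helper names build_add_xml and build_select_url and the sample field names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed location of a Solr core; adjust host, port and core name as needed.
SOLR_CORE_URL = "http://localhost:8983/solr/mycore"


def build_add_xml(doc):
    """Build a Solr XML update message of the form
    <add><doc><field name="...">...</field></doc></add>."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(base_url, query, wt="json"):
    """Build the URL for an HTTP GET search request.
    'q' is the query string, 'wt' selects the response format."""
    return base_url + "/select?" + urlencode({"q": query, "wt": wt})


# The XML body would be sent with HTTP POST to <core>/update
# (Content-Type: text/xml); the search URL is fetched with HTTP GET.
print(build_add_xml({"id": "1", "title": "Hello Solr"}))
print(build_select_url(SOLR_CORE_URL, "title:solr"))
```

Keeping the message construction separate from the network call makes it easy to inspect exactly what would be sent before pointing it at a real server.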
Lucene originated as a subproject of the Jakarta project of the Apache Software Foundation. It is an open-source full-text search toolkit; that is, it is not a complete full-text search engine but the architecture for one. It provides a complete query engine and index engine, along with a partial set of text analysis engines (for English and German).
The basic principles of Lucene's full-text retrieval are consistent with the techniques covered in the web search course taught by Guo Jun: word segmentation, syntactic and semantic analysis, the vector space model, and related techniques. A detailed blog note is reproduced below: http://www.cnblogs.com/guochunguang/articles/3641008.html
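As a brief aside before the reproduced note, the vector space model mentioned above can be illustrated with a toy ranking function. This is a minimal sketch under simplifying assumptions, not Lucene's actual scoring formula: word segmentation is reduced to lowercasing and splitting on whitespace, term weights are a plain TF-IDF variant, and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Very crude word segmentation: lowercase and split on whitespace."""
    return text.lower().split()


def build_idf(docs_tokens):
    """Inverse document frequency: rarer terms get larger weights."""
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))  # count each term once per document
    return {term: math.log(n / df[term]) + 1.0 for term in df}


def vectorize(tokens, idf):
    """Map a token list to a sparse TF-IDF vector (terms unknown to idf are dropped)."""
    tf = Counter(tokens)
    return {term: tf[term] * idf[term] for term in tf if term in idf}


def cosine(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return (score, doc_index) pairs, best match first."""
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    query_vec = vectorize(tokenize(query), idf)
    scores = [(cosine(query_vec, vectorize(t, idf)), i)
              for i, t in enumerate(docs_tokens)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


docs = ["lucene is a search library",
        "solr is built on lucene",
        "cats chase mice"]
for score, index in rank("lucene search", docs):
    print(round(score, 3), docs[index])
```

The document that shares both query terms ranks first, the one sharing a single term ranks second, and the unrelated document scores zero.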
1. General remarks
According to the definition at http://lucene.apache.org/java/docs/index.html: