Building a simple web search engine in Java with open source frameworks

2019-10-08 • Java

Introduction

Using open source Java libraries, we write a search engine that crawls the content of a website. Starting from that content, it follows the links it finds to a configurable depth and collects all the related web addresses and their content. Users can then search the collected sites by keyword.

Specific functions

(1) Users can specify a URL whose web page content is to be crawled.
(2) Parse the page content and extract all the URL links it contains.
(3) Users can set the crawl depth: starting from the page at the initial URL, the crawler follows the URLs found on that page, then the URLs found on those pages, and so on. The greater the depth, the more sites are crawled.
(4) Save and index the crawled URL content. The indexed fields are the URL itself and the title of the page at that URL.
(5) Users can search by keyword to find the URLs that match the keyword.
(6) Both index building and index searching can recognize Chinese keywords and segment them into words.
(7) Users can specify the location where the index is saved, the initial URL, the crawl depth, the keywords to search for, and the maximum number of matches.
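To make the keyword segmentation in function (6) more concrete, here is a minimal sketch that uses only the JDK. It is an assumption-laden stand-in, not the segmenter used by the article: it splits runs of Chinese (Han) characters into overlapping two-character tokens, a simple fallback that needs no dictionary, and lowercases other words. The names KeywordSegmenter and segment are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: bigram segmentation of Han text, whole-word tokens otherwise.
class KeywordSegmenter {
    /** Returns lowercase non-Han words and overlapping bigrams of each Han run. */
    static List<String> segment(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder(); // current run of non-Han letters/digits
        StringBuilder han = new StringBuilder();  // current run of Han characters
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN) {
                flushWord(word, tokens);
                han.appendCodePoint(cp);
            } else if (Character.isLetterOrDigit(cp)) {
                flushHan(han, tokens);
                word.appendCodePoint(Character.toLowerCase(cp));
            } else {
                // Whitespace or punctuation ends whichever run is open.
                flushWord(word, tokens);
                flushHan(han, tokens);
            }
        }
        flushWord(word, tokens);
        flushHan(han, tokens);
        return tokens;
    }

    private static void flushWord(StringBuilder word, List<String> tokens) {
        if (word.length() > 0) {
            tokens.add(word.toString());
            word.setLength(0);
        }
    }

    private static void flushHan(StringBuilder han, List<String> tokens) {
        if (han.length() == 0) {
            return;
        }
        int[] cps = han.codePoints().toArray();
        if (cps.length == 1) {
            tokens.add(new String(cps, 0, 1)); // a lone character is kept as its own token
        } else {
            for (int i = 0; i + 1 < cps.length; i++) {
                tokens.add(new String(cps, i, 2)); // overlapping two-character window
            }
        }
        han.setLength(0);
    }
}
```

Applying the same segment method to both the indexed titles and the search keywords keeps the two sides comparable. A dictionary-based segmenter would produce more meaningful words than bigrams.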

Open source framework

Source code

Crawler part: Spider.java
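As an illustration of what the crawler part has to do, the following is a minimal sketch of link extraction and a depth-limited, breadth-first crawl. It is not the article's Spider source. It assumes a naive regular expression instead of a real HTML parser, and the class name SpiderSketch, the methods extractLinks and crawl, and the fetcher parameter are all hypothetical. The fetcher maps a URL to its HTML (or null on failure), so the crawl logic can be tried without network access.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpiderSketch {
    // Naive pattern for href="..." or href='...'; fragment-only links are not matched.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the absolute http(s) links found in html, resolved against baseUrl. */
    static List<String> extractLinks(String html, String baseUrl) {
        List<String> links = new ArrayList<>();
        URI base = URI.create(baseUrl);
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                URI resolved = base.resolve(m.group(1).trim());
                String scheme = resolved.getScheme();
                if ("http".equals(scheme) || "https".equals(scheme)) {
                    links.add(resolved.toString());
                }
            } catch (IllegalArgumentException e) {
                // Skip link targets that are not valid URIs.
            }
        }
        return links;
    }

    /**
     * Breadth-first crawl. Depth 0 fetches only startUrl, depth 1 also fetches
     * the pages it links to, and so on. Returns URL -> HTML for fetched pages.
     */
    static Map<String, String> crawl(String startUrl, int maxDepth,
                                     Function<String, String> fetcher) {
        Map<String, String> pages = new LinkedHashMap<>();
        Set<String> seen = new HashSet<>();
        seen.add(startUrl);
        List<String> frontier = List.of(startUrl);
        for (int depth = 0; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String url : frontier) {
                String html = fetcher.apply(url);
                if (html == null) {
                    continue; // fetch failed or page missing
                }
                pages.put(url, html);
                if (depth < maxDepth) {
                    for (String link : extractLinks(html, url)) {
                        if (seen.add(link)) { // enqueue each URL only once
                            next.add(link);
                        }
                    }
                }
            }
            frontier = next;
        }
        return pages;
    }
}
```

The seen set is what keeps pages that link back to each other from being crawled repeatedly. For real use, the fetcher could wrap an HTTP client such as java.net.http.HttpClient, and a proper HTML parser would be more robust than the regular expression.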

Build index: BuildIndex.java
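The indexing step can be pictured with a small in-memory inverted index, sketched below with the JDK only. This is a simplified stand-in under stated assumptions, not the article's BuildIndex source and not the API of any indexing library: it indexes the URL and the page title (the two fields named in the feature list), keeps everything in memory instead of saving to a user-specified location, and splits text on non-alphanumeric characters. The names BuildIndexSketch, addPage, postings, and titles are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BuildIndexSketch {
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // token -> URLs whose address or title contains that token
    final Map<String, Set<String>> postings = new LinkedHashMap<>();
    // URL -> page title, kept so search results can be displayed
    final Map<String, String> titles = new LinkedHashMap<>();

    /** Returns the trimmed text of the first title element, or "" if there is none. */
    static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    /** Indexes one crawled page under the tokens of its URL and title. */
    void addPage(String url, String html) {
        String title = extractTitle(html);
        titles.put(url, title);
        String text = (url + " " + title).toLowerCase(Locale.ROOT);
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new LinkedHashSet<>()).add(url);
            }
        }
    }
}
```

Calling addPage once per crawled page builds the whole index. Looking up a token in postings then gives the set of URLs to return for that keyword.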

Search index
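A keyword search with a cap on the number of matches might look like the sketch below. It is only an illustration under assumptions, not the article's search code: the index is assumed to be a plain map from token to the set of matching URLs, results are ranked by how many distinct query tokens they match, and the names SearchIndexSketch and search are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class SearchIndexSketch {
    /** Returns up to maxMatches URLs, ordered by how many query tokens each matches. */
    static List<String> search(Map<String, Set<String>> postings, String query,
                               int maxMatches) {
        // Split the query the same way the index was tokenized, dropping duplicates.
        Set<String> tokens = new LinkedHashSet<>(Arrays.asList(
                query.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")));
        Map<String, Integer> hits = new LinkedHashMap<>();
        for (String token : tokens) {
            for (String url : postings.getOrDefault(token, Set.of())) {
                hits.merge(url, 1, Integer::sum); // one point per matching token
            }
        }
        List<String> ranked = new ArrayList<>(hits.keySet());
        // Highest score first; List.sort is stable, so ties keep first-match order.
        ranked.sort((a, b) -> Integer.compare(hits.get(b), hits.get(a)));
        int limit = Math.min(Math.max(maxMatches, 0), ranked.size());
        return new ArrayList<>(ranked.subList(0, limit));
    }
}
```

The maxMatches argument corresponds to the maximum number of matches the user can specify. A full-text search library would normally use a more refined relevance score than a simple token count.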

User interface (for convenience, only a command-line interface is provided here; a GUI can be written as required)
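A command-line front end mainly needs to collect the settings listed in function (7). The sketch below shows one possible way to parse them. The flag names (--index, --url, --depth, --keyword, --max), the default values, and the class name SearchCli are all assumptions made for illustration and do not come from the article.

```java
class SearchCli {
    /** Settings named in the feature list; defaults here are illustrative assumptions. */
    static final class Settings {
        String indexPath = "index"; // where the index is saved
        String startUrl = "";       // initial URL to crawl
        int depth = 1;              // crawl depth
        String keyword = "";        // keywords to search for
        int maxMatches = 10;        // maximum number of matches to return
    }

    /** Parses "--flag value" pairs; unknown flags or a dangling flag are rejected. */
    static Settings parse(String[] args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("Every option needs a value");
        }
        Settings s = new Settings();
        for (int i = 0; i < args.length; i += 2) {
            String value = args[i + 1];
            switch (args[i]) {
                case "--index": s.indexPath = value; break;
                case "--url": s.startUrl = value; break;
                case "--depth": s.depth = Integer.parseInt(value); break;
                case "--keyword": s.keyword = value; break;
                case "--max": s.maxMatches = Integer.parseInt(value); break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        Settings s = parse(args);
        // A real program would now crawl, build the index, and run the search.
        System.out.println("index=" + s.indexPath + " url=" + s.startUrl
                + " depth=" + s.depth + " keyword=" + s.keyword
                + " max=" + s.maxMatches);
    }
}
```

Keeping the parsed values in one Settings object makes it straightforward to reuse the same settings later from a GUI.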

That is the whole content of this article. I hope it is helpful to your study, and I hope you will support Programming Tips.

The content of this article was collected from the web by users and is provided as a learning reference. The copyright belongs to the original author.
