Data acquisition based on Java (I)

Previously, I wrote 2 articles on PHP data collection and warehousing:

Data collection and warehousing based on PHP (I): http://www.cnblogs.com/lichenwei/p/3872307.html

Data collection and warehousing based on PHP (II): http://www.cnblogs.com/lichenwei/p/3873281.html

Java based data collection (II): http://www.cnblogs.com/lichenwei/p/3905370.html

Data collection and warehousing based on Java (III): http://www.cnblogs.com/lichenwei/p/3907007.html

Data collection and warehousing based on Java (final part): http://www.cnblogs.com/lichenwei/p/3910492.html

In fact, the principle of collection is always the same: fetch the remote page -> extract the required content (using regular expressions) -> store it by category -> read it back -> display it.

It doesn't matter which programming language you use. A programming language is just a tool.
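
The pipeline above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the hard-coded HTML stands in for a fetched page, the `<li class="...">` markup and all class and method names are hypothetical, the sample values are made up, and an in-memory map stands in for the database.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the fetch -> extract -> store -> read -> display pipeline.
final class PipelineSketch {

    // Step 1 (fetch): a hard-coded page stands in for the remote response (assumption).
    static String fetch() {
        return "<li class=\"home\">Leeds</li><li class=\"away\">Derby</li>"
             + "<li class=\"home\">Fulham</li>";
    }

    // Steps 2 and 3 (extract + classified storage): pull out each item with a
    // regular expression and file it under its category. The map is an
    // in-memory stand-in for a database table.
    static Map<String, List<String>> extractAndStore(String html) {
        Pattern item = Pattern.compile("<li class=\"(\\w+)\">([^<]+)</li>");
        Map<String, List<String>> store = new LinkedHashMap<>();
        Matcher m = item.matcher(html);
        while (m.find()) {
            store.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(m.group(2));
        }
        return store;
    }

    // Steps 4 and 5 (read + display): read the stored values back and format them.
    static String display(Map<String, List<String>> store) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> e : store.entrySet()) {
            out.append(e.getKey()).append(": ")
               .append(String.join(", ", e.getValue())).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(display(extractAndStore(fetch())));
    }
}
```

Keeping each stage in its own method means the stand-ins can later be swapped for a real HTTP fetch or a real database without touching the other stages.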

This time, let's collect data from a football website: http://www.footballresults.org/league.php?league=EngDiv1

The following figure shows the data we want to collect:

For the principles behind the collection, see the two PHP articles linked above. The rest of this post goes straight to the code:

GerData.java (encapsulates the data collection methods)

In fact, it comes down to simple regular-expression matching, using two methods of java.util.regex.Matcher:

group(): returns the input subsequence captured by the given group during the previous match operation.

find(): attempts to find the next subsequence of the input sequence that matches the pattern.

package com.lcw.curl;
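
To make the roles of find() and group() concrete, here is a minimal sketch. The row markup, the pattern, the class name and the sample rows are all assumptions made for illustration; they are not taken from the site's actual HTML or from the original GerData.java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of the find()/group() loop.
final class MatchDemo {

    // Assumed row layout: <td>home</td><td>goals-goals</td><td>away</td>
    private static final Pattern ROW = Pattern.compile(
            "<td>([^<]+)</td><td>(\\d+)-(\\d+)</td><td>([^<]+)</td>");

    static List<String> extractResults(String html) {
        List<String> results = new ArrayList<>();
        Matcher m = ROW.matcher(html);
        // find() advances to the next match and returns false when none is left.
        while (m.find()) {
            // group(n) returns what the n-th parenthesised group captured in the
            // match that find() just located; group(0) would be the whole match.
            results.add(m.group(1) + " " + m.group(2) + ":" + m.group(3) + " " + m.group(4));
        }
        return results;
    }

    public static void main(String[] args) {
        String html = "<tr><td>Leeds</td><td>2-1</td><td>Derby</td></tr>"
                    + "<tr><td>Fulham</td><td>0-0</td><td>Wigan</td></tr>";
        extractResults(html).forEach(System.out::println);
    }
}
```

Calling group() before a successful find() throws an IllegalStateException, which is why the loop only reads groups inside the `while (m.find())` body.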

CurlMain.java (main program)

InputStreamReader is a bridge from byte streams to character streams: it reads bytes and decodes them into characters.

openStream() opens a connection to this URL and returns an InputStream for reading from that connection.
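
These calls are typically combined as in the sketch below. It is not the original CurlMain.java: the class and method names are hypothetical, and UTF-8 is an assumed page encoding that may need to be changed for a real site.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that downloads a page as text.
final class PageReader {

    // Reads everything the URL returns and hands it back as one string.
    static String readAll(URL url) throws IOException {
        StringBuilder page = new StringBuilder();
        // openStream() yields raw bytes; InputStreamReader decodes them into
        // characters (UTF-8 is an assumption about the page's encoding).
        try (InputStream bytes = url.openStream();
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(bytes, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PageReader.java <url>");
            return;
        }
        System.out.print(readAll(URI.create(args[0]).toURL()));
    }
}
```

The try-with-resources block closes the reader and the underlying connection stream even if reading fails partway through.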

That is all it takes to collect the data. The result is shown in the figure below:
