The third secret of Android Development: parsing HTML pages with jsoup

This section mainly explains how jsoup parses HTML pages. Because the process of Android development inevitably involves the capture, parsing and display of web pages, here I mainly show an example of capturing the topic classification of cnbeta.com website by using jsoup jar package.

The following is the main code. Due to its simple use, I won't say more here:

package com.android.web;

import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.http.util.ByteArrayBuffer;
import org.apache.http.util.EncodingUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import android.app.Activity;
import android.os.Bundle;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.ListView;
import android.widget.SimpleAdapter;

public class _GetWebResoureActivity extends Activity {

	Document doc;

	@Override
	public void onCreate(Bundle savedInstanceState) {
		super.onCreate(savedInstanceState);
		setContentView(R.layout.main);

		findViewById(R.id.button1).setOnClickListener(new OnClickListener() {
			@Override
			public void onClick(View v) {
				load();
			}
		});
	}

	protected void load() {
		
		try {
			doc = Jsoup.parse(new URL("http://www.cnbeta.com"), 5000);
		} catch (MalformedURLException e1) {
			e1.printStackTrace();
		} catch (IOException e1) {
			e1.printStackTrace();
		}
		
		List<Map<String, String>> list = new ArrayList<Map<String, String>>();
		Elements es = doc.getElementsByClass("main_navi");
		for (Element e : es) {
			Map<String, String> map = new HashMap<String, String>();
			map.put("title", e.getElementsByTag("a").text());
			map.put("href", "http://www.cnbeta.com"
					+ e.getElementsByTag("a").attr("href"));
			list.add(map);
		}
		
		ListView listView = (ListView) findViewById(R.id.listView1);
		listView.setAdapter(new SimpleAdapter(this, list, android.R.layout.simple_list_item_2,
				new String[] { "title","href" }, new int[] {
				android.R.id.text1,android.R.id.text2
		}));
		
	}

	/**
	 * @param urlString
	 * @return
	 */
	public String getHtmlString(String urlString) {
		try {
			URL url = null;
			url = new URL(urlString);

			URLConnection ucon = null;
			ucon = url.openConnection();

			InputStream instr = null;
			instr = ucon.getInputStream();

			BufferedInputStream bis = new BufferedInputStream(instr);

			ByteArrayBuffer baf = new ByteArrayBuffer(500);
			int current = 0;
			while ((current = bis.read()) != -1) {
				baf.append((byte) current);
			}
			return EncodingUtils.getString(baf.toByteArray(), "gbk");
		} catch (Exception e) {
			return "";
		}
	}
}

Pay attention to the yellow part in the code. You must find the right position to get the correct result. Here are the main preview effects:

The content of this article comes from the network collection of netizens. It is used as a learning reference. The copyright belongs to the original author.
THE END
分享
二维码
< <上一篇
下一篇>>