The third secret of Android Development: parsing HTML pages with jsoup
•
Android
This section mainly explains how jsoup parses HTML pages. Because the process of Android development inevitably involves the capture, parsing and display of web pages, here I mainly show an example of capturing the topic classification of cnbeta.com website by using jsoup jar package.
The following is the main code. Due to its simple use, I won't say more here:
package com.android.web; import java.io.BufferedInputStream; import java.io.IOException; import java.io.InputStream; import java.net.MalformedURLException; import java.net.URL; import java.net.URLConnection; import java.util.ArrayList; import java.util.HashMap; import java.util.List; import java.util.Map; import org.apache.http.util.ByteArrayBuffer; import org.apache.http.util.EncodingUtils; import org.jsoup.Jsoup; import org.jsoup.nodes.Document; import org.jsoup.nodes.Element; import org.jsoup.select.Elements; import android.app.Activity; import android.os.Bundle; import android.view.View; import android.view.View.OnClickListener; import android.widget.ListView; import android.widget.SimpleAdapter; public class _GetWebResoureActivity extends Activity { Document doc; @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.main); findViewById(R.id.button1).setOnClickListener(new OnClickListener() { @Override public void onClick(View v) { load(); } }); } protected void load() { try { doc = Jsoup.parse(new URL("http://www.cnbeta.com"), 5000); } catch (MalformedURLException e1) { e1.printStackTrace(); } catch (IOException e1) { e1.printStackTrace(); } List<Map<String, String>> list = new ArrayList<Map<String, String>>(); Elements es = doc.getElementsByClass("main_navi"); for (Element e : es) { Map<String, String> map = new HashMap<String, String>(); map.put("title", e.getElementsByTag("a").text()); map.put("href", "http://www.cnbeta.com" + e.getElementsByTag("a").attr("href")); list.add(map); } ListView listView = (ListView) findViewById(R.id.listView1); listView.setAdapter(new SimpleAdapter(this, list, android.R.layout.simple_list_item_2, new String[] { "title","href" }, new int[] { android.R.id.text1,android.R.id.text2 })); } /** * @param urlString * @return */ public String getHtmlString(String urlString) { try { URL url = null; url = new URL(urlString); URLConnection ucon = null; ucon = url.openConnection(); InputStream instr = null; instr = ucon.getInputStream(); BufferedInputStream bis = new BufferedInputStream(instr); ByteArrayBuffer baf = new ByteArrayBuffer(500); int current = 0; while ((current = bis.read()) != -1) { baf.append((byte) current); } return EncodingUtils.getString(baf.toByteArray(), "gbk"); } catch (Exception e) { return ""; } } }
Pay attention to the yellow part in the code. You must find the right position to get the correct result. Here are the main preview effects:
The content of this article comes from the network collection of netizens. It is used as a learning reference. The copyright belongs to the original author.
THE END
二维码