Reading HTML files into DOM trees using Java

Is there a parser or library that can read HTML documents into DOM trees using Java? I want to use the standard DOM / XPath API provided by Java.

Most libraries seem to solve this task with their own custom APIs. In addition, converting HTML into an XML DOM does not seem to be supported by most available parsers.

Any ideas or experience with a good HTML DOM parser?

Solution

JTidy, either by processing the stream into XHTML and then re-parsing it with your favorite DOM implementation, or by using parseDOM if its limited DOM implementation is enough for you.

Or NekoHTML.
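
For the first JTidy route, the re-parsing step could look roughly like the sketch below. It uses only the JDK's standard DOM and XPath classes and assumes the HTML has already been cleaned up into well-formed XHTML; the hard-coded string stands in for that tidied output. The class name XhtmlDomExample and the helpers parseXhtml and selectText are made up for this illustration and are not part of any library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XhtmlDomExample {

    // Hypothetical helper: parses well-formed XHTML text into a standard
    // org.w3c.dom.Document using the JDK's built-in parser.
    public static Document parseXhtml(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // The factory is namespace-unaware by default, so plain element
        // names such as "p" can be used in XPath expressions.
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    // Hypothetical helper: evaluates an XPath expression against the document
    // and returns the text content of every matching node.
    public static List<String> selectText(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: this string stands in for XHTML produced by a tidying step.
        String xhtml = "<html><head><title>Demo</title></head>"
                + "<body><p>First</p><p>Second</p></body></html>";
        Document doc = parseXhtml(xhtml);
        System.out.println(selectText(doc, "//p"));
    }
}
```

Real tidied output may carry a DOCTYPE and the XHTML namespace, in which case the parser may try to resolve the DTD and the XPath expressions would need namespace handling; the sample string avoids both to keep the sketch short.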
