Java – jsoup: retrieve elements that do not have a specific attribute

2020-09-02 • Java

I have a table with the following logic:

>The table displays a list of names. Any row of the form <tr class="hiderow"><td class="packagename">...</td></tr> is not visible.

So the table may contain 100 rows, but if 20 of them have class="hiderow", the user sees only 80 rows on the page. I want to retrieve the names from those 80 rows (not all 100), so I need to parse only the rows that do not have class="hiderow". I know how to use jsoup to get each name. I have also seen in the documentation that there is a :not(selector) for elements that do not match the selector, but I don't know how to use it. Please help.

Edit: I have figured out a way to do this. If there is a better way, please let me know. Edit 2: please use BalusC's solution below. It is cleaner.

public List<String> obtainPackageName(String urlLink) throws IOException {
    List<String> pdfList = new ArrayList<String>();
    URL url = new URL(urlLink);
    // Fetch and parse the page, with a 3000 ms timeout.
    Document doc = Jsoup.parse(url, 3000);
    Element table = doc.select("table[id=mastertableid]").first();
    for (Element row : table.select("tr")) {
        // hasClass() matches whole class names; className().contains() would
        // also match any class name that merely contains "hiderow".
        if (!row.hasClass("hiderow")) {
            Element packageName = row.select("td[class=packagename]").first();
            if (packageName != null) {
                pdfList.add(packageName.text());
            }
        }
    }
    // Return the collected names so the caller can use them.
    return pdfList;
}

Solution

You need to apply :not() to the element of interest (tr in your case) and pass it a CSS selector, relative to that element, for what should not match (.hiderow in your case).

So this should do:

Document document = Jsoup.connect(urlLink).get();
Elements packagenames = document.select("#mastertableid tr:not(.hiderow) td.packagename");
List<String> pdfList = new ArrayList<String>();

for (Element packagename : packagenames) {
    pdfList.add(packagename.text()); 
}
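
To try the selector without a live page, here is a minimal, self-contained sketch that runs it against an inline HTML string. The class name VisibleRowsDemo, the method visiblePackageNames and the sample rows are made up for illustration; only the selector and the mastertableid, hiderow and packagename names come from the question. It assumes jsoup is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical demo class, not part of the original question or answer.
class VisibleRowsDemo {

    // Returns the text of every td.packagename cell whose row lacks the hiderow class.
    static List<String> visiblePackageNames(String html) {
        Document document = Jsoup.parse(html);
        List<String> names = new ArrayList<>();
        for (Element cell : document.select("#mastertableid tr:not(.hiderow) td.packagename")) {
            names.add(cell.text());
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up sample table: two visible rows and two hidden rows.
        String html = "<table id=\"mastertableid\">"
                + "<tr><td class=\"packagename\">alpha</td></tr>"
                + "<tr class=\"hiderow\"><td class=\"packagename\">beta</td></tr>"
                + "<tr class=\"odd hiderow\"><td class=\"packagename\">gamma</td></tr>"
                + "<tr class=\"odd\"><td class=\"packagename\">delta</td></tr>"
                + "</table>";
        System.out.println(visiblePackageNames(html)); // prints [alpha, delta]
    }
}
```

The row with class="odd hiderow" is excluded as well, because .hiderow matches an element that carries that class alongside others.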
