Java – htmlunit: ultra slow execution?
I've been using HtmlUnit. It suits my requirements, but it seems to be very slow. For example, I automated the following scenario:

1. Go to the Google page
2. Enter some text
3. Click on the search button
4. Get the title of the results page
5. Click on the first result

Code:
    long t1 = System.currentTimeMillis();

    Logger logger = Logger.getLogger("");
    logger.setLevel(Level.OFF);

    WebClient webClient = createWebClient();
    WebRequest webReq = new WebRequest(new URL("http://google.lk"));

    HtmlPage googleMainPage = webClient.getPage(webReq);
    HtmlTextInput searchTextField = (HtmlTextInput) googleMainPage.getByXPath("//input[@name='q']").get(0);
    HtmlButton searchButton = (HtmlButton) googleMainPage.getByXPath("//button[@name='btnK']").get(0);

    searchTextField.type("Sri Lanka");
    System.out.println("Text typed!");

    HtmlPage googleResultsPage = searchButton.click();
    System.out.println("Search button clicked!");
    System.out.println("Title : " + googleResultsPage.getTitleText());

    HtmlAnchor firstResultLink = (HtmlAnchor) googleResultsPage.getByXPath("//a[@class='l']").get(0);
    HtmlPage firstResultPage = firstResultLink.click();
    System.out.println("First result clicked!");
    System.out.println("Title : " + firstResultPage.getTitleText());
    //System.out.println(firstResultPage.asText());

    long t2 = System.currentTimeMillis();
    long diff = t2 - t1;
    System.out.println("Time elapsed : " + milliSecondsToHrsMinutesAndSeconds(diff));

    webClient.closeAllWindows();
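The snippet above calls `milliSecondsToHrsMinutesAndSeconds`, which the question does not show. A minimal sketch of what such a helper might look like follows; the class name `ElapsedTime` and the output format are assumptions, not the original poster's code.

```java
import java.util.concurrent.TimeUnit;

final class ElapsedTime {

    // Hypothetical stand-in for the helper used in the question.
    // The "N hrs N min N sec" format is an assumption.
    static String milliSecondsToHrsMinutesAndSeconds(long millis) {
        long hours = TimeUnit.MILLISECONDS.toHours(millis);
        long minutes = TimeUnit.MILLISECONDS.toMinutes(millis) % 60;
        long seconds = TimeUnit.MILLISECONDS.toSeconds(millis) % 60;
        return String.format("%d hrs %d min %d sec", hours, minutes, seconds);
    }

    public static void main(String[] args) {
        // 221,000 ms is the 3 minutes 41 seconds reported in the question.
        System.out.println(milliSecondsToHrsMinutesAndSeconds(221_000L));
    }
}
```

For the run time reported in the question (221,000 ms), this sketch prints `0 hrs 3 min 41 sec`.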
It works perfectly, but it takes 3 minutes and 41 seconds.
I guess the slow execution is caused by every element on the page being validated.
My question is: how can I reduce HtmlUnit's execution time? Is there any way to disable validation of web pages?
Thank you in advance!
Solution
With the current HtmlUnit 2.13, the options are set slightly differently from what maxmax's answer shows:
    final WebClient webClient = new WebClient(BrowserVersion.CHROME);
    webClient.getOptions().setCssEnabled(false);        // if you don't need CSS
    webClient.getOptions().setJavaScriptEnabled(false); // if you don't need JS
    HtmlPage page = webClient.getPage("http://XXX.xxx.xx");
    ...
In my own tests, this was about 8 times faster than with the default options (note that the gain may depend on the web page).
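To check whether the same gain holds for a particular page, the two configurations can be timed side by side. The following is only a sketch: `SpeedupCheck`, `timeMillis`, and `speedup` are hypothetical names, and the two `Runnable`s use `Thread.sleep` as placeholders where real code would fetch the page with a default and a tuned `WebClient`.

```java
final class SpeedupCheck {

    // Runs the task once and returns the wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    // Ratio of baseline time to tuned time; 8.0 means "8 times faster".
    static double speedup(long baselineMillis, long tunedMillis) {
        if (tunedMillis <= 0) {
            throw new IllegalArgumentException("tunedMillis must be positive");
        }
        return (double) baselineMillis / tunedMillis;
    }

    // Placeholder workload so the sketch runs without HtmlUnit on the classpath.
    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Assumption: in real use, each Runnable would wrap a page fetch
        // (default options versus CSS and JavaScript disabled).
        Runnable defaultOptions = () -> sleep(80);
        Runnable tunedOptions = () -> sleep(10);

        long baseline = timeMillis(defaultOptions);
        long tuned = timeMillis(tunedOptions);
        System.out.printf("default: %d ms, tuned: %d ms, speedup: %.1fx%n",
                baseline, tuned, speedup(baseline, tuned));
    }
}
```

A single run is noisy, since network latency and JVM warm-up both affect the numbers, so repeating each measurement several times gives a more reliable ratio.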