Java – Selenium – driver.getPageSource() differs from the source shown in the browser

I am trying to use Selenium to capture the source of a given HTML page, but for some reason I do not get exactly the same source code that we see in the browser.

Here is the Java code I use to capture the source:

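import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

private static void getHTMLSourceFromURL(String url, String fileName) {

    WebDriver driver = new FirefoxDriver();

    try {
        driver.get(url);
        Thread.sleep(5000);   // crude fixed wait so that scripts on the page can finish

        // split the page source into lines
        List<String> pageSource = new ArrayList<String>(Arrays.asList(driver.getPageSource().split("\n")));

        // write to the file name passed in as a parameter
        writeTextToFile(pageSource, fileName);

    } catch (InterruptedException e) {
        e.printStackTrace();
    } finally {
        // always close the browser, even if something above fails
        System.out.println("quitting webdriver");
        driver.quit();
    }
}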

/**
 * creates file with fileName and writes the content
 * 
 * @param content
 * @param fileName
 */
private static void writeTextToFile(List<String> content, String fileName) {
    File dir = new File(".", "HTML Sources");
    // create the output folder if it is missing
    if (!dir.exists() && !dir.mkdirs()) {
        System.err.println(dir + " could not be created");
        return;
    }

    File output = new File(dir, fileName);
    // FileWriter creates the file if needed; 'true' appends to existing content.
    // try-with-resources closes the writer even if opening or writing fails.
    try (PrintWriter pw = new PrintWriter(new FileWriter(output, true))) {
        for (String line : content) {
            pw.print(line);
            pw.print("\n");
        }
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
}

Can anyone explain why this happens? How does WebDriver render pages, and how does the browser display the source code?

Solution

There are several ways to get at the source. You can try

String pageSource = driver.findElement(By.tagName("body")).getText();

and see what you get. Note that getText() returns the visible text of the element, not its HTML markup.

Usually you don't have to wait for the page to load; Selenium does this automatically, unless the page has separate JavaScript/Ajax sections that load their content afterwards.
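For pages that do load content through JavaScript/Ajax, polling for a condition is usually more reliable than a fixed Thread.sleep(5000). The following is only a minimal sketch of that idea in plain Java; the class name PollingWait and the method waitUntil are made up for illustration and are not part of Selenium.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

public class PollingWait {

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     * (Hypothetical helper, not a Selenium API.)
     */
    public static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // restore the interrupt flag and give up waiting
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true after roughly 300 ms.
        boolean met = waitUntil(() -> System.currentTimeMillis() - start >= 300,
                Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println("condition met: " + met);
    }
}
```

With a driver at hand, the condition could be something like () -> !driver.findElements(By.id("results")).isEmpty(), where the id "results" is an assumed example. Selenium's own WebDriverWait together with ExpectedConditions provides the same kind of explicit wait out of the box.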

You may want to add the differences you are seeing to the question, so that we can understand what you really mean.
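One way to collect those differences is to save the browser's "view source" output and the getPageSource() output as text and compare them line by line. This is a rough sketch under that assumption; SourceDiff and linesOnlyIn are hypothetical names, and the two sample strings in main are invented purely for the demonstration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SourceDiff {

    /** Returns the trimmed, non-empty lines of 'a' that do not occur anywhere in 'b'. */
    public static List<String> linesOnlyIn(String a, String b) {
        // collect all trimmed lines of 'b' for fast lookup
        Set<String> other = new HashSet<>();
        for (String line : b.split("\\R")) {
            other.add(line.trim());
        }
        List<String> result = new ArrayList<>();
        for (String line : a.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !other.contains(trimmed)) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented sample data standing in for the two saved sources.
        String viewSource = "<html>\n<body>\n<div id=\"app\"></div>\n</body>\n</html>";
        String pageSource = "<html>\n<body>\n<div id=\"app\"><p>Loaded</p></div>\n</body>\n</html>";
        System.out.println("Only in view source:     " + linesOnlyIn(viewSource, pageSource));
        System.out.println("Only in getPageSource(): " + linesOnlyIn(pageSource, viewSource));
    }
}
```

This ignores indentation and line order, so it is not a real diff, but it is usually enough to show which elements appear in only one of the two sources.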

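WebDriver does not render the page itself; it gives you the page as the browser sees it. That is typically why getPageSource() differs from the browser's "view source": view source shows the HTML as it was received from the server, while the browser may since have normalized the markup or changed it through JavaScript.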
