Java – Too many open files (Selenium PhantomJSDriver)

In my embedded Selenium / PhantomJSDriver client, resources do not appear to be cleaned up. Running the client synchronously results in millions of open files and eventually leads to a "Too many open files" exception.

This is some output I collected from lsof after the program had been running for about a minute:

$ lsof | awk '{ print $2; }' | uniq -c | sort -rn | head
    1221966 12180
      34790 29773
      31260 12138
      20955 8414
      17940 10343
      16665 32332
       9512 27713
       7275 19226
       5496 7153
       5040 14065

$ lsof -p 12180 | awk '{ print $2; }' | uniq -c | sort -rn | head
    2859 12180
       1 PID

$ lsof -p 12180 -Fn | sort -rn | uniq -c | sort -rn | head
    1124 npipe
     536 nanon_inode
       4 nsocket
       3 n/opt/jdk/jdk1.8.0_60/jre/lib/jce.jar
       3 n/opt/jdk/jdk1.8.0_60/jre/lib/charsets.jar
       3 n/dev/urandom
       3 n/dev/random
       3 n/dev/pts/20
       2 n/usr/share/sbt-launcher-packaging/bin/sbt-launch.jar
       2 n/usr/share/java/jayatana.jar

I don't understand why the result set is smaller when using the -p flag on lsof. However, it seems that most of the entries are pipes and anon_inodes.
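The same breakdown can be taken from inside the JVM, which avoids any ambiguity about what lsof is counting. One possible explanation for the gap, stated here as an assumption and not confirmed in the original post, is that some lsof builds on Linux list each descriptor once per thread unless -p is given. The sketch below is Linux-only because it reads /proc/self/fd; the object name FdSummary and its methods are hypothetical helpers, not part of Selenium or BrowserMob.

```scala
import java.io.File
import java.nio.file.Files
import scala.util.Try

object FdSummary {
  // Classify the target of a /proc/<pid>/fd symlink into a coarse kind.
  def kind(target: String): String =
    if (target.startsWith("pipe:")) "pipe"
    else if (target.startsWith("anon_inode:")) "anon_inode"
    else if (target.startsWith("socket:")) "socket"
    else "file"

  // Count the open descriptors of the current JVM per kind.
  // Returns an empty map when /proc/self/fd does not exist (non-Linux systems).
  def summary(): Map[String, Int] = {
    val entries = Option(new File("/proc/self/fd").listFiles()).getOrElse(Array.empty[File])
    entries
      // A descriptor may be closed between listing and reading; skip those.
      .flatMap(f => Try(Files.readSymbolicLink(f.toPath).toString).toOption)
      .groupBy(kind)
      .map { case (k, targets) => k -> targets.length }
  }

  def main(args: Array[String]): Unit =
    summary().toList.sortBy(-_._2).foreach { case (k, n) => println(f"$n%6d $k") }
}
```

Calling FdSummary.summary() before and after each follow() call would show whether the pipe and anon_inode counts keep growing.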

The client is very simple, at around 100 lines, and calls driver.close() and driver.quit() when it is finished. I tried caching and reusing clients, but that did not reduce the number of open files.

case class HeadlessClient(
  country: String,
  userAgent: String,
  inheritSessionId: Option[Int] = None
) {
  protected var numberOfRequests: Int = 0
  protected val proxySessionId: Int = inheritSessionId.getOrElse(new Random().nextInt(Integer.MAX_VALUE))
  protected val address = InetAddress.getByName("proxy.domain.com")
  protected val host = address.getHostAddress
  protected val login: String = HeadlessClient.username + proxySessionId
  protected val windowSize = new org.openqa.selenium.Dimension(375,667)

  protected val (mobProxy,seleniumProxy) = {

    val proxy = new BrowserMobProxyServer()
    proxy.setTrustAllServers(true)
    proxy.setChainedProxy(new InetSocketAddress(host,HeadlessClient.port))
    proxy.chainedProxyAuthorization(login,HeadlessClient.password,AuthType.BASIC)
    proxy.addLastHttpFilterFactory(new HttpFiltersSourceAdapter() {
      override def filterRequest(originalRequest: HttpRequest): HttpFilters = {
        new HttpFiltersAdapter(originalRequest) {
          override def proxyToServerRequest(httpObject: HttpObject): io.netty.handler.codec.http.HttpResponse = {
            httpObject match {
              case req: HttpRequest => req.headers().remove(HttpHeaders.Names.VIA)
              case _ =>
            }
            null
          }
        }
      }
    })
    proxy.enableHarCaptureTypes(CaptureType.REQUEST_CONTENT,CaptureType.RESPONSE_CONTENT)
    proxy.start(0)
    val seleniumProxy = ClientUtil.createSeleniumProxy(proxy)
    (proxy,seleniumProxy)
  }

  protected val driver: PhantomJSDriver = {
    val capabilities: DesiredCapabilities = DesiredCapabilities.chrome()
    val cliArgsCap = new util.ArrayList[String]
    cliArgsCap.add("--webdriver-loglevel=NONE")
    cliArgsCap.add("--ignore-ssl-errors=yes")
    cliArgsCap.add("--load-images=no")

    capabilities.setCapability(CapabilityType.PROXY,seleniumProxy)
    capabilities.setCapability("phantomjs.page.customHeaders.Referer","")
    capabilities.setCapability("phantomjs.page.settings.userAgent",userAgent)
    capabilities.setCapability(PhantomJSDriverService.PHANTOMJS_CLI_ARGS,cliArgsCap)

    new PhantomJSDriver(capabilities)
  }

  driver.executePhantomJS(
    """
      |var navigation = [];
      |
      |this.onNavigationRequested = function(url,type,willNavigate,main) {
      |  navigation.push(url)
      |  console.log('Trying to navigate to: ' + url);
      |}
      |
      |this.onResourceRequested = function(request,net) {
      |    console.log("Requesting " + request.url);
      |    if (! (navigation.indexOf(request.url) > -1)) {
      |        console.log("Aborting " + request.url)
      |        net.abort();
      |    }
      |};
    """.stripMargin
  )

  driver.manage().window().setSize(windowSize)

  def follow(url: String)(implicit ec: ExecutionContext): List[HarEntry] = {
    try{
      Await.result(Future{
        mobProxy.newHar(url)
        driver.get(url)
        val entries = mobProxy.getHar.getLog.getEntries.asScala.toList
        shutdown()
        entries
      },45.seconds)
    } catch {
      case e: Exception =>
        try {
          shutdown()
        } catch {
          case shutdown: Exception =>
            throw new Exception(s"Error ${shutdown.getMessage} cleaning up after Exception: ${e.getMessage}")
        }

        throw e
    }
  }

  def shutdown() = {
    driver.close()
    driver.quit()
  }
}

I tried several versions of Selenium in case one of the builds included a bug fix. From sbt:

libraryDependencies += "org.seleniumhq.selenium" % "selenium-java"   % "3.0.1"
libraryDependencies += "net.lightbody.bmp" % "browsermob-core" % "2.1.2"

In addition, I tried PhantomJS 2.0.1 and 2.1.1:

$ phantomjs --version
  2.0.1-development

$ phantomjs --version
  2.1.1

Is this a PhantomJS problem or a Selenium problem? Is my client using the API improperly?

Solution

The resource usage is caused by BrowserMob. To shut down the proxy and clean up its resources, you must call stop().

For this client, that means modifying the shutdown method:

def shutdown() = {
  mobProxy.stop()
  driver.close()
  driver.quit()
}
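One thing to watch in a shutdown method like this is that if an earlier call throws, the later calls never run and their resources stay open. A minimal sketch of one way to guard against that is to run every cleanup step independently and collect the failures. The names Cleanup and runAll are hypothetical and are not part of Selenium or BrowserMob; the demo uses stand-in steps instead of a real proxy and driver.

```scala
import scala.util.Try

object Cleanup {
  // Run every step in order, even if an earlier one throws.
  // Returns the name and exception of each step that failed.
  def runAll(steps: (String, () => Unit)*): List[(String, Throwable)] =
    steps.toList.flatMap { case (name, step) =>
      Try(step()).failed.toOption.map(e => name -> e)
    }

  def main(args: Array[String]): Unit = {
    var driverQuit = false
    val failures = runAll(
      // Stand-in for mobProxy.stop() failing.
      "proxy"  -> (() => throw new IllegalStateException("proxy already stopped")),
      // Stand-in for driver.quit(); still runs despite the failure above.
      "driver" -> (() => driverQuit = true)
    )
    println(s"driver quit: $driverQuit")
    failures.foreach { case (name, e) => println(s"$name failed: ${e.getMessage}") }
  }
}
```

In the client above, the steps would be () => mobProxy.stop() and () => driver.quit(), with any returned failures logged or rethrown by the caller.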

Alternatively, the abort() method terminates the proxy server immediately, without waiting for traffic to stop.
