Java – Apache Nutch – path problem

I am trying to set up Apache Nutch to crawl URLs by following this guide. Because the guide is old (it covers 1.x, and I use 2.3), I made the necessary changes to the structure. However, when I try to run a crawl, I get this error:

root@IndiStage:~# /usr/local/nutch/framework/apache-nutch-2.3/src/bin/crawl urls FirstCrawl 2
No SOLRURL specified. Skipping indexing.
Injecting seed URLs
/usr/local/nutch/framework/apache-nutch-2.3/src/bin/nutch inject urls -crawlId FirstCrawl
Error: Could not find or load main class org.apache.nutch.crawl.InjectorJob
Error running:
  /usr/local/nutch/framework/apache-nutch-2.3/src/bin/nutch inject urls -crawlId FirstCrawl
Failed with exit value 1.
root@IndiStage:~#

I am new to Ubuntu (14.04), so I find it difficult to manage the directory structure and paths here.

InjectorJob is located in /usr/local/nutch/framework/apache-nutch-2.3/src/java/org/apache/nutch/crawl

JAVA_HOME is set to /usr/lib/jvm/java-7-openjdk-amd64

Solution

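Make sure you have compiled the Nutch source code. Then run the crawl command from ${APACHE_NUTCH_HOME}/runtime/local (or ${APACHE_NUTCH_HOME}/runtime/deploy/bin) instead of src/bin. The scripts under src/bin sit in the unbuilt source tree, with no compiled classes or jars next to them, which is most likely why org.apache.nutch.crawl.InjectorJob cannot be loaded.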
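
As a rough sketch of the advice above, the helper below checks whether a built crawl script exists before you try to run it. The function name find_crawl_script is made up for illustration and is not part of Nutch. It assumes the source tree was built with ant runtime, which normally creates runtime/local/bin under the Nutch directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): print the path of the built
# crawl script under a Nutch source tree, or fail with a hint.
# Assumption: the tree is built with `ant runtime`, which creates
# runtime/local/bin/crawl next to the src directory.
find_crawl_script() {
    nutch_home="$1"
    script="$nutch_home/runtime/local/bin/crawl"
    if [ -x "$script" ]; then
        # The built script exists and is executable: report it.
        printf '%s\n' "$script"
    else
        echo "runtime/local not built; run 'ant runtime' in $nutch_home first" >&2
        return 1
    fi
}

# Example usage with the paths from the question (left commented out):
#   cd /usr/local/nutch/framework/apache-nutch-2.3 && ant runtime
#   crawl=$(find_crawl_script /usr/local/nutch/framework/apache-nutch-2.3) \
#       && "$crawl" urls FirstCrawl 2
```

If the function prints a path, the crawl is started from the built runtime instead of src/bin. If it fails, the source has probably not been compiled yet.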

I hope it helps,

Le Quoc Do
