【Question title】: Taking too much time in indexing while integrating Nutch 2.3, HBase and Solr
【Posted】: 2016-05-18 11:12:55
【Question】:

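I am integrating Nutch, HBase and Solr.

I have configured Nutch, HBase and Solr and have already crawled the websites. While integrating Nutch with Solr by following the guide "Integrating Nutch 2.3, HBase and Solr", I ran the command java -jar start.jar in /opt/solr-4.8.1/examples.

The process started, but it has been executing for about 10 days and is still running.

I cannot figure out what is wrong with it. Can anyone suggest what the problem is and how to resolve it?

Below are some details from the log file.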

INFO  - 2016-05-18 15:58:00.286; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=true,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  - 2016-05-18 15:58:00.287; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
INFO  - 2016-05-18 15:58:00.287; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
INFO  - 2016-05-18 15:58:00.288; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO  - 2016-05-18 15:58:00.288; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp=/solr path=/update params={waitFlush=true&optimize=true&wt=json&_=1463567280272} {optimize=} 0 2
INFO  - 2016-05-18 15:58:01.976; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=true,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  - 2016-05-18 15:58:01.976; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
INFO  - 2016-05-18 15:58:01.977; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
INFO  - 2016-05-18 15:58:01.977; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO  - 2016-05-18 15:58:01.978; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp=/solr path=/update params={waitFlush=true&optimize=true&wt=json&_=1463567281965} {optimize=} 0 2
INFO  - 2016-05-18 15:58:05.799; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/threads params={wt=json&_=1463567285780} status=0 QTime=8 
INFO  - 2016-05-18 15:58:09.267; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/properties params={wt=json&_=1463567289183} status=0 QTime=0 
INFO  - 2016-05-18 15:58:11.225; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/cores params={wt=json&_=1463567291213} status=0 QTime=1 
INFO  - 2016-05-18 15:58:11.260; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/cores params={wt=json&_=1463567291242} status=0 QTime=1 
INFO  - 2016-05-18 15:58:13.808; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/luke params={show=index&numTerms=0&wt=json&_=1463567293791} status=0 QTime=1 
INFO  - 2016-05-18 15:58:13.821; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/replication params={wt=json&command=details&_=1463567293794} status=0 QTime=1 
INFO  - 2016-05-18 15:58:13.837; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/system params={wt=json&_=1463567293796} status=0 QTime=4 
INFO  - 2016-05-18 15:58:13.845; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.html&_=1463567293798} status=0 QTime=0 
INFO  - 2016-05-18 15:58:13.856; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={action=status&wt=json&_=1463567293801} status=503 QTime=1 
INFO  - 2016-05-18 16:54:35.235; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={wt=json&since=0&_=1463570675193} status=0 QTime=1 
INFO  - 2016-05-18 16:54:38.820; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/replication params={wt=json&command=details&_=1463570678769} status=0 QTime=0 
INFO  - 2016-05-18 16:54:38.821; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/luke params={show=index&numTerms=0&wt=json&_=1463570678764} status=0 QTime=2 
INFO  - 2016-05-18 16:54:38.823; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={action=status&wt=json&_=1463570678776} status=503 QTime=0 
INFO  - 2016-05-18 16:54:38.829; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.html&_=1463570678774} status=0 QTime=1 
INFO  - 2016-05-18 16:54:38.831; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/system params={wt=json&_=1463570678772} status=0 QTime=11 
INFO  - 2016-05-18 16:54:46.728; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/mbeans params={stats=true&wt=json&_=1463570686705} status=0 QTime=5 
INFO  - 2016-05-18 16:54:49.533; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/mbeans params={stats=true&wt=json&_=1463570689477} status=0 QTime=3 
INFO  - 2016-05-18 16:54:52.762; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/replication params={wt=json&command=details&_=1463570692692} status=0 QTime=0 
INFO  - 2016-05-18 16:56:33.180; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={wt=json&since=0&_=1463570793166} status=0 QTime=0 
INFO  - 2016-05-18 16:56:38.195; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/luke params={show=index&numTerms=0&wt=json&_=1463570798128} status=0 QTime=0 
INFO  - 2016-05-18 16:56:38.198; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/replication params={wt=json&command=details&_=1463570798132} status=0 QTime=0 
INFO  - 2016-05-18 16:56:38.199; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={action=status&wt=json&_=1463570798137} status=503 QTime=0 
INFO  - 2016-05-18 16:56:38.201; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.html&_=1463570798135} status=0 QTime=0 
INFO  - 2016-05-18 16:56:38.211; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/system params={wt=json&_=1463570798133} status=0 QTime=12 
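A quick way to read a log like the one above is to tally which request paths appear and how many commits found nothing to write. The following is a minimal sketch, not part of the original post: the function name `summarize_solr_log` is hypothetical, and it assumes the log keeps the `path=...` field and the "No uncommitted changes" message exactly as shown in the excerpt.

```python
import re
from collections import Counter

# Assumption: every request line carries a "path=/..." field, as in the excerpt.
PATH_RE = re.compile(r"path=(\S+)")


def summarize_solr_log(lines):
    """Return (requests per path, number of commits that had nothing to write)."""
    paths = Counter()
    empty_commits = 0
    for line in lines:
        # Solr prints this message when a commit is requested but no
        # documents were added since the previous commit.
        if "No uncommitted changes" in line:
            empty_commits += 1
        match = PATH_RE.search(line)
        if match:
            paths[match.group(1)] += 1
    return paths, empty_commits


if __name__ == "__main__":
    # Hypothetical file name; point this at the actual Solr log.
    with open("solr.log", encoding="utf-8") as handle:
        counts, empty = summarize_solr_log(handle)
    for path, count in counts.most_common():
        print(f"{count:6d}  {path}")
    print(f"commits with nothing to write: {empty}")
```

If the tally contains only `/admin/...` and `/replication` requests plus `/update` calls that coincide with empty commits, as in the excerpt, that suggests Solr is idling and serving its admin UI rather than receiving documents from Nutch.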

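【Comments】:

  • Wow, how could you wait 10 days...
  • I started working on other tasks and left the integration process running as it was.
  • You can check the Solr log to see what is going on.
  • Yes, I checked the log; each time, Solr performs some operation such as org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/system params={wt=json&_=1463570798133} status=0 QTime=12

Tags: java hadoop solr lucene nutch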


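【Solution 1】:

Finally, I solved it. Basically, java -jar start.jar downloads the jar files, so no indexing happens at that step; it fetches the Solr 4.8 jars, and Solr then has to be configured. For performance reasons, I replaced Solr 4.8 with Solr 5.2.1, and Solr now works fine.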
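To confirm that documents are actually reaching Solr after a change like this, one option is to ask the core how many documents it holds. The sketch below is an illustration rather than the answerer's method: the helper names are hypothetical, the base URL assumes Solr's default port 8983 on localhost, and the core name `collection1` is taken from the log in the question (it may differ in a Solr 5.2.1 setup).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_count_url(base_url, core):
    """Build a select query that matches all documents but returns no rows."""
    query = urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return f"{base_url}/{core}/select?{query}"


def parse_num_found(body):
    """Extract the total hit count from a Solr JSON select response."""
    return json.loads(body)["response"]["numFound"]


def count_documents(base_url="http://localhost:8983/solr", core="collection1"):
    # Assumed defaults: adjust the host, port and core name to your setup.
    with urlopen(build_count_url(base_url, core), timeout=10) as resp:
        return parse_num_found(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print("documents in index:", count_documents())
```

A count that stays at 0 after the Nutch indexing step has run would point to the crawl data never being sent to Solr, regardless of how long the Solr process itself has been up.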
