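【Posted】: 2014-08-05 11:13:31
【Question】:
I am following the tutorial here. I have installed Solr and Nutch separately, and each works fine on its own. The problem appears when I try to integrate them. From earlier posts on this site, I gathered that the schema file might be the issue. As the tutorial describes, I copied Nutch's schema.xml over Solr's schema.xml and restarted Solr. Solr then stopped because of a configuration problem. So I merged the two files instead, copying the contents of each into the other alongside its existing contents. Now (as before) I get this error: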
Indexer: starting at 2014-08-05 11:10:21
Indexer: deleting gone documents: false
Indexer: URL filtering: false
Indexer: URL normalizing: false
Active IndexWriters :
SOLRIndexWriter
solr.server.url : URL of the SOLR instance (mandatory)
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : use authentication (default false)
solr.auth : username for authentication
solr.auth.password : password for authentication
Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:114)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)
Can anyone suggest what should be done? I am using apache-nutch-1.8 and solr-4.9.0. This is what my hadoop.log file looks like:
2014-08-05 12:50:05,032 INFO crawl.Injector - Injector: starting at 2014-08-05 12:50:05
2014-08-05 12:50:05,033 INFO crawl.Injector - Injector: crawlDb: -dir/crawldb
2014-08-05 12:50:05,033 INFO crawl.Injector - Injector: urlDir: urls
.
.
.
.
.
2014-08-05 13:04:21,255 INFO solr.SolrIndexWriter - Indexing 1 documents
2014-08-05 13:04:21,286 WARN mapred.LocalJobRunner - job_local1310160376_0001
org.apache.solr.common.SolrException: Bad Request
Bad Request
request: http://my-solr-url:8983/solr/update?wt=javabin&version=2
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:155)
at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:118)
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:44)
at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:467)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:535)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
2014-08-05 13:04:21,544 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:114)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)
2014-08-05 13:10:37,855 INFO crawl.Injector - Injector: starting at 2014-08-05 13:10:37
.
.
.
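Since the schema file is the suspected cause, one way to narrow it down is to list the fields that Nutch's schema.xml declares but the schema Solr actually loads does not. This is only a minimal sketch: it assumes the Bad Request comes from an undeclared field (the log above does not confirm that, and Solr's own log usually carries the specific reason), the helper names `field_names` and `missing_fields` are made up for illustration, and the two sample schemas are hypothetical stand-ins for the real files. It also ignores `dynamicField` patterns, which could match a field that looks missing here.

```python
import xml.etree.ElementTree as ET


def field_names(schema_xml):
    """Return the set of names declared by <field> elements in a schema.xml string."""
    root = ET.fromstring(schema_xml)
    # iter() walks the whole tree, so this works whether or not the
    # <field> elements are wrapped in a <fields> element.
    return {f.get("name") for f in root.iter("field") if f.get("name")}


def missing_fields(nutch_schema_xml, solr_schema_xml):
    """Return, sorted, the fields the first schema declares and the second lacks."""
    return sorted(field_names(nutch_schema_xml) - field_names(solr_schema_xml))


# Hypothetical, heavily shortened samples; the real files are much longer.
NUTCH_SAMPLE = """
<schema name="nutch" version="1.5">
  <fields>
    <field name="id" type="string" stored="true" indexed="true"/>
    <field name="title" type="text_general" stored="true" indexed="true"/>
    <field name="content" type="text_general" stored="false" indexed="true"/>
    <field name="tstamp" type="date" stored="true" indexed="false"/>
  </fields>
</schema>
"""

SOLR_SAMPLE = """
<schema name="example" version="1.5">
  <field name="id" type="string" stored="true" indexed="true"/>
  <field name="title" type="text_general" stored="true" indexed="true"/>
  <field name="_version_" type="long" stored="true" indexed="true"/>
</schema>
"""

if __name__ == "__main__":
    print(missing_fields(NUTCH_SAMPLE, SOLR_SAMPLE))  # ['content', 'tstamp']
```

To run it against real files, the strings could be read from disk first, for example `open(path, encoding="utf-8").read()` for each schema.xml. An empty result would suggest looking elsewhere, such as field types or the Solr log itself.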
【Comments】:
- Thanks @JayeshBhoyar, I have added the logs. Any help would be much appreciated!