[Posted]: 2015-05-13 08:54:07
[Question]:
I am following the instructions on the Mahout site to convert an existing file into a sequence file:
VectorWriter vectorWriter = SequenceFile.createWriter(filesystem,
configuration,
outfile,
LongWritable.class,
SparseVector.class);
long numDocs = vectorWriter.write(new VectorIterable(), Long.MAX_VALUE);
I have included the Mahout jar in my Maven project:
<dependency>
<groupId>org.apache.mahout</groupId>
<artifactId>mahout-core</artifactId>
<version>0.9</version>
</dependency>
However, it does not write the file.
I get this error:
Caused by: java.lang.NullPointerException
at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:963)
at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.<init>(SequenceFile.java:1136)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:397)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:284)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:265)
On further investigation, it turns out to be caused by:
Serilization class not found: java.lang.ClassNotFoundException: org.apache.hadoop.io.serializer.WritableSerialization
This suggests that I am missing a jar. Does anyone know which one?
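To narrow this down, a small check like the following can report whether the class named in the log message is visible at runtime and, if so, which jar it was loaded from. This is only a diagnostic sketch: the class name `ClasspathCheck` and the helper `locate` are made up for illustration, and it looks the class up through the thread context class loader on the assumption that this is the loader that matters here.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /**
     * Returns the location (typically a jar URL) the class was loaded from,
     * a placeholder for JDK classes, or null if the class cannot be found.
     * Hypothetical helper, not part of Hadoop or Mahout.
     */
    static String locate(String className) {
        try {
            // 'false' avoids running static initializers; we only want to know
            // whether the class is visible to this class loader.
            Class<?> cls = Class.forName(
                    className, false, Thread.currentThread().getContextClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap/JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the error message above.
        String name = "org.apache.hadoop.io.serializer.WritableSerialization";
        String where = locate(name);
        System.out.println(where == null
                ? name + " is NOT visible on the classpath"
                : name + " loaded from " + where);
    }
}
```

Running it with the same classpath as the failing program shows whether the class is really absent. If it is found here but the program still fails, the problem is more likely a class loader mismatch than a missing jar.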
[Discussion]:
Tags: java hadoop mahout mahout-recommender sequencefile