【发布时间】:2015-04-16 19:55:10
【问题描述】:
我正在尝试使用 hadoop map reduce,但我不想在 Mapper 中一次映射每一行,而是想一次映射整个文件。
所以我找到了这两个类 (https://code.google.com/p/hadoop-course/source/browse/HadoopSamples/src/main/java/mr/wholeFile/?r=3) 这应该可以帮助我做到这一点。
我得到一个编译错误,上面写着:
类型中的方法setInputFormat(Class) JobConf 不适用于参数 (类) Driver.java /ex2/src 第 33 行 Java 问题
我将 Driver 类更改为
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import forma.WholeFileInputFormat;
/*
* Driver
* The Driver class is responsible of creating the job and commiting it.
*/
public class Driver {
public static void main(String[] args) throws Exception {
JobConf conf = new JobConf(Driver.class);
conf.setJobName("Get minimun for each month");
conf.setOutputKeyClass(IntWritable.class);
conf.setOutputValueClass(IntWritable.class);
conf.setMapperClass(Map.class);
conf.setCombinerClass(Reduce.class);
conf.setReducerClass(Reduce.class);
// previous it was
// conf.setInputFormat(TextInputFormat.class);
// And it was changed it to :
conf.setInputFormat(WholeFileInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
FileInputFormat.setInputPaths(conf,new Path("input"));
FileOutputFormat.setOutputPath(conf,new Path("output"));
System.out.println("Starting Job...");
JobClient.runJob(conf);
System.out.println("Job Done!");
}
}
我做错了什么?
【问题讨论】:
-
将pastebin的内容放到问题中。