【Posted】: 2014-12-06 22:01:43
【Problem description】:
I am having trouble running my MapReduce job. As part of the task I use a reduce-side join, which involves multiple map classes and a single reducer class.
Both of my map methods execute, but my reducer is never executed/invoked from my driver class.
As a result, the final output contains only the data collected during the map phase.
Am I using the wrong input and output types in the reduce phase? Is there an input/output type mismatch between the map and reduce phases?
Please help me with this.
Here is my code:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class CompareInputTest extends Configured implements Tool {

    public static class FirstFileInputMapperTest extends Mapper<LongWritable, Text, Text, Text> {
        private Text word = new Text();
        private String keyData, data, sourceTag = "S1$";

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] values = value.toString().split(";");
            keyData = values[1];
            data = values[2];
            context.write(new Text(keyData), new Text(data + sourceTag));
        }
    }

    public static class SecondFileInputMapperTest extends Mapper<LongWritable, Text, Text, Text> {
        private Text word = new Text();
        private String keyData, data, sourceTag = "S2$";

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] values = value.toString().split(";");
            keyData = values[1];
            data = values[2];
            context.write(new Text(keyData), new Text(data + sourceTag));
        }
    }

    public static class CounterReducerTest extends Reducer {
        private String status1, status2;

        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            System.out.println("in reducer");
            for (Text value : values) {
                String splitVals[] = value.toString().split("$");
                System.out.println("in reducer");
                /*
                 * identifying the record source that corresponds to a common key and
                 * parsing the values accordingly
                 */
                if (splitVals[0].equals("S1")) {
                    status1 = splitVals[1] != null ? splitVals[1].trim() : "status1";
                } else if (splitVals[0].equals("S2")) {
                    // getting file2 and using it to obtain the message
                    status2 = splitVals[2] != null ? splitVals[2].trim() : "status2";
                }
            }
            context.write(key, new Text(status1 + "$$$"));
        }

        public static void main(String[] args) throws Exception {
            int res = ToolRunner.run(new Configuration(), new CompareInputTest(), args);
            System.exit(res);
        }
    }

    public int run(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "count");
        job.setJarByClass(CompareInputTest.class);
        MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, FirstFileInputMapperTest.class);
        MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, SecondFileInputMapperTest.class);
        job.setReducerClass(CounterReducerTest.class);
        // job.setNumReduceTasks(1);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        return (job.waitForCompletion(true) ? 0 : 1);
    }
}
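A minimal, Hadoop-free sketch of two details worth checking in the code above (the class names `Base`, `RawSub`, and `Pitfalls` are made up for illustration): extending a raw generic base class means a subclass method with narrower parameter types overloads rather than overrides, and `String.split` treats its argument as a regex, in which `$` is a metacharacter:

```java
// 1) Subclassing a raw generic type: handle(String) below does NOT
//    override handle(Object), so the base implementation still runs --
//    the same mechanism by which a raw 'extends Reducer' leaves the
//    base class's default identity reduce() in effect.
class Base<T> {
    public void handle(T value) { System.out.println("base handle"); }
    public void dispatch(T value) { handle(value); }
}

class RawSub extends Base { // raw: T is erased to Object
    public void handle(String value) { System.out.println("sub handle"); }
}

public class Pitfalls {
    public static void main(String[] args) {
        Base b = new RawSub();
        b.dispatch("x"); // prints "base handle", not "sub handle"

        // 2) '$' anchors end-of-input in a regex, so this split is a no-op:
        System.out.println("OPEN$S1".split("$").length);   // 1
        // escaping it splits on the literal character:
        System.out.println("OPEN$S1".split("\\$").length); // 2
    }
}
```

Parameterizing the reducer (e.g. `extends Reducer<Text, Text, Text, Text>`) and marking `reduce` with `@Override` would make the compiler catch the first problem at build time.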
【Discussion】:
- Which version of Hadoop?
Tags: java hadoop mapreduce reduce