【Title】: Issue in running a MapReduce program
【Posted】: 2016-04-09 19:41:03
【Description】:

I am trying to write a custom word-count program, but I get the error: "Error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.IntWritable, received org.apache.hadoop.io.LongWritable". As you can see, I do not use LongWritable anywhere in the program. Please help me figure out where I went wrong and how to fix it.

Program:

public class customeWordCount {

public static class Map extends Mapper<Text, IntWritable, Text, IntWritable> {
    private static IntWritable count;
    private static Text wordCropped;

    public void map(IntWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        TreeMap map = new TreeMap(String.CASE_INSENSITIVE_ORDER);
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);

        String word = null;
        while (tokenizer.hasMoreTokens()) {
            word = tokenizer.nextToken();

            if (word.length() >= 3) {
                wordCropped.set(word.substring(0, 3));

                if (map.containsKey(wordCropped.toString().toLowerCase())) {
                    count = (IntWritable) map.get(wordCropped);
                    context.write(wordCropped, (IntWritable) count);
                } else {
                    context.write(wordCropped, (IntWritable) count);
                }
            }
        }
    }
}
public static class Reduce extends Reducer<Text,IntWritable,Text,IntWritable>{

    public void reduce(Text key, Iterable<IntWritable> values,
            Context context)
            throws IOException,InterruptedException {
        int sum=0;
        // TODO Auto-generated method stub
        for(IntWritable x: values)
        {
            sum++;

        }
        context.write(key, new IntWritable(sum));

    }

}

public static void main(String[] args) throws Exception {
    // TODO Auto-generated method stub

    //JobConf conf = new JobConf(Al.class);
    Configuration conf= new Configuration();


    //conf.setJobName("mywc");
    Job job = new Job(conf,"Custome_count");

    job.setJarByClass(customeWordCount.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);

    //conf.setMapperClass(Map.class);
    //conf.setReducerClass(Reduce.class);
    job.setMapOutputKeyClass(IntWritable.class);

    //Defining the output value class for the mapper

    job.setMapOutputValueClass(Text.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);



    Path outputPath = new Path(args[1]);

        //Configuring the input/output path from the filesystem into the job

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

        //deleting the output path automatically from hdfs so that we don't have delete it explicitly

    outputPath.getFileSystem(conf).delete(outputPath);

        //exiting the job only if the flag value becomes false

    System.exit(job.waitForCompletion(true) ? 0 : 1);
}

}

【Comments】:

  • I think there is some issue in the main class. I tried changing job.setMapOutputValueClass(Text.class); to job.setMapOutputValueClass(IntWritable.class); and made a similar change from job.setMapInputValueClass(IntWritable.class); to job.setMapInputValueClass(Text.class); but then I get the error "Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable". Could any expert here help me? Thanks in advance :)
  • Could someone please help resolve this issue.
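For what it's worth, the driver's declared map-output types must mirror the Mapper's output type parameters, not its input ones. Assuming the mapper is meant to emit Text keys and IntWritable values (as its last two generic parameters suggest), the consistent job settings would be the following fragment of the job setup (not a complete fix on its own, since the mapper's input types also need correcting):

```java
// These must match the Mapper's *output* generics,
// i.e. the last two type parameters of Mapper<KIn, VIn, Text, IntWritable>:
job.setMapOutputKeyClass(Text.class);          // not IntWritable.class
job.setMapOutputValueClass(IntWritable.class); // not Text.class

// Final (reducer) output types:
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
```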

Tags: mapreduce


【Solution 1】:

Since your input is a text file, the InputFormat that will be used is TextInputFormat. This format produces LongWritable keys (the byte offset of each line) and Text values (the line itself).

That being said, your mapper should be declared as:

public static class Map extends Mapper<LongWritable, Text, Text,IntWritable>

and your map method signature must be changed to:

public void map(LongWritable key, Text value, Context context)
        throws IOException,InterruptedException
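Note also that, independent of the key-type fix, the posted mapper never initializes wordCropped or count, so it would throw a NullPointerException at runtime, and count is written without ever being set. The per-line logic it seems to be aiming for (counting case-insensitive 3-letter prefixes of words) can be sketched and tested outside Hadoop in plain Java; the class and method names below are illustrative, not from the original post:

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class PrefixCount {
    // Count case-insensitive 3-letter prefixes of words of length >= 3,
    // mirroring what the mapper/reducer pair appears intended to compute.
    static Map<String, Integer> prefixCounts(String line) {
        Map<String, Integer> counts = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            String word = tokenizer.nextToken();
            if (word.length() >= 3) {
                String prefix = word.substring(0, 3).toLowerCase();
                counts.merge(prefix, 1, Integer::sum);  // increment, starting at 1
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(prefixCounts("apple apricot banana ap"));
        // → {app=1, apr=1, ban=1}  ("ap" is skipped: shorter than 3 chars)
    }
}
```

In the actual Hadoop job, the mapper would emit each prefix with a count of 1 and leave the summing to the reducer; the in-mapper TreeMap in the posted code is not needed for correctness.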

【Discussion】:

  • Thank you very much for your reply. I will incorporate your suggestions and report back... :)