Map Reduce: Unable to run the code due to number of errors

【Question title】: Map Reduce: Unable to run the code due to number of errors
【Posted】: 2021-10-14 04:14:36
【Question】:

Please have a look at the following code.

Map.java

public class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
 private final static IntWritable one = new IntWritable(1);
 private Text word = new Text();

 @Override
 public void map(LongWritable key, Text value, Context context)
   throws IOException, InterruptedException {
  String line = value.toString();
  StringTokenizer tokenizer = new StringTokenizer(line);
  while (tokenizer.hasMoreTokens()) {
   word.set(tokenizer.nextToken());
   context.write(word, one);
  }
 }
}

Reduce.java

public class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
 @Override
 protected void reduce(
   Text key,
   java.lang.Iterable<IntWritable> values,
   org.apache.hadoop.mapreduce.Reducer<Text, IntWritable, Text, IntWritable>.Context context)
   throws IOException, InterruptedException {
  int sum = 0;
  for (IntWritable value : values) {
   sum += value.get();
  }
  context.write(key, new IntWritable(sum));
 }
}

WordCount.java

public class WordCount {

    public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();

          Job job = new Job(conf, "wordcount");
          job.setJarByClass(WordCount.class);

          job.setOutputKeyClass(Text.class);
          job.setOutputValueClass(IntWritable.class);

          job.setMapperClass(Map.class);
          job.setReducerClass(Reduce.class);

          job.setInputFormatClass(TextInputFormat.class);
          job.setOutputFormatClass(TextOutputFormat.class);

          FileInputFormat.addInputPath(job, new Path(args[0]));
          FileOutputFormat.setOutputPath(job, new Path(args[1]));

          job.waitForCompletion(true);
        }

}

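The whole code was taken from this Map Reduce tutorial (http://cloud.dzone.com/articles/how-run-elastic-mapreduce-job).

As soon as I copied these classes into Eclipse, it showed a lot of errors such as "cannot be resolved to a type". That is reasonable, because the classes this code uses are not found in the default JDK, and the tutorial gave no instructions to download any libraries. I ignored the errors, thinking they had something to do with Elastic Map Reduce on the server side.

After I uploaded it to Amazon Elastic Map Reduce, created a job flow, and ran the program, it gave me the following error.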

Exception in thread "main" java.lang.Error: Unresolved compilation problems: 
    Configuration cannot be resolved to a type
    Configuration cannot be resolved to a type
    Job cannot be resolved to a type
    Job cannot be resolved to a type
    Text cannot be resolved to a type
    IntWritable cannot be resolved to a type
    TextInputFormat cannot be resolved to a type
    TextOutputFormat cannot be resolved to a type
    FileInputFormat cannot be resolved
    Path cannot be resolved to a type
    FileOutputFormat cannot be resolved
    Path cannot be resolved to a type

    at WordCount.main(WordCount.java:5)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:187)

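How can I make this code work? Do I have to download any libraries for this? How do I get this code to run and see the results? This is my first experience with Amazon and Elastic Map Reduce, and yes, my first experience with Big Data as well.

Please help.

【Discussion】:

Tags: java hadoop amazon-web-services amazon-ec2 mapreduce

【Solution 1】:

So you mean that you did not add any Hadoop jars to your project, ignored the compile errors, and hoped it would run on the server side, where hadoop-client is installed?

If that is the case, it cannot work.

You have to add hadoop-client.XX.jar to your project; any version will do.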

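【Discussion】:

Thanks for the reply. Where is this jar? I only copied that code and ran it.
You can download hadoop-client.xxx.jar from cloudera.com/content/support/en/downloads.html; choose the version you want to add. Apache Hadoop itself is another source: hadoop.apache.org/releases.html. However, if you want to run your job on the Amazon servers, the Hadoop version you choose has to match the one on the Amazon servers. I don't know which version Amazon uses; maybe they publish it through an API or in their documentation, so you can check there.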
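To make the version-matching point concrete, here is a minimal sketch of a probe that reports which Hadoop version is on the classpath. It assumes the class org.apache.hadoop.util.VersionInfo and its static getVersion() method are available in the Hadoop jars in use; it calls them through reflection so the probe itself compiles and runs even when no Hadoop jar is present. The names HadoopVersionProbe and callStaticNoArg are made up for this illustration.

```java
import java.lang.reflect.Method;
import java.util.Optional;

class HadoopVersionProbe {

    // Calls a public static no-argument method by name and returns its result as text.
    // Returns Optional.empty() if the class or method is missing or the call fails.
    static Optional<String> callStaticNoArg(String className, String methodName) {
        try {
            Method method = Class.forName(className).getMethod(methodName);
            Object result = method.invoke(null); // null receiver: the method must be static
            return Optional.ofNullable(result).map(Object::toString);
        } catch (ReflectiveOperationException | LinkageError | RuntimeException e) {
            return Optional.empty();
        }
    }

    // Assumption: Hadoop exposes its version through VersionInfo.getVersion().
    static Optional<String> hadoopVersion() {
        return callStaticNoArg("org.apache.hadoop.util.VersionInfo", "getVersion");
    }

    public static void main(String[] args) {
        System.out.println(hadoopVersion()
                .map(version -> "Hadoop version on the classpath: " + version)
                .orElse("Hadoop is not on the classpath"));
    }
}
```

Running it once with the local build path and once on the cluster should show whether the two versions agree. On a machine with Hadoop installed, the `hadoop version` command reports the same information.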

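【Solution 2】:

Add all the Hadoop jars to the project in Eclipse. Once your code shows no errors, you can export it as a jar and run that jar in Hadoop.

To add the jars, go to "Build Path", choose "Configure Build Path", then "Add External JARs" (select all the Hadoop jars and add them).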
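After adding the jars, one rough way to confirm that nothing is still missing is a small check like the sketch below. It tries to load each type named in the "cannot be resolved" error by its fully qualified name. The package names assume the code uses the newer org.apache.hadoop.mapreduce API, which the Job and Context usage in the question suggests; the class name HadoopClasspathCheck is made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HadoopClasspathCheck {

    // Fully qualified names of the types listed in the error output.
    // Assumption: the job uses the org.apache.hadoop.mapreduce (new) API.
    static final List<String> REQUIRED = Arrays.asList(
            "org.apache.hadoop.conf.Configuration",
            "org.apache.hadoop.fs.Path",
            "org.apache.hadoop.io.IntWritable",
            "org.apache.hadoop.io.Text",
            "org.apache.hadoop.mapreduce.Job",
            "org.apache.hadoop.mapreduce.lib.input.FileInputFormat",
            "org.apache.hadoop.mapreduce.lib.input.TextInputFormat",
            "org.apache.hadoop.mapreduce.lib.output.FileOutputFormat",
            "org.apache.hadoop.mapreduce.lib.output.TextOutputFormat");

    // True if the class can be located on the classpath; the class is not initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, HadoopClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the subset of the given class names that cannot be loaded.
    static List<String> missing(List<String> classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isOnClasspath(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> notFound = missing(REQUIRED);
        if (notFound.isEmpty()) {
            System.out.println("All Hadoop classes used by WordCount are on the classpath.");
        } else {
            System.out.println("Missing from the classpath:");
            for (String name : notFound) {
                System.out.println("  " + name);
            }
        }
    }
}
```

If the check lists missing classes, the corresponding jar has not been added to the build path yet. Note that the source files also need matching import statements for these types; the jars alone do not remove the Eclipse errors.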

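【Discussion】:

【Solution 3】:

For anyone who runs into this error:

Right-click the project you created.

Build Path -> Configure Build Path > on the Libraries tab, choose Add External JARs.

The Hadoop jars can be found under File System > usr > lib.

Browse to File System > usr > lib > hadoop and add all the jar files, from hadoop-annotations.jar to the last jar [parquet-tools.jar].

Then choose Add External JARs again, and this time add all the jars in the client folder.

Path: File System > usr > lib > hadoop > client

【Discussion】: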