【Posted】: 2016-09-09 07:27:36
【Question】:
I am new to Hadoop and I want to run a MapReduce job. However, I get an error saying that Hadoop cannot find the mapper class. This is the error:
INFO mapred.JobClient: Task Id : attempt_201608292140_0023_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: TransMapper1
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:718)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
I checked the permissions on my jar file and they are fine. These are the jar file's permissions:
-rwxrwxrwx.
This is the code that launches the MapReduce job:
import java.io.File;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class mp {
    public static void main(String[] args) throws Exception {
        Job job1 = new Job();
        job1.setJarByClass(mp.class);
        FileInputFormat.addInputPath(job1, new Path(args[0]));
        String oFolder = args[0] + "/output";
        FileOutputFormat.setOutputPath(job1, new Path(oFolder));
        job1.setMapperClass(TransMapper1.class);
        job1.setReducerClass(TransReducer1.class);
        job1.setMapOutputKeyClass(LongWritable.class);
        job1.setMapOutputValueClass(DnaWritable.class);
        job1.setOutputKeyClass(LongWritable.class);
        job1.setOutputValueClass(Text.class);
        // The post omits job submission; since the task log shows the job ran,
        // a call like this is assumed and added here for completeness.
        System.exit(job1.waitForCompletion(true) ? 0 : 1);
    }
}
Here is the mapper class (TransMapper1):
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
public class TransMapper1 extends Mapper<LongWritable, Text, LongWritable, DnaWritable> {
    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        LongWritable bamWindow = new LongWritable(Long.parseLong(tokenizer.nextToken()));
        LongWritable read = new LongWritable(Long.parseLong(tokenizer.nextToken()));
        LongWritable refWindow = new LongWritable(Long.parseLong(tokenizer.nextToken()));
        IntWritable chr = new IntWritable(Integer.parseInt(tokenizer.nextToken()));
        DoubleWritable dist = new DoubleWritable(Double.parseDouble(tokenizer.nextToken()));
        DnaWritable dnaW = new DnaWritable(bamWindow, read, refWindow, chr, dist);
        context.write(bamWindow, dnaW);
    }
}
I compile and package the code with the following commands:
javac -classpath $MR_HADOOPJAR ${rootPath}mp/src/*.java
jar cvfm $mpJar $MR_MANIFEST ${rootPath}mp/src/*.class
This is the output of the command jar -tf mp/src/mp.jar:
META-INF/
META-INF/MANIFEST.MF
mnt/miczfs/tide/mp/src/DnaWritable.class
mnt/miczfs/tide/mp/src/mp.class
mnt/miczfs/tide/mp/src/TransMapper1.class
mnt/miczfs/tide/mp/src/TransMapper2.class
mnt/miczfs/tide/mp/src/TransReducer1.class
mnt/miczfs/tide/mp/src/TransReducer2.class
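The listing above is worth reading against how Java resolves class names inside a jar: a class in the default package, such as TransMapper1, is looked up as the entry TransMapper1.class at the jar root, whereas every class entry here carries the prefix mnt/miczfs/tide/mp/src/. The following is a minimal sketch in plain Java (no Hadoop needed) of that name-to-entry mapping. The class JarEntryCheck and its methods are hypothetical names introduced for illustration, and the "at root" layout is an assumed alternative, not something shown in the post.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the original post.
final class JarEntryCheck {

    // A class loader maps a binary class name to a jar entry path by
    // replacing '.' with '/' and appending ".class".
    static String expectedEntry(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True only if the listing contains the exact entry the loader would request.
    static boolean isLoadable(List<String> entries, String className) {
        return entries.contains(expectedEntry(className));
    }

    public static void main(String[] args) {
        // Entries copied from the `jar -tf` output in the question.
        List<String> asPosted = Arrays.asList(
            "META-INF/MANIFEST.MF",
            "mnt/miczfs/tide/mp/src/mp.class",
            "mnt/miczfs/tide/mp/src/TransMapper1.class");
        // Assumed alternative layout with the classes at the jar root.
        List<String> atRoot = Arrays.asList(
            "META-INF/MANIFEST.MF", "mp.class", "TransMapper1.class");

        System.out.println(isLoadable(asPosted, "TransMapper1")); // false
        System.out.println(isLoadable(atRoot, "TransMapper1"));   // true
    }
}
```

One way to get entries at the root is the jar tool's -C option, which changes into a directory before adding files (for example `-C ${rootPath}mp/src .`). That form adds everything in the directory, so it is offered as a possibility to try rather than a confirmed fix for this setup.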
I run the job with this:
mpJar=${rootPath}mp/src/mp.jar
mp_exec=mp
export HADOOP_CLASSPATH=$mpJar
hadoop $mp_exec <input path>
I also tried this command:
hadoop jar $mp_exec <input path>
I then changed the way I create the jar file to this command:
jar cf $mpJar $MR_MANIFEST ${rootPath}mp/src/*.class
With this change, the error became:
Exception in thread "main" java.lang.ClassNotFoundException: mp
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
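The switch from `jar cvfm` to `jar cf` may matter here for a reason separate from the entry paths: without the `m` option, the jar tool does not treat $MR_MANIFEST as the manifest and instead adds it as an ordinary file, so any Main-Class it declares is not recorded. Hadoop's RunJar appears to use the manifest's Main-Class when one is present and otherwise to read the class name from the command line. Below is a minimal sketch, in plain Java with no Hadoop dependency, for checking what a jar's manifest actually declares. ManifestCheck, writeJar, and mainClassOf are hypothetical names, and the jars are temporary stand-ins rather than the real mp.jar.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper for illustration; not part of the original post.
final class ManifestCheck {

    // Writes a temporary jar whose manifest declares mainClass,
    // or no Main-Class at all when mainClass is null.
    static Path writeJar(String mainClass) throws IOException {
        Manifest mf = new Manifest();
        Attributes attrs = mf.getMainAttributes();
        // Manifest-Version must be set, or the main attributes are not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        if (mainClass != null) {
            attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        }
        Path jar = Files.createTempFile("manifest-demo", ".jar");
        try (JarOutputStream jos = new JarOutputStream(Files.newOutputStream(jar), mf)) {
            // No class entries are needed to inspect the manifest.
        }
        return jar;
    }

    // Returns the Main-Class recorded in the jar's manifest, or null if none.
    static String mainClassOf(Path jar) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile())) {
            Manifest mf = jf.getManifest();
            return mf == null ? null : mf.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    public static void main(String[] args) throws IOException {
        Path withMain = writeJar("mp");
        Path withoutMain = writeJar(null);
        System.out.println(mainClassOf(withMain));    // mp
        System.out.println(mainClassOf(withoutMain)); // null
        Files.deleteIfExists(withMain);
        Files.deleteIfExists(withoutMain);
    }
}
```

Either way, the class named as the entry point still has to exist at the matching entry path, so a main class of mp would need mp.class at the jar root.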
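So before, the problem was that the program could not find the mapper class, and now it cannot find the main class! Any ideas?
Thanks, everyone.
【Comments】: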
-
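The error you are actually getting comes from a missing class (java.lang.ClassNotFoundException: TransMapper1). Are you sure you compiled all the Java sources into the right directory on HDFS so that Hadoop can find TransMapper1? You could also try building a jar file from your classes and running it in Hadoop.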
-
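@Dean219 I have added how I compile and run the code. Could you tell me where on HDFS I should put the compiled files? I only moved the input files to HDFS. Should I move the jar file as well?
Tags: java hadoop mapreduce mapper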