Migrating Data from HBase to FileSystem. (Writing Reducer output to Local or Hadoop filesystem) - Answers

【Question title】: Migrating Data from HBase to FileSystem. (Writing Reducer output to Local or Hadoop filesystem)
【Posted】: 2011-12-08 10:53:58
【Question description】:

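My goal is to migrate data from an HBase table to a flat file (say, in CSV format). I used TableMapReduceUtil.initTableMapperJob(tableName, scan, GetCustomerAccountsMapper.class, Text.class, Result.class, job); to scan the HBase table, with a TableMapper as the Mapper. My challenge is getting the Reducer to dump the row values (normalized into a flat format) to the local (or HDFS) file system. My problem is that I can see neither the Reducer's logs nor any files at the path I specified in the Reducer.

This is my second or third MR job, and the first serious one. After two days of trying, I still have no idea how to achieve my goal.

It would be great if someone could point me in the right direction.

Here is my reducer code -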

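public void reduce(Text key, Iterable<Result> rows, Context context)
        throws IOException, InterruptedException {
    // Opens the local file system of whichever node runs this reduce task.
    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path dir = new Path("/data/HBaseDataMigration/" + tableName + "_Reducer" + "/" + key.toString());

    // One file per key; overwrite it if it already exists.
    FSDataOutputStream fsOut = fs.create(dir, true);
    try {
        for (Result row : rows) {
            try {
                // key.toString() rather than Bytes.toString(key.getBytes()):
                // Text.getBytes() returns the backing array, which is only valid up to getLength().
                String normRow = NormalizeHBaserow(key.toString(), row, tableName);
                fsOut.writeBytes(normRow);
                // context.write(new Text(key.toString()), new Text(normRow));
            } catch (BadHTableResultException ex) {
                throw new IOException(ex);
            }
        }
        fsOut.flush();
    } finally {
        // Close the stream even if normalization fails part-way through.
        fsOut.close();
    }
}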
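
For reference, the flat-format normalization step could look roughly like the sketch below. NormalizeHBaserow itself is not shown here, so this is only a stand-in under stated assumptions: the class CsvRowSketch, the methods escape and toCsvLine, and the sample column names are all hypothetical, and the cells are assumed to have already been read out of the Result into a map of column name to string value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in for the row normalization; not the actual NormalizeHBaserow.
public class CsvRowSketch {

    // Quote a field when it contains a comma, a double quote, or a line break,
    // doubling any embedded double quotes (the usual CSV convention).
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Assumed layout: row key first, then the cell values in the map's
    // iteration order, terminated by a newline so each row is one line.
    static String toCsvLine(String rowKey, Map<String, String> cells) {
        StringJoiner line = new StringJoiner(",");
        line.add(escape(rowKey));
        for (String value : cells.values()) {
            line.add(escape(value));
        }
        return line.toString() + "\n";
    }

    public static void main(String[] args) {
        // Sample data made up for illustration only.
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("info:name", "Acme, Inc.");
        cells.put("info:status", "active");
        System.out.print(toCsvLine("cust-001", cells)); // prints: cust-001,"Acme, Inc.",active
    }
}
```

Terminating each row with a newline matters here because writeBytes does not add one, so rows written back to back would otherwise run together in the output file.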

My reducer output configuration:

Path out = new Path(args[0] + "/" + tableName+"Global");
FileOutputFormat.setOutputPath(job, out);

Thanks in advance - Panks

【Comments】:

Tags: hadoop mapreduce hbase

【Solution 1】:

Why not reduce into HDFS and, once the job has finished, use hadoop fs to export the file:

hadoop fs -get /user/hadoop/file localfile
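
If the job ran with more than one reducer, the output directory typically holds several files named like part-r-00000, part-r-00001, and so on. After copying that directory to the local disk, the pieces can be stitched into a single CSV. The following is a minimal sketch using only the Java standard library; the class name MergePartFiles and both command-line arguments are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: concatenates locally copied reducer output files.
public class MergePartFiles {

    // Append every file whose name starts with "part-" in localDir to target,
    // in file-name order. The target should live outside localDir.
    static void merge(Path localDir, Path target) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(localDir, "part-*")) {
            for (Path part : stream) {
                parts.add(part);
            }
        }
        // Directory listing order is unspecified, so sort explicitly.
        Collections.sort(parts);
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: MergePartFiles <local-output-dir> <merged-file>");
            System.exit(1);
        }
        merge(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The hadoop fs -getmerge command performs a similar concatenation while copying from HDFS, so the helper above is mainly useful when the files have already been fetched with -get.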

If you really do want to handle it in the reduce phase, take a look at this article on OutputFormat on InfoQ.

【Comments】: