Reading a file in Java hdfs

【Question title】: Reading a file in Java hdfs
【Posted】: 2012-10-31 19:25:08
【Question】:

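I ran into a problem when running my program on the cluster, so I decided to read an HDFS file directly inside the map and reduce functions. How can I read an HDFS file line by line and store the lines in an ArrayList?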

【Comments】:

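With TextInputFormat, the default InputSplit is a FileSplit, and each value passed to map is one whole line of the file. What exactly is the problem you are running into?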

【Solution 1】:

Just a code snippet for demonstration:

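import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Inside a mapper or reducer method; context is the Mapper/Reducer context
Path path = new Path(filePath);
FileSystem fs = path.getFileSystem(context.getConfiguration());
List<String> lines = new ArrayList<String>();
// try-with-resources closes the reader and the underlying HDFS stream, even on error.
// Assumption: the file is UTF-8 text; change the charset if it is not.
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
    String line;
    while ((line = br.readLine()) != null) {
        lines.add(line);
    }
}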
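
The line-reading part of the snippet above does not depend on Hadoop at all: the stream returned by fs.open(path) is an ordinary java.io.InputStream (FSDataInputStream extends DataInputStream). As a rough sketch, the loop could be pulled into a small helper that can be tried locally without a cluster. The class name LineReader and the method readLines are hypothetical names chosen for this illustration, not part of the Hadoop API, and the UTF-8 charset is an assumption about the file's encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop.
final class LineReader {
    private LineReader() {}

    // Reads every line from the stream into a list and closes the stream.
    // Assumes the content is UTF-8 text.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

In a mapper or reducer this would be called as LineReader.readLines(fs.open(path)). If the file is the same for every record, it is probably worth loading it once in setup() rather than on each map or reduce call, since every call would otherwise reopen and reread the whole file.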

【Discussion】: