How to read the same txt file again and again using BufferedReader? (Answers)

【Question title】: How to read same txt file again and again using BufferedReader?
【Posted】: 2018-10-18 10:37:34
【Question description】:

    String fileName = "words.txt"; // words.txt file contains 25,000 words
    String word;

    try {
        FileReader fileReader = new FileReader(fileName);
        BufferedReader bufferReader;

        ArrayList<String> arrBag;

        int count;
        bufferReader = new BufferedReader(fileReader);

        for (int i = 1; i <= maxWordLength; i++) { // maxWordLength is 22
            arrBag = new ArrayList<String>(); // arrBag contains all words with same length and then store to hash map.

            count = 0;

            bufferReader.mark(0);
            while ((word = bufferReader.readLine()) != null) {
                if (word.length() == i) {
                    arrBag.add(word);
                    count++;
                }
            }

            System.out.println("HashMap key : " + i + " has bag count : " + count);
            mapBagOfTasks.put(Integer.toString(i), arrBag); // mapBagOfTasks is HashMap where key is length of word and value is ArrayList of words with same length.

            bufferReader.reset();
        }
        if (fileReader != null) {
            fileReader.close();
        }
    }
    catch (FileNotFoundException e) {
        System.out.println("Input file not found");
        e.printStackTrace();
    }
    catch (IOException e) {
        System.out.println("Error while reading File '" + fileName + "'");
        e.printStackTrace();
    }

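I have a "words.txt" file containing 25,000 words. I want to store all words of the same length in an ArrayList, and then put that list into a HashMap where the key is the word length and the value is the ArrayList.

The problem I'm facing is that my program reads the file the first time but does not read the same file again. I tried using the mark() and reset() functions, but I still face the same problem. You can see this in the output below. How can I solve this?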

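My program's output is:
Max word length in file: 22
HashMap key : 1 has bag count : 26 // (means 26 words of length 1 were found)
HashMap key : 2 has bag count : 0
HashMap key : 3 has bag count : 0
HashMap key : 4 has bag count : 0
HashMap key : 5 has bag count : 0
HashMap key : 6 has bag count : 0
HashMap key : 7 has bag count : 0
HashMap key : 8 has bag count : 0
HashMap key : 9 has bag count : 0
HashMap key : 10 has bag count : 0
HashMap key : 11 has bag count : 0
HashMap key : 12 has bag count : 0
HashMap key : 13 has bag count : 0
HashMap key : 14 has bag count : 0
HashMap key : 15 has bag count : 0
HashMap key : 16 has bag count : 0
HashMap key : 17 has bag count : 0
HashMap key : 18 has bag count : 0
HashMap key : 19 has bag count : 0
HashMap key : 20 has bag count : 0
HashMap key : 21 has bag count : 0
HashMap key : 22 has bag count : 0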

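【Comments】:

Why should it be re-read at all?
What if you find pneumonoultramicroscopicsilicovolcanoconiosis in that file?
A more serious question: what happens if you have duplicate words... do you want arrBag to contain only unique words?
I agree with Hovercraft Full Of Eels. You can simply collect all the words (the lines, in your example) by reading the file only once (you can use a List to keep duplicates and preserve the input order), and then apply whatever business logic you need.
Interesting. I would love an answer to this. Reading from an SSD is not that slow, and I have a few million lines, so I don't want a workaround; I want to go back to the first line again.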
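
For readers who, like the last commenter, really do want to start again from the first line: the Javadoc for `BufferedReader.mark(int readAheadLimit)` says that a reset may fail once the reader has read up to or beyond the limit, so `mark(0)` cannot be relied on to rewind a whole file. One simple option, sketched here, is to open a fresh reader for each pass. The class name `RereadFile` and the helper `wordsOfLength` are hypothetical names chosen for illustration; `words.txt` and the maximum length of 22 come from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class RereadFile {

    // Hypothetical helper: performs one full pass over the file and
    // returns the lines whose length equals wordLength.
    public static List<String> wordsOfLength(Path file, int wordLength) throws IOException {
        List<String> matches = new ArrayList<>();
        // A freshly opened reader always starts at the first line;
        // try-with-resources closes it when the pass is done.
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            String word;
            while ((word = reader.readLine()) != null) {
                if (word.length() == wordLength) {
                    matches.add(word);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("words.txt"); // file name taken from the question
        int maxWordLength = 22;             // value taken from the question
        for (int i = 1; i <= maxWordLength; i++) {
            int count = wordsOfLength(file, i).size();
            System.out.println("HashMap key : " + i + " has bag count : " + count);
        }
    }
}
```

This costs one full read of the file per length (22 reads in the question), which is why the answers recommend a single pass instead.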

Tags: java eclipse hashmap bufferedreader readfile

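【Solution 1】:

Compared with processing data in memory, reading from disk is an expensive operation, so you should read the file only once. I suggest doing something like this: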

    Map<Integer, List<String>> lengthToWords = new HashMap<>();
    while ((word = bufferReader.readLine()) != null) {
        int length = word.length();
        // <= so that words of exactly maxWordLength are kept too
        if (length <= maxWordLength) {
            // create the list for this length on first use, then append the word
            lengthToWords.computeIfAbsent(length, k -> new ArrayList<>()).add(word);
        }
    }
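
The fragment above relies on `bufferReader`, `word` and `maxWordLength` from the question. As a self-contained illustration of the same single-pass idea, here is a minimal sketch that swaps the explicit loop for the Java 8 stream collector `Collectors.groupingBy`. This is a different technique from the one in the answer, and the class name `GroupByLength` and the helper `groupByLength` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class GroupByLength {

    // Hypothetical helper: groups words by their length in a single pass.
    public static Map<Integer, List<String>> groupByLength(Stream<String> words) {
        return words.collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) throws IOException {
        // Files.lines reads the file lazily; try-with-resources closes it afterwards.
        try (Stream<String> lines = Files.lines(Paths.get("words.txt"))) {
            Map<Integer, List<String>> lengthToWords = groupByLength(lines);
            lengthToWords.forEach((length, bag) ->
                System.out.println("HashMap key : " + length + " has bag count : " + bag.size()));
        }
    }
}
```

Unlike the loop in the question, lengths for which no word exists simply do not appear as keys in the resulting map.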

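【Comments】:

【Solution 2】:

IO is usually the slowest part of any program. Unless the file you are working with is larger than the available memory, you should read the whole file into memory once and then operate on it there. Based on your description of what you are trying to do, this is the code I would write.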

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.HashMap;

    public class WordLength {

        public static void main(String[] args) {

            String fileName = "words.txt";
            int maxWordLength = 0;

            HashMap<Integer, ArrayList<String>> mapBagOfTasks = new HashMap<>();

            // populate the HashMap; try-with-resources closes the reader even if reading fails
            try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {

                String word;
                while ((word = br.readLine()) != null) {

                    int count = word.length();
                    if (count > maxWordLength) {
                        maxWordLength = count;
                    }

                    // if an array list for words of length count is not in the map, put in a new one
                    if (!mapBagOfTasks.containsKey(count)) {
                        mapBagOfTasks.put(count, new ArrayList<>());
                    }

                    // get the array list for words of length count
                    ArrayList<String> arrBag = mapBagOfTasks.get(count);

                    // add word to that array list
                    arrBag.add(word);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }

            // loop over every length from 0 (empty lines) up to and including the longest word found
            for (int key = 0; key <= maxWordLength; key++) {
                if (mapBagOfTasks.containsKey(key)) {
                    ArrayList<String> value = mapBagOfTasks.get(key);

                    System.out.println("HashMap key : " + key + " has bag count " + value.size());
                } else {
                    System.out.println("HashMap key : " + key + " has bag count 0");
                }
            }
        }
    }
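
This answer's opening advice, to read the whole file into memory once and operate on it there, can also be combined with the asker's original loop over lengths. The following is only a sketch under that assumption: `Files.readAllLines` loads every line into a list, and each pass then scans the list rather than the file. The class name `InMemoryPasses` and the helper `bagsByLength` are hypothetical; the String keys and the names `mapBagOfTasks` and `arrBag` follow the question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InMemoryPasses {

    // Hypothetical helper: mirrors the question's loop over lengths 1..maxWordLength,
    // but every pass iterates over the in-memory list instead of re-reading the file.
    public static Map<String, List<String>> bagsByLength(List<String> words, int maxWordLength) {
        Map<String, List<String>> mapBagOfTasks = new HashMap<>();
        for (int i = 1; i <= maxWordLength; i++) {
            List<String> arrBag = new ArrayList<>();
            for (String word : words) {
                if (word.length() == i) {
                    arrBag.add(word);
                }
            }
            // Key is the word length as a String, as in the question.
            mapBagOfTasks.put(Integer.toString(i), arrBag);
        }
        return mapBagOfTasks;
    }

    public static void main(String[] args) throws IOException {
        // The only disk read: load all lines of the file at once.
        List<String> words = Files.readAllLines(Paths.get("words.txt"));
        Map<String, List<String>> bags = bagsByLength(words, 22); // 22 is taken from the question
        for (int i = 1; i <= 22; i++) {
            System.out.println("HashMap key : " + i + " has bag count : " + bags.get(Integer.toString(i)).size());
        }
    }
}
```

The repeated in-memory passes do more work than the single-pass code above, but they keep the structure the asker started with and store an empty list for lengths that have no words.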

【Comments】:

Basic operations (contains, get, add, loop) do not need comments to explain them. Ideally, code should be named well enough that comments are unnecessary.