Counting the number of words in a file, then writing the results to another file

【Question title】: Counting number of words in a File, then writing the results to another File
【Posted】: 2015-11-25 18:32:53
【Question】:

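I need to read in a file named "input.txt" and count the words in that file. Then I have to write them to another file named output.txt.

For example: input.txt contains "The quick QUICK brown fox"

output.txt should look like this: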

2 quick

3 brown

4 fox

So far I have the following code, but I don't know whether I'm on the right track.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.Scanner;

public class CountWords {
public static void main(String[] args) throws FileNotFoundException {

    File file = new File("input.txt");
    Scanner sc = new Scanner(new FileInputStream(file));
    PrintWriter writer = new PrintWriter("output.txt");
    int count = 0;
    while (sc.hasNext()) {
        String word = sc.next();
        if (word.indexOf("\\") == -1){
            count++;
            writer.printf("%3d",count + " " + word);  //should print ----> |   # word|

        }
        break;
    }
    writer.close(); //close print writer
    sc.close();    //close input file
}

}

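【Comments】:

Your sample output does not match your description. You say you want to count the words in the file, but fox appears there only once, not 4 times, brown appears once, not 3 times, and so on.
If you really do want to count words and it is just your example that is wrong, then you cannot output anything until you have read all of the input (what if, for example, the first word and the last word in the input are the same?). You need to keep track of the unique words and count their occurrences, so take a look at the Map interface and its various implementations.

Tags: java file-io printwriter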

【Solution 1】:

Assuming a word is a run of characters delimited by spaces, you can easily count the number of words in a file with the stream API:

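 // needs java.nio.file.Files, java.nio.file.Paths, java.util.stream.Stream
 final long count = Files.lines( Paths.get( "myFile.txt" ) )
                         .flatMap( line -> Stream.of( line.split( "[ ]+" ) ) )
                         .filter( word -> !word.isEmpty( ) ) // blank lines and leading spaces produce empty tokens
                         .count( );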

If you need the counts grouped by word:

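// needs java.nio.file.*, java.util.Map, java.util.function.Function, java.util.stream.*
final Map<String, Long> collect =
    Files.lines( Paths.get( "myFile.txt" ) )
         .flatMap( line -> Stream.of( line.split( "[ ]+" ) ) )
         .filter( word -> !word.isEmpty( ) ) // skip empty tokens from blank lines
         .map( String::toLowerCase ) // to count quick and QUICK as same
         .collect( Collectors.groupingBy( Function.identity( ), Collectors.counting( ) ) );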

Here the keys of the map are the words, and each value is the number of times that word occurs in the file.
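
To tie this back to the question, here is a minimal sketch of the grouping approach that also writes the result to a file. The class name StreamWordCount and the helper methods are made up for illustration, the file names input.txt and output.txt are taken from the question, and the TreeMap is an assumption added only so that the output order is predictable (alphabetical).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name, used only for this sketch.
class StreamWordCount {

    // Counts whitespace-separated words, ignoring case.
    static Map<String, Long> countWords(Stream<String> lines) {
        return lines
                .flatMap(line -> Stream.of(line.trim().split("\\s+")))
                .filter(word -> !word.isEmpty())      // a blank line yields one empty token
                .map(String::toLowerCase)             // count quick and QUICK as the same word
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        TreeMap::new,                 // assumption: sorted keys for stable output
                        Collectors.counting()));
    }

    // Turns each map entry into a "<count> <word>" line.
    static List<String> format(Map<String, Long> counts) {
        return counts.entrySet().stream()
                .map(entry -> entry.getValue() + " " + entry.getKey())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        Path in = Paths.get("input.txt");    // file names as used in the question
        Path out = Paths.get("output.txt");
        // try-with-resources closes the underlying file once the stream is consumed
        try (Stream<String> lines = Files.lines(in)) {
            Files.write(out, format(countWords(lines)));
        }
    }
}
```

Keeping the counting separate from the file handling makes it easy to try countWords on a Stream.of(...) of sample lines before pointing it at a real file.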

【Comments】:

Use .flatMap(Pattern.compile("\\s+")::splitAsStream). That way you do not recompile the regex for every line, and you do not create an intermediate array.
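
As a rough sketch of what that comment suggests: the pattern is compiled once and its splitAsStream method is passed to flatMap. The class name SplitAsStreamSketch and the constant WHITESPACE are invented for this example, and the isEmpty filter is an extra precaution (a line with leading whitespace produces an empty first token).

```java
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical class name, used only for this sketch.
class SplitAsStreamSketch {

    // Compiled once and reused for every line.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Counts whitespace-separated words across all lines.
    static long countWords(Stream<String> lines) {
        return lines.flatMap(WHITESPACE::splitAsStream)   // no intermediate array per line
                    .filter(word -> !word.isEmpty())      // drop empty leading tokens
                    .count();
    }

    public static void main(String[] args) {
        // Three words, an empty line, then two more words.
        System.out.println(countWords(Stream.of("The quick QUICK", "", " brown fox"))); // prints 5
    }
}
```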

【Solution 2】:

I assume your file is a text file, since you call it input.txt. Now, on to the actual code:

Scanner scn = new Scanner(file); //Where file is your file
HashMap<String, Integer> words = new HashMap<>(); //Where you will be saving your words
while(scn.hasNext()){
    String str = scn.next().toLowerCase();
    words.put(str, words.containsKey(str)?words.get(str)+1:1);
}

You now have the words map, which holds every word together with its number of occurrences. To write them to another file:

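//Needs java.io.BufferedWriter and java.io.FileWriter
String build = "";
for(String str : words.keySet()){
    if(!build.isEmpty())build+="\n"; //compare content, not references
    build+=words.get(str)+" "+str;
}
BufferedWriter bw = new BufferedWriter(new FileWriter(outputfile.getAbsoluteFile())); //Where outputfile is your output File
bw.write(build);
bw.close();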

Example: suppose you have an input.txt with the following contents:

The quick brown fox JUMPED over jumped the lazy brown DOG QUICK dog

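The Scanner now walks through every word, because its default delimiter is whitespace, and adds 1 to that word's entry in the map. Since each word is passed through toLowerCase(), JUMPED and jumped are treated as the same word.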

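After that, we write the output to another file. Once the code has run, it will contain the following lines (HashMap does not guarantee iteration order, so they may come out in a different order):

2 the

2 quick

2 brown

1 fox

2 jumped

1 over

1 lazy

2 dog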
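
Putting the two snippets of this answer together, a self-contained version might look like the sketch below. It is not the answerer's exact code: the class name ScannerWordCount is made up, the file names come from the question, and it swaps in a LinkedHashMap (so words come out in first-seen order), Map.merge, and try-with-resources (so both files are closed even if something fails).

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical class name, used only for this sketch.
class ScannerWordCount {

    // Reads whitespace-separated tokens and counts them, ignoring case.
    static Map<String, Integer> countWords(Scanner scanner) {
        Map<String, Integer> words = new LinkedHashMap<>(); // assumption: keep first-seen order
        while (scanner.hasNext()) {
            // insert 1 for a new word, otherwise add 1 to the existing count
            words.merge(scanner.next().toLowerCase(), 1, Integer::sum);
        }
        return words;
    }

    public static void main(String[] args) throws FileNotFoundException {
        File inputFile = new File("input.txt");    // file names as used in the question
        File outputFile = new File("output.txt");
        // both resources are closed automatically at the end of the block
        try (Scanner scanner = new Scanner(inputFile);
             PrintWriter writer = new PrintWriter(outputFile)) {
            for (Map.Entry<String, Integer> entry : countWords(scanner).entrySet()) {
                writer.println(entry.getValue() + " " + entry.getKey());
            }
        }
    }
}
```

Because countWords takes a Scanner, it can also be tried on a plain string, for example new Scanner("The quick QUICK brown fox"), without touching any files.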

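【Comments】:

I tried implementing this code to see whether it would work, and I have been playing around with it, but I still cannot get it to compile. The two lines giving me trouble are: "for (String str : word)" ---> will not compile: "can only iterate over an array or an instance of java.lang.Iterable"; and "BufferedWriter bw = new BufferedWriter(new FileWriter(outputfile.getAbsoluteFile()));" ---> will not compile because "outputfile cannot be resolved".
@MaxBolin ... Fixed the words part by changing it to keySet. You need to supply the output file yourself.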