File stops being read after a newline character (answers)

【Question title】: File stops being read after a newline character
【Posted】: 2014-09-25 17:39:53
【Question】:

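I am trying to create a spam filter, and I need to train the model first. I read words from a text file in which the word "spam" or "ham" is the first word of each paragraph, followed by the words that occur in the mail, each followed by its number of occurrences. The file contains several such paragraphs. My program is able to read the first paragraph, that is, the words and their occurrence counts.

The problem is that the file stops being read once a newline character is encountered, so the next paragraph is never read. I have a feeling that the way I check for the newline character that ends a paragraph is not entirely correct.

I have included two paragraphs so you get an idea of the training text. Training text file: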

/000/003 ham need 1 fw 1 35 2 39 1 thanks 1 thread 2 40 1 copy 1 else 1 correlator 1 under 1 company 1 25 1 he 2 26 2 168 1 29 2 content 4 1 1 6 1 5 1 4 1 comment 2 we 1 john 3 17 1 use 1 15 1 20 1 class 1 may 1 a 1 back 1 l 1 01 1 production 1 i 1 is 1 10 2 713 2 v6 1 p 1 original 2

/000/031 ham don 1 kim 5 dave 1 39 1 customer 1 38 2 thanks 1 over 1 thread 2 year 1 correlator 1 under 1 williams 1 monday 2 number 2 kitchen 1 168 1 29 1 content 4 3 2 2 6 system 2 1 2 7 1 6 1 5 2 4 1 9 1 each 1 8 1 view 2

#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main()
{
    int V = 0; // Total number of words

    ifstream fin;
    fin.open("train", ios::in);
    string word;
    int wordnum;
    int N[2] = {0};
    char c, skip;
    for (int i = 0; i < 8; i++) fin >> skip; // There are 8 characters before the first word of the paragraph
    while (!fin.fail())
    {
        fin >> word;
        if (word == "spam") N[0]++;
        else if (word == "ham") N[1]++;
        else
        {
            V++;
            fin >> wordnum;
        }
        int p = fin.tellg();
        fin >> c; // To check for newline. If it's there, we skip the first eight characters of the new paragraph because those characters aren't supposed to be read
        if (c == '\n')
        {
            for (int i = 0; i < 8; i++) fin >> skip;
        }
        else fin.seekg(p);
    }

    cout << "\nSpam: " << N[0];
    cout << "\nHam :" << N[1];
    cout << "\nVocab: " << V;

    fin.close();

    return 0;
}

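【Comments on the question】:

while(!fin.fail()) { /* ... */ } is probably not much better than what is described here. Here is some starter, though, that may give you a few techniques that could help solve your problem.
Could you provide a sample train file (by editing your question)?
Gosh, who is teaching everyone these incorrect ways of using streams?!
Consider using istream::peek (cplusplus.com/reference/istream/istream/peek) instead of the tellg()/seekg() combination you are using.
@MadPhysicist Regarding "who voted": that was me, and I read the question carefully. The MCVE policy is not met; the example input is missing.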
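
The two stream suggestions in the comments above (test the extraction itself instead of looping on !fin.fail(), and use peek() instead of tellg()/seekg()) can be sketched roughly as follows. This is only an illustration, not the asker's program: tokens_per_line is a hypothetical helper, a std::istringstream stands in for the training file, and the sketch assumes there is no trailing whitespace before a newline.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the number of whitespace-separated tokens
// on each non-empty line of `text`.
std::vector<int> tokens_per_line(const std::string& text) {
    std::istringstream in(text);  // stands in for the ifstream in the question
    std::vector<int> counts;
    int current = 0;
    std::string token;

    // The extraction is the loop condition, so the body only runs
    // when a token was actually read.
    while (in >> token) {
        ++current;
        // peek() looks at the next character without extracting it,
        // so no tellg()/seekg() round trip is needed.
        if (in.peek() == '\n') {
            counts.push_back(current);
            current = 0;
        }
    }
    // Last line without a terminating newline.
    if (current > 0) counts.push_back(current);
    return counts;
}
```

For the input "ham a 1\n\nspam b 2\n" this yields one count per paragraph; the blank line between paragraphs is skipped by the next operator>> call.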

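Tags: c++ file newline

【Solution 1】:

std::ifstream::operator>>() does not store \n in the variable; it discards it. By default, formatted extraction skips leading whitespace, newlines included, so fin >> c never yields '\n' and the check c == '\n' can never succeed. If you need to handle whitespace and \n characters, you can use std::ifstream::get().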
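
A small sketch of the difference, assuming a std::istringstream in place of the file; both function names are made up for this illustration:

```cpp
#include <sstream>
#include <string>

// Counts '\n' characters with unformatted input. get() hands back every
// character, including spaces and newlines.
int count_newlines_with_get(const std::string& text) {
    std::istringstream in(text);
    int newlines = 0;
    char c;
    while (in.get(c)) {
        if (c == '\n') ++newlines;
    }
    return newlines;
}

// Same loop with formatted input. operator>> skips leading whitespace
// by default, so c is never '\n' and the counter stays at 0.
int count_newlines_with_extraction(const std::string& text) {
    std::istringstream in(text);
    int newlines = 0;
    char c;
    while (in >> c) {
        if (c == '\n') ++newlines;
    }
    return newlines;
}
```

With the same input, the get() version sees every newline while the operator>> version sees none, which is why the newline test in the question never fires.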

【Comments】:

Your English is fine. Thanks for your answer.
You can also read each whole line into a std::string (via std::getline()) and then split it into words.
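
The getline-then-split approach might look something like this. It is a sketch under assumptions, not a drop-in fix: TrainingCounts and count_training_data are hypothetical names, each paragraph is assumed to sit on a single line, the leading identifier (for example /000/003) is assumed to contain no spaces, and the rest of the line is assumed to be strict word/count pairs.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch.
struct TrainingCounts {
    int spam = 0;   // paragraphs labelled "spam"
    int ham = 0;    // paragraphs labelled "ham"
    int words = 0;  // word/count pairs read in total
};

TrainingCounts count_training_data(std::istream& in) {
    TrainingCounts result;
    std::string line;
    // getline consumes the '\n' itself, so no manual newline check is needed.
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string id, label;
        // Blank lines between paragraphs fail here and are skipped.
        if (!(fields >> id >> label)) continue;

        if (label == "spam") ++result.spam;
        else if (label == "ham") ++result.ham;

        std::string word;
        int count;
        // Stops at the end of the line or at the first malformed pair.
        while (fields >> word >> count) ++result.words;
    }
    return result;
}
```

To use it on the file from the question, open it with std::ifstream fin("train"); and pass fin to count_training_data. Reading the identifier as an ordinary token also removes the need to skip exactly eight characters by hand.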