【Question Title】: How to read a file and keep hyphens using STL C++
【Posted】: 2018-05-10 01:23:13
【Question】:

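I have to read a text file, convert it to lowercase, and remove non-alphabetic characters, but I also need to keep hyphens without treating a hyphen as a word. Here is my code. It counts the hyphen as a word in UnknownWords. I just want to keep the hyphen and count only the words on its left and right in the .txt file.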

My output:

110 Known words read
79 Unknown words read //it is because it is counting hyphen as word

Expected output:

110 Known words read
78 Unknown words read   

Code:

void WordStats::ReadTxtFile(){
    std::ifstream ifile(Filename);
    if(!ifile)
    {
        std::cerr << "Error Opening file " << Filename << std::endl;
        exit(1);
    }
    for (std::string word; ifile >> word; )
    {

        transform (word.begin(), word.end(), word.begin(), ::tolower);
        word.erase(std::remove_if(word.begin(), word.end(), [](char c)
        {
            return (c < 'a' || c > 'z') && c != '\'' && c != '-';
        }),  word.end());
        if (Dictionary.count(word))
        {
            KnownWords[word].push_back(ifile.tellg());
        }
        else
        {
            UnknownWords[word].push_back(ifile.tellg()); 
        }
    }
    //  std::string word; ifile >> word;


    std::cout << KnownWords.size() << " known words read." << std::endl;
    std::cout << UnknownWords.size() << " unknown words read." << std::endl;
}

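【Comments】:

  • What does the input file look like?
  • I tested your code; if word == "this#-is" it gets changed to "this-is", so I think it should work. Are there spaces around the hyphens in the file?
  • It is a text file like an article or paragraph, containing some colons, inverted commas, and hyphens.
  • Yes, there are spaces around the hyphens.
  • ifile >> word stops at whitespace.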

Tags: c++ vector stl ifstream read-write


【Solution 1】:

If you don't want a word that is just "-" to be added on its own, check for it before adding the word to the word vector:

for (std::string word; ifile >> word; )
{

    std::transform(word.begin(), word.end(), word.begin(), ::tolower);
    word.erase(std::remove_if(word.begin(), word.end(), [](char c)
    {
        return (c < 'a' || c > 'z') && c != '\'' && c != '-';
    }),  word.end());
    // Skip tokens that are empty or consist only of hyphens
    if (word.find_first_not_of("-") == std::string::npos) {
        continue;
    }
    if (Dictionary.count(word))
    {
        KnownWords[word].push_back(ifile.tellg());
    }
    else
    {
        UnknownWords[word].push_back(ifile.tellg()); 
    }
}

【Comments】:
