【Question title】: How does std::getline decide to skip the last empty line?
【Posted】: 2018-01-18 11:43:10
【Question】:

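I noticed some strange behavior while reading a file line by line. If the file ends with \n (an empty line), that line may be skipped... but not always, and I cannot see what causes it to be skipped.

I wrote this small function, which splits a string into lines, to reproduce the problem easily: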

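#include <sstream>
#include <string>
#include <vector>

// Splits inputStr into lines using std::getline (default delimiter '\n').
std::vector<std::string> SplitLines( const std::string& inputStr )
{
    std::vector<std::string> lines;

    std::stringstream str;
    str << inputStr;

    std::string sContent;
    // The loop ends once getline extracts nothing and sets failbit.
    while ( std::getline( str, sContent ) )
    {
        lines.push_back( sContent );
    }

    return lines;
}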

When I test it (http://cpp.sh/72dgw), I get this output:

(1) "a\nb"       was splitted to 2 line(s):"a" "b" 
(2) "a"          was splitted to 1 line(s):"a" 
(3) ""           was splitted to 0 line(s):
(4) "\n"         was splitted to 1 line(s):"" 
(5) "\n\n"       was splitted to 2 line(s):"" "" 
(6) "\nb\n"      was splitted to 2 line(s):"" "b" 
(7) "a\nb\n"     was splitted to 2 line(s):"a" "b" 
(8) "a\nb\n\n"   was splitted to 3 line(s):"a" "b" ""

So the last \n is skipped in (6), (7) and (8), which is fine. But why not in (4) and (5)?

What is the rationale behind this behavior?

【Comments】:

    Tags: c++ string getline eol


    【Answer 1】:

    There is an interesting post that briefly mentions this "strange" behavior: getline() sets failbit and skips last line

    As Rob's answer mentions, \n is a terminator (which is in fact why it is called End Of Line), not a separator. This means a line is defined as "ending with '\n'", not as "separated by '\n'".
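To make the terminator/separator distinction concrete, here is a minimal sketch of what a separator-based split could look like. `SplitBySeparator` is a hypothetical helper written for this illustration; it does not come from the question or from the standard library, and it is only meant as a contrast to the std::getline loop.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (an assumption for illustration, not from the post):
// treats '\n' as a SEPARATOR, so N newlines always produce N + 1 fields.
std::vector<std::string> SplitBySeparator(const std::string& input)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;)
    {
        const std::string::size_type pos = input.find('\n', start);
        if (pos == std::string::npos)
        {
            // Whatever follows the last '\n' is a field, even if it is empty.
            fields.push_back(input.substr(start));
            break;
        }
        fields.push_back(input.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}
```

Under this separator reading, "a\nb\n" gives three fields ("a", "b" and an empty one), whereas the getline loop from the question gives two lines, because it treats the final '\n' as the end of the line "b" rather than as the start of a new one.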

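    It was not obvious to me at first how this answers the question, but it does. Rephrased as follows, it becomes crystal clear:

    If your content contains x occurrences of '\n', you end up with x lines, or x+1 lines if there are some extra non-'\n' characters at the end of the file.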

    (1) "a\nb"       splitted to 2 line(s):"a" "b"    (1 EOL + extra characters = 2 lines)
    (2) "a"          splitted to 1 line(s):"a"        (0 EOL + extra characters = 1 line)
    (3) ""           splitted to 0 line(s):           (0 EOL + no extra characters = 0 line)
    (4) "\n"         splitted to 1 line(s):""         (1 EOL + no extra characters = 1 line) 
    (5) "\n\n"       splitted to 2 line(s):"" ""      (2 EOL + no extra characters = 2 lines)
    (6) "\nb\n"      splitted to 2 line(s):"" "b"     (2 EOL + no extra characters = 2 lines)
    (7) "a\nb\n"     splitted to 2 line(s):"a" "b"    (2 EOL + no extra characters = 2 lines)
    (8) "a\nb\n\n"   splitted to 3 line(s):"a" "b" "" (3 EOL + no extra characters = 3 lines)
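The counting rule above can be checked mechanically. The sketch below is only an illustration: `PredictedLineCount` and `GetlineLineCount` are hypothetical names introduced here, the first applying the "x EOL, plus one if extra characters follow" rule and the second counting what a std::getline loop actually returns.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: applies the rule from the answer.
// x occurrences of '\n' give x lines, plus one more line if some
// non-'\n' characters remain after the last '\n'.
std::size_t PredictedLineCount(const std::string& s)
{
    const std::size_t eols =
        static_cast<std::size_t>(std::count(s.begin(), s.end(), '\n'));
    const bool hasTrailingChars = !s.empty() && s.back() != '\n';
    return eols + (hasTrailingChars ? 1 : 0);
}

// Hypothetical helper: counts the lines a std::getline loop produces,
// in the same way as SplitLines in the question.
std::size_t GetlineLineCount(const std::string& s)
{
    std::istringstream in(s);
    std::string line;
    std::size_t count = 0;
    while (std::getline(in, line))
    {
        ++count;
    }
    return count;
}
```

For the eight inputs listed above, both functions should agree, for example 1 for "\n", 2 for "a\nb\n" and 3 for "a\nb\n\n".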
    

    【Comments】:
