【Question title】: Reading an input file line by line using string stream
【Posted】: 2017-03-23 17:09:10
【Question description】:

I have a data file, "records.txt", in the following format:

2 100 119 107 89 125 112 121 99 124 126 123 103 128 77 85 86 115 66 117 106 75 74 76 96 93 73 109 127 110 67 65 80 
1 8 5 23 19 2 36 13 16 24 59 15 22 48 49 57 46 47 27 51 6 30 7 31 41 17 43 53 34 37 42 61 54 
2 70 122 81 83 72 82 105 88 95 108 94 114 98 102 71 104 68 113 78 120 84 97 92 116 101 90 111 91 69 118 87 79 
1 35 14 12 52 58 56 38 45 26 32 39 9 21 11 40 55 50 44 18 20 63 10 60 28 1 64 4 33 3 25 62 29 

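Each line begins with a 1 or a 2, indicating which batch it belongs to. I am trying to use a string stream to read each line and store the result in a struct, with the first number corresponding to the batch number and the following 32 integers corresponding to the content, which goes into the struct's vector. I have been struggling with this a lot, and I followed the solution found here: How to read line by line

The resulting program is as follows: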

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

const string record1("records.txt");

// declaring a struct for each record
struct record
{
    int number;             // number of record
    vector<int> content;    // content of record    
};

int main()
{
    record batch_1;         // stores integers from 1 - 64
    record batch_2;         // stores integers from 65 - 128
    record temp;
    string line;

    // read the data file
    ifstream read_record1(record1.c_str());
    if (read_record1.fail()) 
    {
        cerr << "Cannot open " << record1 << endl;
        exit(EXIT_FAILURE);
    } 
    else
        cout << "Reading data file: " << record1 << endl;

    cout << "Starting Batch 1..." << endl;
    read_record1.open(record1.c_str());
    while(getline(read_record1, line))
    {       
        stringstream S;
        S << line;              // store the line just read into the string stream
        vector<int> thisLine;   // save the numbers read into a vector
        for (int c = 0; c < 33; c++)    // WE KNOW THERE WILL BE 33 ENTRIES
        {
            S >> thisLine[c];
            cout << thisLine[c] << " ";
        }
        for (int d = 0; d < thisLine.size(); d++) 
        {
            if (d == 0)
                temp.number = thisLine[d];
            else
                temp.content.push_back(thisLine[d]);
            cout << temp.content[d] << " ";
        }   

        if (temp.number == 1) 
        {
            batch_1.content = temp.content;
            temp.content.clear();
        }

        thisLine.clear();
    }

    // DUPLICATE ABOVE FOR BATCH TWO

    return 0;
}   

The program compiles and runs, and returns 0, but the cout statements inside the loop never execute, since the only console output is:

Starting Batch 1...

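Furthermore, if the code is duplicated for the second batch, I get a segmentation fault. So clearly this is not working as it should. I am not well versed in reading strings, so any help would be appreciated. Also, what should I do if the lines do not all have the same number of entries (e.g. one line has 33 entries and another has 15)?

【Comments】:

  • Here, in S >> thisLine[c];, thisLine[c] does not exist. You need to use the push_back member function to add elements to the vector.
  • Or at least initialize it with vector<int> thisLine(33,0)
  • I have tried that, with negative results. I just did it again and I still get the same result as above....?
  • Try storing it into a separate int and then pushing that into the vector: int temp; / S >> temp; / thisLine.push_back(temp);
  • Here is another problem: if (d == 0) temp.number = thisLine[d]; else temp.content.push_back(thisLine[d]); cout << temp.content[d] << " "; When d is 0, temp.content may still be empty, so accessing temp.content[d] may fail. If you use temp.content.at(d) instead, you will get a std::out_of_range exception.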
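
The difference between operator[] and at() mentioned in the last comment can be checked with a small sketch. The helper name at_throws is hypothetical and not part of the original program; it only reports whether at() rejects a given index. Indexing an empty vector with operator[] is undefined behavior, so the sketch deliberately exercises at() only.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if v.at(index) throws std::out_of_range.
bool at_throws(const std::vector<int>& v, std::size_t index)
{
    try {
        (void)v.at(index);   // bounds-checked access
    } catch (const std::out_of_range&) {
        return true;         // index was outside [0, v.size())
    }
    return false;
}
```

Calling at_throws on a default-constructed vector with index 0 should report true, while a vector constructed with 33 elements accepts indices 0 through 32.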

Tags: c++ stringstream read-write


【Solution 1】:

There are a number of problems with your code:

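  1. You are opening the input file twice. If you pass a filename to the std::ifstream constructor, it opens the file immediately, so there is no need to call open() afterwards. This is more than a style issue: calling open() on a stream that is already open fails and sets failbit, so the very first getline() fails and the loop body never runs.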

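  2. In the first for loop inside the while loop, you are trying to read integers directly into the local thisLine vector using operator>>, but that will not work because you have not allocated any memory for thisLine's array yet. Since you are expecting 33 integers, you can pre-size the array before reading: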

    vector<int> thisLine(33);
    

    Or:

    vector<int> thisLine;
    thisLine.resize(33);
    

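    However, since you also asked about the possibility of individual lines having different numbers of integers, you should not pre-size the vector at all, because you do not yet know the number of integers (although you can pre-allocate the vector's capacity if you know the maximum number of integers to expect). You can use a while loop instead of a for loop, so that you read the whole std::stringstream regardless of how many integers it actually contains: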

    thisLine.reserve(33); // optional
    
    int c;
    while (S >> c) {
        thisLine.push_back(c);
    }
    
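  3. In the second for loop, you are accessing temp.content[d], but when d is 0, temp.content has not been filled yet, so accessing temp.content[0] will not work (and had you used temp.content.at(d) instead, you would have gotten a std::out_of_range exception). You probably meant to do something more like this: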

    for (int d = 0; d < thisLine.size(); d++) 
    {
        if (d == 0)
            temp.number = thisLine[d];
        else {
            temp.content.push_back(thisLine[d]);
            cout << thisLine[d] << " ";
        }
    }   
    

    But even that can be simplified by getting rid of the push_back() loop entirely:

    if (thisLine.size() > 0)
    {
        temp.number = thisLine[0];
        thisLine.erase(thisLine.begin());
    }
    temp.content = thisLine;
    
    for (int d = 0; d < thisLine.size(); d++) 
        cout << thisLine[d] << " ";
    
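  4. You are looping through the entire file once, reading all of the records but processing only the batch 1 records. You say you have a duplicate set of loops to process the batch 2 records. That means you are going to re-read the entire file again, re-reading all of the records but ignoring the batch 1 records. That is a lot of wasted overhead. You should read the file once, separating the batches as needed, and then process them when the reading loop is finished, for example: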

    vector<record> batch_1;         // stores integers from 1 - 64
    vector<record> batch_2;         // stores integers from 65 - 128
    record temp;
    
    ...
    
    while(getline(read_record1, line))
    {       
        ...
        if (temp.number == 1) {
            batch_1.push_back(temp);
        } else {
            batch_2.push_back(temp);
        }
    }
    
    // process batch_1 and batch_2 as needed...
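
The effect of the double open in item 1 can be observed directly. The following is a minimal sketch, assuming the program is allowed to create a small scratch file in the working directory; the function name second_open_fails and the file name passed to it are hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: creates a small file at `path`, opens it through the
// ifstream constructor, then calls open() a second time on the same stream.
// Returns true if the stream is in a failed state after the second open().
bool second_open_fails(const std::string& path)
{
    {
        std::ofstream create(path.c_str());   // make sure the file exists
        create << "1 8 5 23\n";
    }

    std::ifstream in(path.c_str());           // first open, via the constructor
    if (!in)
        return false;                         // could not open at all

    in.open(path.c_str());                    // second open on an open stream
    return in.fail();                         // failbit is expected to be set
}
```

If this returns true, any getline() on that stream fails immediately, which matches the symptom of a read loop that produces no output.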
    

So, with that said, the corrected code should look more like this:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <cstdlib>   // exit, EXIT_FAILURE

using namespace std;

const string records_file("records.txt");

// declaring a struct for each record
struct record
{
    int number;             // number of record
    vector<int> content;    // content of record    
};

int main()
{
    vector<record> batch_1;         // stores integers from 1 - 64
    vector<record> batch_2;         // stores integers from 65 - 128
    record temp;
    string line;

    // read the data file
    ifstream read_records(records_file.c_str());
    if (read_records.fail()) 
    {
        cerr << "Cannot open " << records_file << endl;
        exit(EXIT_FAILURE);
    } 

    cout << "Reading data file: " << records_file << endl;

    cout << "Starting Batch 1..." << endl;

    while (getline(read_records, line))
    {       
        istringstream S(line);  // store the line just read into the string stream
        vector<int> thisLine;   // save the numbers read into a vector
        thisLine.reserve(33);   // WE KNOW THERE WILL BE 33 ENTRIES

        int c;
        while (S >> c) {
            thisLine.push_back(c);
            cout << c << " ";
        }

        temp.number = 0;
        temp.content.reserve(thisLine.size());

        for (int d = 0; d < thisLine.size(); d++) 
        {
            if (d == 0)
                temp.number = thisLine[d];
            else
                temp.content.push_back(thisLine[d]);
        }   

        /* alternatively:
        if (thisLine.size() > 0) {
            temp.number = thisLine[0];
            thisLine.erase(thisLine.begin());
        }
        temp.content = thisLine;
        */

        if (temp.number == 1) {
            batch_1.push_back(temp);
        }

        temp.content.clear();
    }

    read_records.clear();   // reset eofbit/failbit left by the last getline()
    read_records.seekg(0);

    cout << "Starting Batch 2..." << endl;

    // DUPLICATE ABOVE FOR BATCH TWO

    read_records.close();

    // process batch_1 and batch_2 as needed...

    return 0;
}
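
One detail about rewinding for a second pass: once the read loop has hit end of file, the stream is in a failed state, and seekg() does nothing on a failed stream unless clear() is called first. The sketch below illustrates this with a std::istringstream standing in for the file; count_lines and second_pass_count are hypothetical helper names.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads the stream to the end and returns how many lines were read.
int count_lines(std::istream& in)
{
    std::string line;
    int n = 0;
    while (std::getline(in, line))
        ++n;
    return n;
}

// Reads `text` to the end once, rewinds, and returns the line count
// obtained on the second pass.
int second_pass_count(const std::string& text, bool clear_first)
{
    std::istringstream in(text);
    count_lines(in);        // first pass leaves eofbit and failbit set
    if (clear_first)
        in.clear();         // reset the error state before seeking
    in.seekg(0);            // has no effect while failbit is still set
    return count_lines(in);
}
```

With clear_first set to false the second pass reads nothing, so a duplicated batch 2 loop would silently see an empty file.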

You can then simplify your reading loop a little by getting rid of the thisLine vector entirely:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <cstdlib>   // exit, EXIT_FAILURE

using namespace std;

const string records_file("records.txt");

// declaring a struct for each record
struct record
{
    int number;             // number of record
    vector<int> content;    // content of record    
};

int main()
{
    vector<record> batch_1;         // stores integers from 1 - 64
    vector<record> batch_2;         // stores integers from 65 - 128
    record temp;
    string line;

    // read the data file
    ifstream read_records(records_file.c_str());
    if (read_records.fail()) 
    {
        cerr << "Cannot open " << records_file << endl;
        exit(EXIT_FAILURE);
    } 

    cout << "Reading data file: " << records_file << endl;

    cout << "Starting Batch 1..." << endl;

    while (getline(read_records, line))
    {       
        istringstream S(line);  // store the line just read into the string stream
        if (S >> temp.number)
        {
            cout << temp.number << " ";

            temp.content.reserve(32);   // WE KNOW THERE WILL BE 32 ENTRIES

            int c;
            while (S >> c) {
                temp.content.push_back(c);
                cout << c << " ";
            }

            if (temp.number == 1) {
                batch_1.push_back(temp);
            }

            temp.content.clear();
        }
    }

    read_records.clear();   // reset eofbit/failbit left by the last getline()
    read_records.seekg(0);

    cout << "Starting Batch 2..." << endl;

    // DUPLICATE ABOVE FOR BATCH TWO

    read_records.close();

    // process batch_1 and batch_2 as needed...

    return 0;
}
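
Because the inner loop runs until extraction fails, the same logic handles lines with different numbers of entries (33 on one line, 15 on another). Here is a minimal sketch of that per-line parsing pulled out into a function so it can be tried on in-memory strings; parse_record is a hypothetical name, and the struct mirrors the record type used above.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct record
{
    int number;                 // batch number (first integer on the line)
    std::vector<int> content;   // remaining integers on the line
};

// Hypothetical helper: parses one line into `out`.
// Returns false if the line does not start with an integer.
bool parse_record(const std::string& line, record& out)
{
    out.number = 0;
    out.content.clear();

    std::istringstream iss(line);
    if (!(iss >> out.number))
        return false;           // empty or malformed line

    int value;
    while (iss >> value)        // stops at the end of the line, however long it is
        out.content.push_back(value);

    return true;
}
```

A caller could check out.content.size() afterwards if a fixed number of entries per line is actually required.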

And then, if you are so inclined, you can simplify the code even further by using std::copy() with std::istream_iterator and std::back_inserter:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>
#include <cstdlib>   // exit, EXIT_FAILURE

using namespace std;

const string records_file("records.txt");

// declaring a struct for each record
struct record
{
    int number;             // number of record
    vector<int> content;    // content of record    
};

// declaring an input operator to read a single record from a stream
istream& operator>>(istream &in, record &out)
{
    out.number = 0;
    out.content.clear();

    string line;
    if (getline(in, line))
    {
        istringstream iss(line);
        if (iss >> out.number) {
            cout << out.number << " ";

            out.content.reserve(32); // WE KNOW THERE WILL BE 32 ENTRIES
            copy(istream_iterator<int>(iss), istream_iterator<int>(), back_inserter(out.content));

            for (int d = 0; d < out.content.size(); d++) 
                cout << out.content[d] << " ";
        }
    }

    return in;
}

int main()
{
    vector<record> batch_1;         // stores integers from 1 - 64
    vector<record> batch_2;         // stores integers from 65 - 128
    record temp;

    // read the data file
    ifstream read_records(records_file.c_str());
    if (!read_records) 
    {
        cerr << "Cannot open " << records_file << endl;
        exit(EXIT_FAILURE);
    } 

    cout << "Reading data file: " << records_file << endl;

    while (read_records >> temp)
    {       
        switch (temp.number)
        {
            case 1:
                batch_1.push_back(temp);
                break;

            case 2:
                batch_2.push_back(temp);
                break;
        }
    }

    read_records.close();

    // process batch_1 and batch_2 as needed...

    return 0;
}
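
As one possible example of the "process as needed" step, the comments in the listings say batch 1 holds integers from 1 to 64 and batch 2 holds 65 to 128, so a simple sanity check after reading is to confirm that every value falls in the expected range. This is only a sketch of one kind of processing; batch_in_range is a hypothetical helper, not something the original program defines.

```cpp
#include <cstddef>
#include <vector>

struct record
{
    int number;                 // batch number
    std::vector<int> content;   // integers belonging to the record
};

// Hypothetical helper: returns true if every integer in every record of
// `batch` lies in the inclusive range [lo, hi].
bool batch_in_range(const std::vector<record>& batch, int lo, int hi)
{
    for (std::size_t r = 0; r < batch.size(); ++r) {
        const std::vector<int>& values = batch[r].content;
        for (std::size_t i = 0; i < values.size(); ++i) {
            if (values[i] < lo || values[i] > hi)
                return false;   // found a value outside the expected range
        }
    }
    return true;                // also true for an empty batch
}
```

After the read loop, this could be called as batch_in_range(batch_1, 1, 64) and batch_in_range(batch_2, 65, 128).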

【Comments】:

  • Remy, thank you very much for the very thorough reply. I ran a test with your code and it does work, but it will take me a while to digest!