How to read the number of words in a line of a txt file in C

【Question title】: How to read number of words in a line in C txt file
【Posted】: 2021-03-10 20:34:59
【Question description】:

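Hello, I have an array of pixels that I wrote to a text file using fprintf. I am trying to get the number of rows and columns, but I noticed that fscanf does not take line breaks into account, so when I use it I only get the total count of numbers. Is there another way to get the number of rows and columns?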

100 255 244 200
999  11  23  41
234   0  23 111

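【Question comments】:

Take a look at getline, it might help.
For the number of rows: fgets. For the number of columns, maybe strtok.
Can you state the maximum possible number of columns?
Two of your lines end with a blank and the last one does not. Is that flexibility in the input, or can you guarantee that every line ends with one blank and only the last line has none?
This could be a one-liner, given a strict input format. Please show the data structure in which you intend to store the values you read. Filling it might take another line...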
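
The fgets plus strtok suggestion above can be sketched roughly as follows. This is only an illustration, not code from the thread: the function name count_tokens and the delimiter set " \t\r\n" are assumptions, and the line is expected to have been read already (for example with fgets) into a writable buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): count the tokens in one line.
   Tokens are separated by spaces, tabs, or line-ending characters.
   strtok writes NUL bytes into the buffer, so `line` must be modifiable. */
size_t count_tokens(char *line)
{
    const char *delims = " \t\r\n"; /* assumed delimiter set */
    size_t n = 0;

    for (char *tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims))
    {
        n++; /* strtok skips runs of delimiters, so repeated spaces add nothing */
    }
    return n;
}
```

Because strtok modifies its argument and keeps internal state, this sketch should not be called on a string literal or from several threads at once.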

Tags: c file scanf

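【Solution 1】:

The number of rows equals the number of \n characters plus one (if the last byte is not \n). The number of columns equals the number of single spaces plus one (extra padding spaces between columns are not counted). I read the entire file into a character array, then counted the \n characters and the spaces to find the number of rows and columns.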

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <errno.h>

int main()
{
    int fd;
    char arr[1024]; // Increase size according to the file size.
    ssize_t count;
    int isLineReach = 0;
    int row = 0, cols =0;
    fd = open("RowCol.txt", O_RDONLY);
    if (fd >= 0)
    {
        count = read(fd, arr, sizeof arr - 1); // Read at most 1023 bytes, leaving room for the terminator.
        if (count == -1)
        {
            fprintf(stderr, "Error %d\n", errno);
            close(fd);
            exit(EXIT_FAILURE);
        }
        arr[count] = '\0';
    }
    else
    {
        fprintf(stderr, "Error opening a file %d\n", errno);
        exit(EXIT_FAILURE);
    }

    for(int i=0; i<count-1; i++) // count--> Total number of characters including `\n` and `' '` space
    {
        if(arr[i] == '\n') // Checking for number of lines
        {
            row++;
            isLineReach = 1;
        }

        else if(arr[i] == ' ' && isLineReach != 1 && arr[i+1] !=' ') // Checking for number of spaces in a line
        {
            cols++;
        }
        
    }

    printf("The number of rows are %d\n",(row+1));
    printf("The number of columns are %d\n",(cols+1));

    close(fd); // Close the descriptor before returning; the original order made this line unreachable.
    return 0;
}

Contents of RowCol.txt:

100 255 244 200
999  11  23  41
234   0  23 111
123 234 123 0
112 230 43 12
123 234 43 10
133 321 23 12
100 123 67 89
102 34 45 67
104 123 43 54
120 165 23 23

The output is:

The number of rows are 11
The number of columns are 4

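【Comments】:

Why use non-standard functions when the standard library can do this perfectly well?
What if the file is larger than 1024 bytes? Also, this counts spaces to determine the number of columns; what if the first line is 234 0 23 111
arr[i + 1] may read out of bounds with the condition i < count; change the condition to i < count - 1
The number of rows equals the number of \n plus one. Why add one if the last byte is a newline?
@KrishnaKanthYenumula: OK. For the column count, watch out for leading and trailing spaces: your method fails on these, counting space 1 space newline as 3 columns.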
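
The leading and trailing space problem raised in these comments can be avoided by counting the starts of fields instead of counting spaces. The following is a minimal sketch under that assumption. The name count_fields is hypothetical, and "whitespace" here means whatever isspace reports, which is broader than the single space character the answer checks for.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): count whitespace-separated
   fields in a NUL-terminated line. Leading, trailing, and repeated
   whitespace do not create extra fields. */
size_t count_fields(const char *line)
{
    size_t fields = 0;
    int in_field = 0; /* 1 while the scan is inside a run of non-whitespace */

    for (const unsigned char *p = (const unsigned char *)line; *p != '\0'; p++)
    {
        if (isspace(*p))
        {
            in_field = 0; /* any whitespace, including '\n', ends the field */
        }
        else if (!in_field)
        {
            in_field = 1; /* first character of a new field */
            fields++;
        }
    }
    return fields;
}
```

With this approach the input "space 1 space newline" yields one field, because a field is counted only when a non-whitespace character follows whitespace or the start of the line.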

【Solution 2】:

For rows, fgets is enough to read the file line by line:

size_t rows = 0;
char buffer[256] = {0};
FILE* f = fopen("test.txt", "r");
if (!f)
{
    fprintf(stderr, "Could not open file\n");
    return 1;
}
while (fgets(buffer, sizeof buffer, f) != NULL)
{
    if (strchr(buffer, '\n') != NULL)
    {
        // Increment row counter if a newline is present in the string
        rows++;
    }
    else if (feof(f))
    {
        // Increment even if there's no newline but EOF has been reached
        rows++;
    }
}
fclose(f);
printf("rows: %zu\n", rows); // rows is a size_t, so %zu is the matching conversion
return 0;

fgets reads at most sizeof(buffer) - 1 characters, or up to and including the first newline it encounters, whichever comes first.

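This means that when a line is longer than the buffer (256 in this case), a single read will not return the whole line. So before incrementing, we need to check with strchr that a newline is actually present in the string.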
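
To illustrate the chunking behavior described above, here is a rough sketch of assembling one complete line from several fgets calls. It is not part of the original answer: the name read_full_line and the 256-byte chunk size are assumptions, and the caller is expected to free the returned string.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the thread): read one complete line of any
   length from f. Returns a malloc'd string (newline included if present),
   or NULL at end of file or on allocation failure. */
char *read_full_line(FILE *f)
{
    char chunk[256]; /* assumed chunk size, mirrors the answer's buffer */
    char *line = NULL;
    size_t len = 0;

    while (fgets(chunk, sizeof chunk, f) != NULL)
    {
        size_t chunk_len = strlen(chunk);
        char *grown = realloc(line, len + chunk_len + 1);
        if (grown == NULL)
        {
            free(line);
            return NULL;
        }
        line = grown;
        memcpy(line + len, chunk, chunk_len + 1); /* copies the terminating NUL too */
        len += chunk_len;

        if (chunk_len > 0 && chunk[chunk_len - 1] == '\n')
        {
            break; /* the newline marks the end of this line */
        }
    }
    return line;
}
```

A chunk that does not end in a newline means either that the line continues in the next chunk or that the file ended without a final newline; in both cases the loop simply keeps appending until fgets returns NULL.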

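For columns, assuming every line has the same number of columns, you can simply count the (non-consecutive) spaces in the buffer that holds the whole line: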

size_t columns = 0;
char buffer[256] = {0};
FILE* f = fopen("test.txt", "r");
if (!f)
{
    fprintf(stderr, "Could not open file\n");
    return 1;
}
// Read the first line in full, keep trying until a newline is encountered
while (strchr(buffer, '\n') == NULL && fgets(buffer, sizeof buffer, f) != NULL)
{
    // Keep track of whether or not actual column data has been encountered
    bool data_encountered = false;
    for (size_t i = 0; i < strlen(buffer) - 1; i++)
    {
        if (buffer[i] != ' ')
        {
            // NOTE: This assumes any non space character is valid column data
            data_encountered = true;
        }
        else if (data_encountered)
        {
            // Encountered space, if column data had been encountered prior - increment count
            columns++;
            // Reset data_encountered
            data_encountered = false;
        }
    }
}
// Increment columns one last time if line ended with a non space character
size_t bufferlen = strlen(buffer);
// Ignore the trailing newline, if any, so the last data character can be inspected
// (the length checks avoid indexing before the start of the buffer on empty input)
if (bufferlen > 0 && buffer[bufferlen - 1] == '\n')
{
    bufferlen--;
}
// Increment column count if the line ended with column data rather than a space
if (bufferlen > 0 && buffer[bufferlen - 1] != ' ')
{
    columns++;
}
fclose(f);
printf("columns: %zu\n", columns); // columns is a size_t, so %zu is the matching conversion
return 0;

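The loop keeps calling fgets until a newline shows up in the buffer, i.e. until one full line has been read. Inside the loop, for each buffer, the number of (non-consecutive) spaces is added to the counter, which represents the columns.

If you know an upper bound on the number of columns in advance, or even an upper bound on the number of characters per line, you will not need all of these safeguards. But when you cannot guess, this will be reliable.

Now, how do you combine them? I suggest putting them in separate functions, one to count rows and another to count columns. Don't worry about performance: if the compiler sees the two functions being called close to each other, it will take care of that.

But if you insist on doing all of this in the same function, here is a working implementation: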

int columns = 0, rows = 0;
char buffer[256] = { 0 };
FILE* f = fopen("test.txt", "r");
if (!f)
{
    fprintf(stderr, "Could not open file\n");
    return 1;
}
// Extract the first line and count the columns
while (strchr(buffer, '\n') == NULL && fgets(buffer, sizeof buffer, f) != NULL)
{
    // Keep track of whether or not actual column data has been encountered
    bool data_encountered = false;
    for (size_t i = 0; i < strlen(buffer) - 1; i++)
    {
        if (buffer[i] != ' ')
        {
            // NOTE: This assumes any non space character is valid column data
            data_encountered = true;
        }
        else if (data_encountered)
        {
            // Encountered space, if column data had been encountered prior - increment count
            columns++;
            // Reset data_encountered
            data_encountered = false;
        }
    }
}
// Increment columns one last time if line ended with a non space character
size_t bufferlen = strlen(buffer);
// Ignore the trailing newline, if any, so the last data character can be inspected
// (the length checks avoid indexing before the start of the buffer on empty input)
if (bufferlen > 0 && buffer[bufferlen - 1] == '\n')
{
    bufferlen--;
}
// Increment column count if the line ended with column data rather than a space
if (bufferlen > 0 && buffer[bufferlen - 1] != ' ')
{
    columns++;
}
// Increment rows by one, since one line has been read already
rows++;
// Reset all cells in the buffer to 0
memset(buffer, 0, sizeof buffer);
// Count the rest of the lines
while (fgets(buffer, sizeof buffer, f) != NULL)
{
    if (strchr(buffer, '\n'))
    {
        rows++;
    }
    else if (feof(f))
    {
        rows++;
    }
}
fclose(f);
printf("rows: %d\n", rows);
printf("columns: %d\n", columns);

Note: headers to include in the code:

#include <stdio.h>
#include <string.h>
#include <stdbool.h>

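【Comments】:

A file consisting of 2 bytes, a space and 1, would be analyzed as rows: 0 and cols: 2, which seems incorrect on both counts.
@chqrlie Well, I mentioned that edge case and wanted to leave some fun for the OP, but I will definitely edit in a complete solution that handles all cases.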
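
One possible way to cover the edge cases discussed here is a single pass over the file with fgetc, which needs no line buffer at all. This is a sketch under stated assumptions, not the answer author's promised edit: the name count_rows_cols is hypothetical, a final line without a trailing newline counts as a row, and the column count is taken from the first line only.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread): count the rows in f and the
   whitespace-separated columns on its first line, in one pass. */
void count_rows_cols(FILE *f, size_t *rows, size_t *cols)
{
    int c;
    int in_field = 0;     /* inside a run of non-whitespace characters */
    int line_started = 0; /* current line has at least one character */

    *rows = 0;
    *cols = 0;

    while ((c = fgetc(f)) != EOF)
    {
        if (c == '\n')
        {
            (*rows)++;
            in_field = 0;
            line_started = 0;
            continue;
        }
        line_started = 1;
        if (isspace(c))
        {
            in_field = 0;
        }
        else if (!in_field)
        {
            in_field = 1;
            if (*rows == 0)
            {
                (*cols)++; /* only fields on the first line are counted */
            }
        }
    }
    if (line_started)
    {
        (*rows)++; /* the last line had no trailing newline */
    }
}
```

Under these assumptions the two-byte file made of a space and 1 is reported as 1 row and 1 column, and an empty file as 0 rows and 0 columns. If the first line is blank, the column count stays 0, so a caller that wants to skip blank lines would need to adapt the sketch.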