How to read in the entire word, and not just the first character? (Answers)

【Question title】: How to read in the entire word, and not just the first character?
【Posted】: 2019-11-05 15:01:15
【Question】:

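I am writing a function in C that works on a list of words from a file, which I redirect to standard input. However, when I try to read the words into an array, my code only outputs the first character of each word. I believe this is caused by a conversion problem between char and char *.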

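I am challenging myself not to use any of the functions from string.h. I have tried iterating through the input and am thinking of writing my own strcpy function, but I am confused because my input comes from a file redirected to standard input. The variable numwords is entered by the user in main (not shown).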

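I am trying to debug this with dumpwptrs, which shows me what the output is. I am not sure what in the code causes the wrong output: is it how I read the words into the chunk array, or am I pointing to it incorrectly with wptrs?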

//A huge chunk of memory that stores the null-terminated words contiguously
char chunk[MEMSIZE]; 

//Points to words that reside inside of chunk
char *wptrs[MAX_WORDS]; 

/** Total number of words in the dictionary */
int numwords;
.
.
.
void readwords()
{
  //Read in words and store them in chunk array
  for (int i = 0; i < numwords; i++) {
    //When you use scanf with '%s', it will read until it hits
    //a whitespace
    scanf("%s", &chunk[i]);
    //Each entry in wptrs array should point to the next word 
    //stored in chunk
    wptrs[i] = &chunk[i]; //Assign address of entry
  }
}

【Comments on the question】:

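After scanning a word, you want to increase i by the number of characters in that word, plus 1 for the word's 0-terminator. As it stands, you increase i by 1 only, and the output you get reflects this nicely.
@fassn: No, wptrs is fine.
@Inian: No, wptrs is fine. As I see it, each pointer should point to where the data was scanned to, which is chunk.
Where do all the scanned words go? Think about that. If necessary, draw it out on paper. Draw the chunk array as a long row of boxes, each box holding one character...
.... As it stands, you increase i by 1 only, and the output you get reflects this nicely.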

Tags: c arrays string io char

【Solution 1】:

Do not re-use the part of char chunk[MEMSIZE]; that already holds earlier words.

Instead, use the next unused memory.

char chunk[MEMSIZE]; 
char *pool = chunk; // location of unassigned memory pool

    // scanf("%s", &chunk[i]);
    // wptrs[i] = &chunk[i];
    scanf("%s", pool);
    wptrs[i] = pool;
    pool += strlen(pool) + 1;  // Beginning of next unassigned memory

Robust code would check the return value of scanf() and ensure that neither i nor the space used in chunk exceeds its limit.
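As a rough illustration of those checks, here is a minimal sketch. It is not code from the answer: the name readwords_checked, the FILE * parameter, the temporary word buffer, and the WORD_MAX limit are all assumptions made for this example, and the array sizes are kept small on purpose. The length is counted and the copy is done by hand, so nothing from string.h is needed.

```c
#include <assert.h>
#include <stdio.h>

#define MEMSIZE   64   /* assumed pool size, small for illustration */
#define MAX_WORDS 8    /* assumed limit on the number of words */
#define WORD_MAX  15   /* assumed longest accepted word, excluding '\0' */

static char chunk[MEMSIZE];
static char *wptrs[MAX_WORDS];

/* Reads up to `want` whitespace-separated words from `in` into chunk and
 * records where each one starts in wptrs. Returns the number stored. */
int readwords_checked(FILE *in, int want)
{
    char word[WORD_MAX + 1];
    char *pool = chunk;                 /* next unused byte of chunk */
    int count = 0;

    assert(in != NULL);
    while (count < want && count < MAX_WORDS) {
        /* The width 15 must match WORD_MAX; it keeps fscanf from
         * writing past the end of word. */
        if (fscanf(in, "%15s", word) != 1)
            break;                      /* end of input or read error */

        size_t len = 0;
        while (word[len] != '\0')       /* hand-written strlen */
            len++;

        if (len + 1 > (size_t)(chunk + MEMSIZE - pool))
            break;                      /* not enough room left in chunk */

        wptrs[count++] = pool;
        for (size_t k = 0; k <= len; k++)   /* copy including the '\0' */
            *pool++ = word[k];
    }
    return count;
}
```

Calling readwords_checked(stdin, numwords) would read from the redirected file. One limitation of this sketch: a word longer than WORD_MAX characters is split into several entries rather than rejected.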

I would go for an fgets() solution, as long as the words are entered one per line.

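#include <stddef.h>  // ptrdiff_t
#include <stdio.h>   // fgets
#include <string.h>  // strcspn, strlen

char chunk[MEMSIZE];
char *pool = chunk;  // next unassigned byte of chunk

// Read one word per line; return word count
int readwords2(void) {
  int word_count;
  // limit words to MAX_WORDS
  for (word_count = 0; word_count < MAX_WORDS; word_count++) {
    // bytes still free in chunk (assumes MEMSIZE fits in an int)
    ptrdiff_t remaining = &chunk[MEMSIZE] - pool;
    if (remaining < 2) {
      break; // out of useful pool memory
    }
    if (fgets(pool, (int) remaining, stdin) == NULL) {
      break; // end-of-file/error
    }
    pool[strcspn(pool, "\n")] = '\0'; // lop off potential \n
    wptrs[word_count] = pool;
    pool += strlen(pool) + 1; // step past the word and its '\0'
  }
  return word_count;
}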

【Comments】:

@alk This was very helpful, thank you. dumpwptrs is printing the correct output!

【Solution 2】:

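"I am challenging myself not to use any of the functions from string.h, but ..."

The best way to challenge yourself not to use any of the functions from string.h is to write them yourself and then use them.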
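For instance, hand-written replacements for strlen and strcpy could look like the sketch below. The names my_strlen and my_strcpy are made up for this example, and the assert calls are only a debugging aid, not something the standard functions do.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the characters before the terminating '\0'. */
size_t my_strlen(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Copies src, including its '\0', into dst and returns dst.
 * The caller must make sure dst has room for my_strlen(src) + 1 bytes. */
char *my_strcpy(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;                               /* all the work is in the condition */
    return dst;
}
```

With these in place, a line such as pool += my_strlen(pool) + 1; advances past a stored word without including string.h.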

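Your program reads each word into the i-th position of the buffer chunk. You therefore keep only the first letter of each word (as long as i stays within the bounds of the buffer), because every read overwrites the second and later characters of the previous word with the characters just read. You then make every pointer in wptrs point to those positions, which makes it impossible to tell where one string ends and the next begins: you have overwritten all the null terminators except the last one. The first string therefore consists of the first letters of all your words, except for the last word, which is complete. The second string is the same, but starting from the second character, then the third, and so on.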
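To make the overwriting visible, here is a small sketch that repeats the question's loop on three fixed words. The function name store_like_question, the array sizes, and the sample words are assumptions for illustration, and sscanf on string literals stands in for scanf on redirected input.

```c
#include <assert.h>
#include <stdio.h>

#define NWORDS 3

static char chunk[32];
static char *wptrs[NWORDS];

/* Repeats the question's loop: word i is written starting at chunk[i]. */
void store_like_question(void)
{
    const char *input[NWORDS] = { "cat", "dog", "emu" };

    for (int i = 0; i < NWORDS; i++) {
        int got = sscanf(input[i], "%s", &chunk[i]);
        assert(got == 1);
        (void)got;                 /* silences a warning if NDEBUG is set */
        wptrs[i] = &chunk[i];      /* only one byte past the previous word */
    }
}
```

After the call, chunk holds c d e m u \0: "cat" was written at index 0, "dog" at index 1 over "at", and "emu" at index 2 over "og". Printing the pointers therefore gives "cdemu", "demu", and "emu", which is the pattern described above.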

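Build your own version of strdup(3) and use chunk only as temporary storage for each string. Then allocate a copy of the string dynamically with your version of strdup(3) and make the pointer point to that copy, and so on.

Finally, when you are done, free all the allocated strings, and voilà!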
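A possible shape for that approach is sketched below. The names my_strdup and free_words are invented for this example, and the length and copy loops are written out by hand to stay clear of string.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for strdup(3): returns a malloc'd copy of s,
 * or NULL if the allocation fails. The caller frees the result. */
char *my_strdup(const char *s)
{
    assert(s != NULL);
    size_t len = 0;
    while (s[len] != '\0')
        len++;

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    for (size_t k = 0; k <= len; k++)   /* copy including the '\0' */
        copy[k] = s[k];
    return copy;
}

/* Frees the first n strings in words and clears the pointers. */
void free_words(char **words, int n)
{
    for (int i = 0; i < n; i++) {
        free(words[i]);
        words[i] = NULL;
    }
}
```

In the reading loop, each word would be scanned into the start of chunk and then stored with wptrs[i] = my_strdup(chunk);, so reusing chunk for the next word no longer damages earlier ones. A call to free_words(wptrs, numwords) at the end releases the copies.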

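Also, and this is important: read How to create a Minimal, Complete, and Verifiable example. The code people post often leaves out the very part that contains the error, because they trimmed it before posting. (You normally do not know where the error is. If you did, you would have corrected it and there would be no question here, right?)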

【Comments】: