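Answers to: Problem in program to count number of words, sentences and letters in C

【Question title】: Problem in program to count number of words, sentences and letters in C
【Posted】: 2021-04-01 01:22:08
【Question description】:

I need to write a program that counts the number of words, sentences and letters in text entered by the user. The program works fine until the input I give spans multiple lines. If the input is longer than what fits on one line of the terminal window, the program starts ignoring all periods, question marks and exclamation marks. I don't know why, and I need some help. This does not happen if the text fits on a single line of the terminal window. I also printed every character the program reads, but that output skipped all periods, question marks and exclamation marks as well; none of those characters were printed. To clarify, the number of sentences is just the number of periods, question marks and exclamation marks, and the number of words is just the number of spaces in the text plus 1. Here is my code: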

#include <stdio.h>
#include <ctype.h> //for the isalpha() function
#include <cs50.h>  //for the get_string() function

int main(void)
{
    int sentences = 0, letters = 0;
    int words = 1;
    char character;
    string text = get_string("Enter Text: \n");

    char x = 0;

    while (text[x] != '\0')
    {
        character = text[x];
        switch (character)
        {
        case ' ':
            words++;
            break;
        case '.':
            sentences++;
            break;
        case '?':
            sentences++;
            break;
        case '!':
            sentences++;
            break;
        default:
            if (isalpha(character))
            {
                letters++;
            }
        }
        x++;
    }

    printf("\n");
    printf("WORDS: %d, LETTERS: %d, SENTENCES: %d\n", words, letters, sentences);
}

I am fairly new to C, but I have about a year of experience with Python. Thank you for your time.

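【Comments on the question】:

Please show the exact test input, expected result and actual result.
The code looks fine, so the problem must be somewhere else.
Please change char x = 0; to int x = 0; so that it does not overflow on long lines.
Note that you can also use isspace() and ispunct().
Just because a terminal line overflows onto / continues on the next line when it is full does not mean that you entered multiple "lines".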
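
The comment about char x can be illustrated with a small sketch. A char index cannot count past CHAR_MAX (commonly 127), so on longer input the index stops pointing at the characters the loop is supposed to visit. The version below keeps the question's counting rules but indexes with size_t. The names Counts and count_text are made up for this illustration, and a plain const char * is assumed in place of the CS50 string type.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical container for the three totals; not part of the original post. */
typedef struct
{
    int words;
    int letters;
    int sentences;
} Counts;

/* Applies the question's own rules: words = spaces + 1,
   sentences = number of '.', '?' and '!' characters. */
Counts count_text(const char *text)
{
    Counts c = { 1, 0, 0 };

    /* size_t (or int, as the comment suggests) can index well past
       127 characters; a char index cannot go beyond CHAR_MAX. */
    for (size_t i = 0; text[i] != '\0'; i++)
    {
        switch (text[i])
        {
        case ' ':
            c.words++;
            break;
        case '.':
        case '?':
        case '!':
            c.sentences++;
            break;
        default:
            /* Cast to unsigned char: passing a negative value
               to isalpha() is undefined behavior. */
            if (isalpha((unsigned char) text[i]))
                c.letters++;
        }
    }
    return c;
}
```

In the question's program this would amount to calling count_text(text) after get_string() returns and printing the three fields.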

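Tags: c cs50

【Solution 1】:

I am going to make a few suggestions.

First, don't use get_string¹ (or scanf, or fgets). For a filter program like this you don't actually need to store the input in order to process it; use getchar (or fgetc) to read one character at a time and loop on that: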

int c;  // getchar returns int, not char
...
puts( "Enter Text:" );
while ( ( c = getchar() ) != EOF )
{
  // test c instead of text[x]
}

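This approach handles input of any length (for example, if you redirect a file as input), and it avoids the potential overflow problem that Weather Vane identified in the comments. The drawback is that for interactive input you have to signal EOF manually from the console (Ctrl-Z or Ctrl-D, depending on your platform).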
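
As a rough sketch of how that loop might look once filled in with the question's counting rules: the helper below reads with fgetc from any FILE *, so passing stdin gives the same behavior as the getchar loop. The name count_stream and its out-parameters are assumptions made for this illustration, not something from the original answer.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads `in` one character at a time until EOF
   and applies the question's rules (words = spaces + 1). */
void count_stream(FILE *in, int *words, int *letters, int *sentences)
{
    int c;  /* int, not char, so EOF can be told apart from real characters */

    *words = 1;
    *letters = 0;
    *sentences = 0;

    while ((c = fgetc(in)) != EOF)
    {
        switch (c)
        {
        case ' ':
            (*words)++;
            break;
        case '.':
        case '?':
        case '!':
            (*sentences)++;
            break;
        default:
            /* fgetc returns the character as an unsigned char value,
               so it is safe to pass straight to isalpha(). */
            if (isalpha(c))
                (*letters)++;
        }
    }
}
```

Nothing is stored, so the length of the input no longer matters; calling count_stream(stdin, &words, &letters, &sentences) from main would work for typed input and for a redirected file alike.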

You can collapse some of the tests in the switch, for example

case '.' :     // Each of these cases "falls through"
case '!' :     // to the following case.
case '?' :
  words++;     // the end of a sentence is also the end of a word
  sentences++;
  break;

You will need to add cases to handle newlines and tabs:

case ' ' :
case '\n' :
case '\t' :
  words++;
  break;

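Except that you don't want to bump the words counter for repeated whitespace characters, or if the previous non-whitespace character was a punctuation character. So you need an extra variable to keep track of the class of the previously read character: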

enum {NONE, TEXT, PUNCT, WHITE} class = NONE;
...
while ( ( c = getchar() ) != EOF )
{
  switch( c )
  {
    case ' ' :
    case '\n' :
    case '\t' :
      if ( class == TEXT )
        words++;
      class = WHITE;
      break;
    
    case '.' :
    case '!' :
    case '?' :
      if ( class == TEXT )  // Don't bump the word counter
        words++;            // if the previous character
                            // was whitespace or . ! ?
      if ( class != PUNCT ) // Don't bump the sentence counter
        sentences++;        // for repeating punctuation
      
      class = PUNCT;
      break;
      ...
  }
}

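There will still be odd corner cases where this does not give a completely accurate count, but it should be good enough for most inputs.

You should be able to figure out the rest from there.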
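
For readers who want to check their own attempt, here is one way the remaining pieces might be filled in. It is only a sketch under two assumptions the answer does not state: every character that is neither whitespace nor . ! ? counts as word text, and a word still in progress at EOF is counted. The names Totals, count_tracked and prev are made up for this illustration (the answer calls the state variable class).

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical result type; the answer does not define one. */
typedef struct
{
    int words;
    int letters;
    int sentences;
} Totals;

Totals count_tracked(FILE *in)
{
    /* Class of the previously read character, as in the answer's enum. */
    enum { NONE, TEXT, PUNCT, WHITE } prev = NONE;
    Totals t = { 0, 0, 0 };
    int c;

    while ((c = fgetc(in)) != EOF)
    {
        switch (c)
        {
        case ' ':
        case '\n':
        case '\t':
            if (prev == TEXT)       /* whitespace ends a word only after text */
                t.words++;
            prev = WHITE;
            break;

        case '.':
        case '!':
        case '?':
            if (prev == TEXT)       /* punctuation right after text ends a word */
                t.words++;
            if (prev != PUNCT)      /* "..." or "?!" counts as one sentence end */
                t.sentences++;
            prev = PUNCT;
            break;

        default:
            /* Assumption: letters, digits, commas and so on are all word text. */
            if (isalpha(c))
                t.letters++;
            prev = TEXT;
            break;
        }
    }

    /* Assumption: input that ends mid-word still counts that last word. */
    if (prev == TEXT)
        t.words++;

    return t;
}
```

Note that words starts at 0 here rather than 1, because every word is now counted when it ends instead of being inferred from the number of spaces.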

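¹ CS50's string-handling and I/O routines (such as get_string) are pretty slick, but they grossly misrepresent how C actually works. The string typedef is especially egregious, because what it aliases is not a string. Note that these tools will not be available outside the CS50 course, so don't rely on them too heavily.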

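【Discussion】:

Yes, I would have liked to use get_char(), but the CS50 checking system requires me to do the assignment exactly the way they specify. Thanks for your help!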