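【Question title】: Reading strings in C
【Posted】: 2011-11-01 04:52:56
【Question】:

Suppose I'm using gets() in C to read a string from the user, but I don't know how big a buffer I need, and the input could be very large. Is there a way to determine how large the string the user entered is, then allocate memory and put it into a variable? Or at least a way to accept input without knowing how large it is, given that it may not fit into the buffer I've already allocated?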

【Comments】:

    Tags: c input size buffer gets


    【Answer 1】:

    Not with gets(). Use fgets() instead.

    You cannot safely get user input with gets().

    You need to use fgets() (or fgetc()) in a loop.
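
As a rough illustration of the fgetc() variant, the sketch below grows a heap buffer with realloc() whenever it fills up. The name `read_line`, the starting capacity of 16, and the doubling policy are assumptions made for this example, not part of the answer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line of any length from stream.
 * Returns a malloc()ed string without the trailing newline, or NULL
 * on out-of-memory or when the input ends before any character is read. */
char *read_line(FILE *stream)
{
    size_t cap = 16, len = 0;   /* assumed starting capacity */
    char *buf, *tmp;
    int ch;

    assert(stream != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while ((ch = fgetc(stream)) != EOF && ch != '\n') {
        if (len + 1 == cap) {           /* keep one byte for the terminator */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }

    if (ch == EOF && len == 0) {        /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

A caller would typically write `char *s = read_line(stdin);`, check the result for NULL, and call `free(s)` when done with the string.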

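    【Comments】:

    • Also note that gets() is going to be deprecated in the next C standard, IIRC.
    • Even with fgets(), the buffer may still not be big enough to hold the input. And if I'm reading from stdin, fgetc() in a loop won't work, unless I'm doing it completely wrong.
    • You need to realloc inside the loop.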
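    【Answer 2】:

    Don't use gets(). Use fgets(), and estimate how much buffer space you will need.

    The advantage of fgets is that if the input is longer than the buffer, it writes only up to the maximum number of characters and won't clobber memory belonging to another part of the program.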

    char buff[100];
    fgets(buff,100,stdin);
    

    This reads at most 99 characters, or stops earlier when it hits a '\n'. If there is room, the newline is stored in the array as well.
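
Because fgets() stores the newline only when it fits, the presence of '\n' can be used to tell whether the line was cut off. The following is a minimal sketch of that check; the helper name `read_bounded_line` and its return codes are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (size bytes) with fgets().
 * Returns 1 if the whole line was stored (newline stripped), 0 if the
 * line was too long and was cut off (the rest is discarded), and -1 on
 * end of input or error. */
int read_bounded_line(char *buf, size_t size, FILE *stream)
{
    char *newline;
    int ch;

    assert(buf != NULL && stream != NULL);
    if (size == 0 || fgets(buf, (int)size, stream) == NULL)
        return -1;

    newline = strchr(buf, '\n');
    if (newline != NULL) {
        *newline = '\0';
        return 1;
    }

    /* No newline stored: either the input ended or the line did not fit. */
    ch = fgetc(stream);
    if (ch == EOF || ch == '\n')
        return 1;                       /* the line fit exactly */
    while ((ch = fgetc(stream)) != EOF && ch != '\n')
        ;                               /* throw away the rest of the line */
    return 0;
}
```

With `char buff[100];`, a call such as `read_bounded_line(buff, sizeof buff, stdin)` returning 0 signals that the user typed more than 99 characters and the tail was dropped.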

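    【Comments】:

      【Answer 3】:

      I would use a suitably large intermediate buffer and read the string into it with fgets or another function that limits the string length to the buffer size. Once the string has been read, compute its length, allocate a buffer of exactly that size, and copy the string into the newly allocated buffer. The old large buffer can be reused for further input of this kind.

      You can do it like this: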

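      fgets (buffer, BUFSIZ, stdin);

      or

      scanf ("%128[^\n]%*c", buffer);

      Here the %128 limits the input to 128 characters (so buffer must hold at least 129 bytes, including the terminating null), and the scanset keeps all whitespace in the string.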

      Then compute the length and allocate the new buffer:

      len = strlen (buffer);
      string = malloc (sizeof (char) * len + 1);
      strcpy (string, buffer);
      .
      .
      .
      free (string);
      

      EDIT

      Here is one approach I worked out:

      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      
      #define NBUF  10   /* number of temporary buffers, or make this dynamically allocated */
      #define CHUNK 16   /* size of each buffer: 15 characters + 1 byte for null */
      
      int main (void)
      {
        char *buffer[NBUF]; /* temporary buffers */
        char *main_str;     /* The main string to work with after input */
        int k, i = 0, n;
        size_t total = 0;   /* total number of characters read so far */
      
        /* stop after NBUF chunks so the buffer array cannot overflow;
         * anything beyond NBUF * 15 characters is left unread
         */
        while (i < NBUF)
        {
          buffer[i] = malloc (CHUNK);
          if (buffer[i] == NULL)
            break;
          buffer[i][0] = '\0';                     /* stays empty if scanf matches nothing */
          n = 0;                                   /* %n is not stored when the match fails */
          scanf ("%15[^\n]%n", buffer[i], &n);     /* input length 15 string + 1 byte for null */
          total += strlen (buffer[i]);
          i++;
          if (n < CHUNK - 1)                       /* Buffer is not filled and end of string reached */
            break;
        }
      
        /* allocate buffer of exact size of the string, plus the null */
        main_str = malloc (total + 1);
        if (main_str == NULL)
          return EXIT_FAILURE;
      
        /* copy the segmented string into the main string to be worked with 
         * and free the buffers
         */
        strcpy (main_str, "");
        for (k=0; k<i; k++)
        {
          strcat (main_str, buffer[k]);
          free (buffer[k]);
        }
      
        /* work with main string */
        printf ("\n%s", main_str);
      
        /* free main string */
        free (main_str);
      
        return 0;
      }
      

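      You will need to fix up the code so it stops crashing in certain cases, but this should answer your question.

      【Comments】:

      • But this would still cut the string off. Even if I make a very large buffer for the original string, the input could still be larger.
      • The input can be unbounded, but you need to set some upper limit. Or you can write your own input routine that keeps track and allocates blocks as needed, or uses multiple buffers.
      • Awesome, exactly what I was looking for. Thanks!
      【Answer 4】:

      Dynamically allocate the buffer and use fgets. If you fill the buffer up, then it wasn't big enough, so grow it with realloc and call fgets again (but write at the end of the string so you keep what you've already grabbed). Keep doing that until your buffer is larger than the input: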

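      /* needs <stdio.h>, <stdlib.h> and <string.h> */
      size_t bufsize = 64, len = 0;          /* len = characters stored so far */
      char *tmp, *buffer = malloc(bufsize);
      
      if (buffer != NULL)
          buffer[0] = 0;
      /* read into the unused tail of the buffer until a newline or EOF */
      while (buffer != NULL && fgets(buffer + len, (int)(bufsize - len), stdin) != NULL)
      {
          len += strlen(buffer + len);
          if (len > 0 && buffer[len - 1] == '\n')
              break;                         /* got the whole line */
          if (len + 1 == bufsize)            /* buffer is full: double it */
          {
              tmp = realloc(buffer, bufsize * 2);
              if (tmp == NULL)
                  break;                     /* out of memory: keep what we have */
              buffer = tmp;
              bufsize *= 2;
          }
      }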
      

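      【Comments】:

        【Answer 5】:

        The problem you describe with gets() - having no way of knowing how big the target buffer needs to be to store the input - is exactly why that library call was deprecated in the 1999 standard, and is expected to be gone completely from the next revision; expect most compilers to follow suit relatively quickly. The mayhem caused by that one library function is scarier than the prospect of breaking 40 years' worth of legacy code.

        One solution is to read the input piecemeal using fgets() and a fixed-length buffer, then append each piece to a dynamically resizable target buffer. For example: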

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        
        #define SIZE 512
        
        /**
         * Returns a malloc()ed line with the newline stripped, or NULL if
         * memory runs out.  *length is set to the size of the returned
         * buffer, including the terminating 0.
         */
        char *getNextLine(FILE *stream, size_t *length)
        {
          char *output;
          char input[SIZE+1];
          int foundNewline = 0;
        
          *length = 0;
        
          /**
           * Initialize our output buffer
           */
          if ((output = malloc(1)) != NULL)
          {
            *output = 0;
            *length = 1;
          }
          else
          {
            return NULL;
          }
        
          /**
           * Read up to SIZE chars at a time from the input stream until we
           * hit EOF or see a newline character.  foundNewline is tested
           * first so that we never read past the end of the line.
           */
          while (!foundNewline && fgets(input, sizeof input, stream) != NULL)
          {
            char *newline = strchr(input, '\n');
            char *tmp = NULL;
        
            /**
             * Strip the newline if present
             */
            foundNewline = (newline != NULL);
            if (foundNewline)
            {
              *newline = 0;
            }
        
            /**
             * Extend the output buffer 
             */
            tmp = realloc(output, *length + strlen(input));
            if (!tmp)
            {
              /* out of memory: give up rather than return part of a line */
              free(output);
              *length = 0;
              return NULL;
            }
            output = tmp;
            strcat(output, input);
            *length += strlen(input);
          }
          return output;
        }
        

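        The caller is responsible for freeing the buffer once it is done with the input.

        【Comments】:

          【Answer 6】:

          If you're on a Unix platform, you should probably use getline(), which is made for exactly this kind of thing.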
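
For reference, a minimal sketch of calling the POSIX getline() might look like the following. The wrapper name `read_line_posix` is made up for this example, and the code assumes a system whose <stdio.h> declares getline() (POSIX.1-2008).

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX.1-2008 declarations */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical wrapper: returns the next line from stream as a malloc()ed
 * string with the trailing newline removed, or NULL on EOF or error. */
char *read_line_posix(FILE *stream)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;      /* capacity of line, maintained by getline() */
    ssize_t n;

    assert(stream != NULL);
    n = getline(&line, &cap, stream);
    if (n == -1) {
        free(line);      /* the buffer may have been allocated even on failure */
        return NULL;
    }
    if (n > 0 && line[n - 1] == '\n')
        line[n - 1] = '\0';
    return line;
}
```

When reading many lines, it is usually cheaper to call getline() directly in a loop with the same `line` and `cap` variables, so the buffer is reused, and to free it once after the loop.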

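          If your platform doesn't have getline(), here is some public domain code that lets you use it. This post is somewhat long, but that's because the code tries to actually handle real-life errors and situations (and even not-so-real-life ones, such as running out of memory).

          It's probably not the most performant version, nor the most elegant. It picks characters off one at a time with fgetc() and puts the null terminator at the end of the data every time it reads a character. But I believe it to be correct, even in the face of errors and data sets large and small. It performs well enough for my purposes.

          I'm not especially fond of the getline() interface, but I use it because it's a standard of sorts.

          The following compiles with GCC (MinGW) and with MSVC (as C++ - it uses declarations mixed with statements, which MSVC still doesn't support when compiling as C. Maybe I'll fix that someday).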

          #define _CRT_SECURE_NO_WARNINGS 1
          
          #include <assert.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>
          #include <errno.h>
          #include <limits.h>
          #include <sys/types.h>
          
          
          #if !__GNUC__
          #if _WIN64
          typedef long long ssize_t;
          #else
          typedef long ssize_t;
          #endif
          #endif
          
          
          #if !defined(SSIZE_MAX)
          #define SSIZE_MAX ((ssize_t)(SIZE_MAX/2))
          #endif
          
          #if !defined(EOVERFLOW)
          #define EOVERFLOW (ERANGE)      /* is there something better to use? */
          #endif
          
          
          
          ssize_t nx_getdelim(char **lineptr, size_t *n, int delim, FILE *stream);
          ssize_t nx_getline(char **lineptr, size_t *n, FILE *stream);
          
          
          
          
          /*
              nx_getdelim_get_realloc_size()
          
              Helper function for getdelim() to figure out an appropriate new
              allocation size that's not too small or too big.
          
              These numbers seem to work pretty well for most text files.
          
              returns the input value if it decides that new allocation block
              would be too big (the caller should handle this as 
              an error).
          */
          static
          size_t nx_getdelim_get_realloc_size( size_t current_size)
          {
              enum {
                  k_min_realloc_inc = 32,
                  k_max_realloc_inc = 1024,
              };
          
              if (SSIZE_MAX < current_size) return current_size;
          
              if (current_size <= k_min_realloc_inc) return current_size + k_min_realloc_inc;
          
              if (current_size >= k_max_realloc_inc) return current_size + k_max_realloc_inc;
          
              return current_size * 2;
          }
          
          
          
          /*
              nx_getdelim_append() 
          
              a helper function for getdelim() that adds a new character to 
              the outbuffer, reallocating as necessary to ensure the character
              and a following null terminator can fit
          
          */
          static
          int nx_getdelim_append( char** lineptr, size_t* bufsize, size_t count, char ch)
          {
              char* tmp = NULL;
              size_t tmp_size = 0;
          
              // assert the contracts for this function's inputs
              assert( lineptr != NULL);
              assert( bufsize != NULL);
          
              if (count >= (((size_t) SSIZE_MAX) + 1)) {
                  // writing more than SSIZE_MAX to the buffer isn't supported
                  return -1;
              }
          
              tmp = *lineptr;
              tmp_size = tmp ? *bufsize : 0;
          
              // need room for the character plus the null terminator
              if ((count + 2) > tmp_size) {
                  tmp_size = nx_getdelim_get_realloc_size( tmp_size);
          
                  tmp = (char*) realloc( tmp, tmp_size);
          
                  if (!tmp) {
                      return -1;
                  }
              }
          
              *lineptr = tmp;
              *bufsize = tmp_size;
          
              // remember, the reallocation size calculation might not have 
              // changed the block size, so we have to check again
              if (tmp && ((count+2) <= tmp_size)) {
                  tmp[count++] = ch;
                  tmp[count] = 0;
                  return 1;
              }
          
              return -1;
          }
          
          
          /*
              nx_getdelim()
          
              A getdelim() function modeled on the Linux/POSIX/GNU 
              function of the same name.
          
              Read data into a dynamically resizable buffer until 
              EOF or until a delimiter character is found.  The returned
              data will be null terminated (unless there's an error 
              that prevents it).
          
          
          
              params:
          
                  lineptr -   a pointer to a char* allocated by malloc() 
                              (actually any pointer that can legitimately be
                              passed to free()).  *lineptr will be updated 
                              by getdelim() if the memory block needs to be 
                              reallocated to accommodate the input data.
          
                              *lineptr can be NULL (though lineptr itself cannot),
                              in which case the function will allocate any necessary 
                              buffer.
          
                  n -         a pointer to a size_t object that contains the size of 
                              the buffer pointed to by *lineptr (if non-NULL).
          
                              The size of whatever buff the resulting data is 
                              returned in will be passed back in *n
          
                  delim -     the delimiter character.  The function will stop
                              reading once this character is read from the stream.
          
                              It will be included in the returned data, and a
                              null terminator character will follow it.
          
                  stream -    A FILE* stream object to read data from.
          
              Returns:
          
                  The number of characters placed in the returned buffer, including
                  the delimiter character, but not including the terminating null.
          
                  If no characters are read and EOF is set (or attempting to read 
                  from the stream on the first attempt caused the eof indication 
                  to be set), a null terminator will be written to the buffer and
                  0 will be returned.
          
                  If an error occurs while reading the stream, a 0 will be returned.
                  A null terminator will not necessarily be at the end of the data 
                  written.
          
                  On the following error conditions, the negative value of the error 
                  code will be returned:
          
                      ENOMEM:     out of memory
                      EOVERFLOW:  SSIZE_MAX characters written to the buffer before 
                                  reaching the delimiter
                                  (on Windows, EOVERFLOW is mapped to ERANGE)
          
                   The buffer will not necessarily be null terminated in these cases.
          
          
              Notes:
          
                  The returned data might include embedded nulls (if they exist
                  in the data stream) - in that case, the return value of the
                  function is the only way to reliably determine how much data
                  was placed in the buffer.
          
                  If the function returns 0 use feof() and/or ferror() to determine
                  which case caused the return.
          
                  If EOF is returned after having written one or more characters
                  to the buffer, a normal count will be returned (but there will 
                  be no delimiter character in the buffer).  
          
                  If 0 is returned and ferror() returns a non-zero value,
                  the data buffer may not be null terminated.
          
                  In other cases where a negative value is returned, the data
                  buffer is not necessarily null terminated and there 
                  is no reliable means to determining what data in the buffer is
                  valid.
          
                  The pointer returned in *lineptr and the buffer size
                  returned in *n will be valid on error returns unless
                  NULL pointers are passed in for one or more of these
                  parameters (in which case the return value will be -EINVAL).
          
          */
          ssize_t nx_getdelim(char **lineptr, size_t *n, int delim, FILE *stream)
          {
          
              if (!lineptr || !n) {
                  return -EINVAL;
              }
          
              ssize_t result = 0;    
              char* line = *lineptr;
              size_t size = *n;
              size_t count = 0;
              int err = 0;
          
              int ch;
          
              for (;;) {
                  ch = fgetc( stream);
          
                  if (ch == EOF) {
                      break;
                  }
          
                  result = nx_getdelim_append( &line, &size, count, ch);
          
                  // check for error adding to the buffer (ie., out of memory)
                  if (result < 0) {
                      err = -ENOMEM;
                      break;
                  }
          
                  ++count;
          
                  // check if we're done because we've found the delimiter
                  if ((unsigned char)ch == (unsigned char)delim) {
                      break;
                  }
          
                  // check if we're passing the maximum supported buffer size
                  if (count > SSIZE_MAX) {
                      err = -EOVERFLOW;
                      break;
                  }
              }
          
              // update the caller's data
              *lineptr = line;
              *n = size;
          
              // check for various error returns
              if (err != 0) {
                  return err;
              }
          
              if (ferror(stream)) {
                  return 0;
              }
          
              if (feof(stream) && (count == 0)) {
                  if (nx_getdelim_append( &line, &size, count, 0) < 0) {
                      return -ENOMEM;
                  }
              }
          
              return count;
          }
          
          
          
          
          ssize_t nx_getline(char **lineptr, size_t *n, FILE *stream)
          {
              return nx_getdelim( lineptr, n, '\n', stream);
          }
          
          
          
          /*
              versions of getline() and getdelim() that attempt to follow
              POSIX semantics (ie. they set errno on error returns and
              return -1 when the stream error indicator or end-of-file
              indicator is set (ie., ferror() or feof() would return
              non-zero).
          */
          ssize_t getdelim(char **lineptr, size_t *n, int delim, FILE *stream)
          {
              ssize_t retval = nx_getdelim( lineptr, n, delim, stream);
          
              if (retval < 0) {
                  errno = -retval;
                  retval = -1;
              }
          
              if (retval == 0) {
                  retval = -1;
              }
          
              return retval;
          }
          
          ssize_t getline(char **lineptr, size_t *n, FILE *stream)
          {
              return getdelim( lineptr, n, '\n', stream);
          }
          

          【Comments】:
