C++ Heap Corruption in pthread function realloc(): invalid next size

【Question title】: C++ Heap Corruption in pthread function realloc(): invalid next size
【Posted】: 2020-10-26 22:47:34
【Question description】:

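Hello, I have a cross-platform C++ application running on Fedora 25 that crashes after running for about a day with the error realloc(): invalid next size.

I have narrowed the problem down to one particular pthread, which periodically sends updates to connected clients and empties the outgoing message queue. Inside the function the thread runs, I allocate space for a char *, and I free it after sending. I don't normally use C++, so I do the std::string work in the background and convert to char * when needed. I want to make sure I'm not missing something simple, and I'd welcome any tips on how to restructure or fix this.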

static void* MyPThreadFunc(void * params) {
  assert(params);
    MyAppServer *pAppServer = (MyAppServer *)params;    
    if(pAppServer != NULL) {    
       int loopCounter = 1;
       char* tempBuf;
       int tempMsgLen;
       int tempDataSetDelay;
       
       while(true) {
          for(int i=0; i<pAppServer->GetUpdateDataSetCount();i++) {
             tempDataSetDelay = pAppServer->GetDataSetDelay(pAppServer->VecDataSets[i].name);
             if(tempDataSetDelay == 1 ||(tempDataSetDelay > 0 && loopCounter % tempDataSetDelay == 0)) {
                pAppServer->UpdateDataSetMsgStr(pAppServer->VecDataSets[i]);
                tempBuf = (char*)pAppServer->GetDataSetMsgStr(i); //returns const char*
                broadcast(pAppServer->Con,mg_mk_str(tempBuf));
                delete [] tempBuf;
             }//if           
          } //for
          
          //empty outgoing queue
          tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);
          while(tempMsgLen>0) {
             broadcast(pAppServer->Con,mg_mk_str(tempBuf));
             pAppServer->OUtgoingMsgQueue.dequeue();             
             delete [] tempBuf;          
             
             tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);                        
          }
                  
          sleep(1);
          loopCounter = loopCounter==std::numeric_limits<int>::max() ? 1 : ++loopCounter;   
       } //while       
       pAppServer=0;
    }
}

const char* AppServer::GetDataSetMsgStr(const int idx) {
        pthread_mutex_lock(&mLock);
        // Dynamically allocate memory for the returned string
        char* ptr = new char[VecDataSets[idx].curUpdateMsg.size() + 1]; // +1 for terminating NUL

        // Copy source string in dynamically allocated string buffer
        strcpy(ptr, VecDataSets[idx].curUpdateMsg.c_str());

        pthread_mutex_unlock(&mLock);
        // Return the pointer to the dynamically allocated buffer
        return ptr;
}
    
char* MsgQueue::peek(int &len) {
   char* myBuffer  = new char[512];
   len = 0;
   
   pthread_mutex_lock(&mLock);
   if(front==NULL) {
      len = -1;
          pthread_mutex_unlock(&mLock);
      return myBuffer;
     }
     
     len = front->len;
   strncpy(myBuffer,front->chars,len);
   pthread_mutex_unlock(&mLock);
   
   return myBuffer;
}

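【Question comments】:

I don't think you should be using any new[] or delete[] calls at all; use std::string throughout the code instead. Use const char * only where a function requires it, which is easy to get through the c_str() member function. You also have this: tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);, and there is no indication of whether that function uses new[] to allocate the memory (you later call delete [] tempBuf).
Hi @PaulMcKenzie, thanks for your input. I actually have the peek() function at the bottom of the code snippet.
In MsgQueue::peek(), what happens if front->len > 511?
@user3583535 In peek, you allocate memory too early, without even knowing whether it is necessary. You then assume that 512 bytes is enough, which may not be the case (as the previous comment points out). Finally, the allocation is not protected by the mutex, so it isn't thread-safe.
Would this work for peek()? char* MsgQueue::peek(int &len) { pthread_mutex_lock(&mLock); len = 0; if(front==NULL) { len = -1; char* emptyBuffer = new char[1]; pthread_mutex_unlock(&mLock); return emptyBuffer; } len = front->len; char* myBuffer = new char[len+1]; strncpy(myBuffer,front->chars,len); pthread_mutex_unlock(&mLock); return myBuffer; }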

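Tags: c++ heap-corruption char-pointer

【Solution 1】:

I don't know whether this will solve your problem, but your peek function has several issues:

Issue 1: Allocating memory too early.

The first two lines of the function do this unconditionally: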

   char* myBuffer  = new char[512];
   len = 0;

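But later on, your function may find that the queue is empty and set len to -1, in which case no memory needed to be allocated at all. You shouldn't allocate until you are sure there is a reason to.

Issue 2: Allocating memory outside the mutex.

Related to Issue 1 above, you allocate the memory before the mutex is taken. If two or more threads call peek, both of them may allocate memory, leading to a race condition.

All of the allocation should be done under the mutex, not before it is taken.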

pthread_mutex_lock(&mLock);
// now memory can be allocated

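Issue 3: Returning allocated memory on error.

Since peek returns a pointer to allocated memory and the caller then calls delete [] on that pointer, it is perfectly valid to return nullptr without allocating anything (delete [] on a null pointer does nothing).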
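To illustrate why returning a null pointer is safe for the caller, here is a minimal sketch. The names peek_or_null and consume are hypothetical and stand in for peek and its caller; NULL is used rather than nullptr so the sketch also builds as C++98.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for peek(): returns NULL when there is nothing to copy.
char* peek_or_null(bool has_message)
{
    if (!has_message)
        return NULL;          // no allocation on the "empty" path
    char* buf = new char[4];
    buf[0] = 'm'; buf[1] = 's'; buf[2] = 'g'; buf[3] = '\0';
    return buf;
}

// Caller side: the same cleanup line works for both outcomes.
bool consume(bool has_message)
{
    char* buf = peek_or_null(has_message);
    bool got_data = (buf != NULL);
    assert(!has_message || got_data);
    delete [] buf;            // well-defined no-op when buf is NULL
    return got_data;
}
```

The caller still has to check the pointer (or len) before reading from the buffer; only the delete [] is unconditional.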

In your current peek function, you return allocated memory even on error. Here is a suggested rewrite of that part of the code:

if(front==NULL) 
{
   len = -1;
   pthread_mutex_unlock(&mLock);
   return nullptr;
}
len = front->len;
char* myBuffer = new char[len + 1]; // +1 leaves room for a terminating NUL
//...

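Note that nothing is allocated until it is certain there is a reason to allocate.

Issue 4: Assuming there are 512 bytes to allocate.

What if len is greater than 512? You assume the buffer needs to be 512 bytes, but strncpy writes len bytes into it regardless, so a longer message writes past the end of the buffer and corrupts the heap. You should use len, rather than a hard-coded 512, to decide how many bytes to allocate.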
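As a rough sketch of sizing the buffer from the message length, the helper below allocates len + 1 bytes and terminates the copy. copy_message is a hypothetical name, not part of the posted code, and the extra byte assumes the buffer is later read as a C string, as the call mg_mk_str(tempBuf) in the question suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a new[]-allocated, NUL-terminated copy of the
// first `len` bytes of `src`. The caller owns the result and must delete [] it.
char* copy_message(const char* src, std::size_t len)
{
    assert(src != NULL);
    char* buf = new char[len + 1];   // sized from len, +1 for the terminator
    std::memcpy(buf, src, len);      // copy exactly len bytes
    buf[len] = '\0';                 // strncpy would not add this when src has len or more chars
    return buf;
}
```

With this shape a 600-byte message gets a 601-byte buffer instead of overrunning a fixed 512-byte one.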

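Issue 5: Not using RAII to control the mutex.

What if an exception is thrown in the middle of peek (which can happen with new[], since it throws std::bad_alloc on failure)? The mutex stays locked, and any other thread waiting on that mutex is now stalled.

Use RAII to control the lock: a small struct whose destructor automatically calls the mutex's unlock function. That way the mutex is unlocked no matter why peek returns.

In C++11, std::lock_guard does this job for the C++11 threading model. If for some reason you have to stick with pthreads, you can create your own RAII wrapper: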

struct pthread_lock_guard 
{
   pthread_mutex_t* plock;   // the pthread mutex type is pthread_mutex_t
   // Lock on construction, unlock on destruction.
   explicit pthread_lock_guard(pthread_mutex_t* p) : plock(p) { pthread_mutex_lock(plock); }
   ~pthread_lock_guard() { pthread_mutex_unlock(plock); }
};

You would then use it like this:

char* MsgQueue::peek(int &len) 
{
   pthread_lock_guard lg(&mLock);   // locks here, unlocks on every return path
   char* myBuffer  = new char[512];
   len = 0;
   
   if(front==NULL) {
      len = -1;
      return myBuffer;
   }
     
   len = front->len;
   strncpy(myBuffer,front->chars,len);
   return myBuffer;
}

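Ignoring the other issues pointed out, you'll see there are no more calls to unlock the mutex. That is all taken care of when the local lg object is destroyed.

So, taking all of these issues into account, here is the final rewrite of the peek function: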

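struct pthread_lock_guard 
{
   pthread_mutex_t* plock;   // the pthread mutex type is pthread_mutex_t
   // Lock on construction, unlock on destruction.
   explicit pthread_lock_guard(pthread_mutex_t* p) : plock(p) { pthread_mutex_lock(plock); }
   ~pthread_lock_guard() { pthread_mutex_unlock(plock); }
};

char* MsgQueue::peek(int &len) 
{
   pthread_lock_guard lg(&mLock);          // held until peek returns
   if(front==NULL) 
   {
      len = -1;
      return nullptr;                      // nothing to allocate
   }
   len = front->len;
   char* myBuffer  = new char[len + 1];    // sized from len, +1 for the terminating NUL
   strncpy(myBuffer,front->chars,len);
   myBuffer[len] = '\0';                   // strncpy does not guarantee a terminator
   return myBuffer;
}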

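Note that this has not been compiled, so please forgive any compiler errors. Also note that this may not even fix the problem you are seeing now. The above is meant to illustrate all of the potential flaws in the code you showed us in your original question, any of which could lead to the error you are seeing.

As a final point, you really should use container classes such as std::vector<char> or std::string as much as possible, so that you do far less raw dynamic memory handling.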
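To make the container suggestion concrete, here is a minimal sketch of a queue that hands out std::string copies, so the caller never calls new[] or delete []. StringQueue and its members are hypothetical stand-ins for the asker's MsgQueue, whose real layout is not shown in the question; the sketch is written to build as C++98 with pthreads.

```cpp
#include <pthread.h>
#include <cassert>
#include <deque>
#include <string>

// Hypothetical stand-in for the asker's MsgQueue; all names are illustrative.
class StringQueue
{
public:
    StringQueue() { int rc = pthread_mutex_init(&lock_, 0); assert(rc == 0); (void)rc; }
    ~StringQueue() { pthread_mutex_destroy(&lock_); }

    void enqueue(const std::string& msg)
    {
        Guard g(&lock_);
        items_.push_back(msg);
    }

    // Copies the front message into out. Returns false if the queue is empty.
    bool peek(std::string& out)
    {
        Guard g(&lock_);
        if (items_.empty())
            return false;
        out = items_.front();   // std::string grows as needed, no fixed 512 bytes
        return true;
    }

    void dequeue()
    {
        Guard g(&lock_);
        if (!items_.empty())
            items_.pop_front();
    }

private:
    // Small RAII lock so the mutex is released even if a copy throws.
    struct Guard
    {
        explicit Guard(pthread_mutex_t* m) : m_(m) { pthread_mutex_lock(m_); }
        ~Guard() { pthread_mutex_unlock(m_); }
        pthread_mutex_t* m_;
    };

    StringQueue(const StringQueue&);            // non-copyable (C++98 style)
    StringQueue& operator=(const StringQueue&);

    pthread_mutex_t lock_;
    std::deque<std::string> items_;
};
```

With this shape the sending loop becomes while (queue.peek(msg)) { ...; queue.dequeue(); }, passing msg.c_str() wherever a const char * is required, and there is no buffer left to leak or overrun.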

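【Comments】:

Unfortunately I'm using C++98, I believe it's called, so I can't do RAII. Thanks for your input @PaulMcKenzie!!
Also, I said this in the answer: "If for some reason you have to stick with pthreads, you can create your own RAII wrapper". "Some reason" includes being on C++98.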
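RAII relies only on constructors and destructors, which C++98 already has. As a hedged illustration, here is a C++98-compatible sketch of such a wrapper; ScopedPthreadLock and guarded_value are hypothetical names used only for this example.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical C++98-compatible lock guard for a pthread mutex.
class ScopedPthreadLock
{
public:
    explicit ScopedPthreadLock(pthread_mutex_t* m) : m_(m)
    {
        assert(m_ != 0);
        pthread_mutex_lock(m_);      // acquire in the constructor
    }
    ~ScopedPthreadLock() { pthread_mutex_unlock(m_); }   // release on scope exit

private:
    // Declared but not defined: the C++98 way to forbid copying.
    ScopedPthreadLock(const ScopedPthreadLock&);
    ScopedPthreadLock& operator=(const ScopedPthreadLock&);

    pthread_mutex_t* m_;
};

// Hypothetical function with an early return; neither path unlocks by hand.
int guarded_value(pthread_mutex_t* m, const int* value)
{
    ScopedPthreadLock lock(m);
    if (value == 0)
        return -1;                   // destructor unlocks here
    return *value;                   // and here
}
```

No C++11 features are used, so the same pattern should drop into a pthread-based code base built with an older compiler.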

【Solution 2】:

I suspect it is because of a memory leak.

Consider the while loop below.

  //empty outgoing queue
  tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);
  while(tempMsgLen>0) {
     broadcast(pAppServer->Con,mg_mk_str(tempBuf));
     pAppServer->OUtgoingMsgQueue.dequeue();             
     delete [] tempBuf;          
     
     tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);                        
  }

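When tempMsgLen<=0, you do not free the memory, so there is always a 512-byte leak. That is, a leak occurs whenever the loop exits, and even when your thread never enters the loop at all.

As a quick fix, you can add delete [] tempBuf; after this while loop and try it.

Or change the while loop to a do-while loop. Make sure the number of calls to peek() equals the number of deletes.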
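The do-while shape could look roughly like the sketch below, which pairs every peek() with exactly one delete []. FakeQueue is a hypothetical stand-in that mimics the posted peek (it always allocates 512 bytes and reports -1 when empty), with counters added only to make the pairing visible.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the posted MsgQueue: peek() always allocates.
struct FakeQueue
{
    int remaining;   // messages still queued
    int allocs;      // buffers handed out by peek()
    int frees;       // buffers released by the caller

    char* peek(int& len)
    {
        char* buf = new char[512];
        ++allocs;
        if (remaining == 0) { len = -1; buf[0] = '\0'; return buf; }
        std::strcpy(buf, "msg");
        len = 3;
        return buf;
    }
    void dequeue() { if (remaining > 0) --remaining; }
};

// Drains the queue; returns how many messages were "sent".
int drain(FakeQueue& q)
{
    int sent = 0;
    int len = 0;
    do {
        char* buf = q.peek(len);
        if (len > 0) {
            // broadcast(...) would go here in the real code
            q.dequeue();
            ++sent;
        }
        delete [] buf;   // also runs for the final, empty peek
        ++q.frees;
    } while (len > 0);
    return sent;
}
```

Draining a queue of three messages this way makes four peek() calls and four matching deletes, including the last call that finds the queue empty.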

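【Comments】:

Thanks @MayurK. I felt I should be using a do-while loop, since I always need to run it at least once, but I didn't know whether C++ had one. Either way, I would still have had the small leak, which I expect could have led to the heap corruption.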