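【Posted】: 2020-11-10 13:19:03
【Question】:
I have a function that uses C++ substr. On Linux and macOS, this function sometimes crashes randomly at program exit (when it is called from a separate thread).
The function is as follows: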
bool mkpath( string path )
{
    bool bSuccess = false;
    int nRC = ::mkdir( path.c_str(), 0775 );
    if( nRC == -1 )
    {
        switch( errno )
        {
            case ENOENT:
                //parent didn't exist, try to create it
                if( mkpath( path.substr(0, path.find_last_of('/')) ) )
                {
                    //Now, try to create again.
                    int status = ::mkdir( path.c_str(), 0775 );
                    bSuccess = (0 == status || errno == EEXIST);
                }
                else
                {
                    bSuccess = false;
                }
                break;
            case EEXIST:
                //Done!
                bSuccess = true;
                break;
            default:
                bSuccess = false;
                break;
        }
    }
    else
        bSuccess = true;
    return bSuccess;
}
The lldb backtrace is as follows:
* thread #4, stop reason = EXC_BAD_ACCESS (code=2, address=0x70000097fff8)
* frame #0: 0x00007fff73d2a81a libsystem_malloc.dylib`tiny_malloc_from_free_list + 8
frame #1: 0x00007fff73d2a297 libsystem_malloc.dylib`tiny_malloc_should_clear + 288
frame #2: 0x00007fff73d290c6 libsystem_malloc.dylib`szone_malloc_should_clear + 66
frame #3: 0x00007fff73d27d7a libsystem_malloc.dylib`malloc_zone_malloc + 104
frame #4: 0x00007fff73d27cf5 libsystem_malloc.dylib`malloc + 21
frame #5: 0x00007fff70ea0dea libc++abi.dylib`operator new(unsigned long) + 26
frame #6: 0x00007fff70e73d70 libc++.1.dylib`std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char>
>::basic_string(std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char> > const&, unsigned long, unsigned long, std::__1::allocator<char>
const&) + 132
frame #7: 0x0000000103821467 libSampleLibrary.dylib`std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >::substr(unsigned long, unsigned
long) const + 87
frame #8: 0x0000000103820c77 libSampleLibrary.dylib`mkpath(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >) + 135
frame #9: 0x0000000103820c80 libSampleLibrary.dylib`mkpath(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >) + 144
frame #10: 0x0000000103820c80 libSampleLibrary.dylib`mkpath(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >) + 144
frame #11: 0x0000000103820c80 libSampleLibrary.dylib`mkpath(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >) + 144
frame #12: 0x0000000103820c80 libSampleLibrary.dylib`mkpath(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >) + 144
frame #13: 0x0000000103820c80 libSampleLibrary.dylib`mkpath(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >) + 144
frame #14: 0x0000000103820c80 libSampleLibrary.dylib`mkpath(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >) + 144
frame #15: 0x0000000103820c80 libSampleLibrary.dylib`mkpath(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >) + 144
...
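What could possibly cause this? One more detail: the thread that calls it can be cancelled externally, because we added
pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
as the first line of the function that calls this function.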
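Separately from the crash itself, asynchronous cancellation is risky around code like mkpath(), because std::string allocates memory and POSIX lists only pthread_cancel, pthread_setcancelstate and pthread_setcanceltype as async-cancel-safe. A minimal sketch of one way to protect such a region follows. The names CancelGuard and cancellationEnabled are hypothetical and do not come from the original post, and whether this fits the real program is an assumption.

```cpp
#include <pthread.h>

// Hypothetical RAII helper (not from the original post): disables
// cancellation for the calling thread while the guard is alive and
// restores the previous cancel state when it goes out of scope.
class CancelGuard
{
public:
    CancelGuard()
    {
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &m_oldState);
    }
    ~CancelGuard()
    {
        int ignored = 0;
        pthread_setcancelstate(m_oldState, &ignored);
    }
    CancelGuard(const CancelGuard&) = delete;
    CancelGuard& operator=(const CancelGuard&) = delete;

private:
    int m_oldState = PTHREAD_CANCEL_ENABLE;
};

// Hypothetical helper used only to observe the current state:
// returns true if cancellation is currently enabled for this thread.
inline bool cancellationEnabled()
{
    int state = 0;
    int ignored = 0;
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &state); // read the old state
    pthread_setcancelstate(state, &ignored);               // put it back
    return state == PTHREAD_CANCEL_ENABLE;
}
```

With a guard declared at the top of the scope that calls mkpath(), a cancel request that arrives in the meantime stays pending and is acted on only after the guard has been destroyed.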
Whenever this crash occurs, I see the following line more than 5400 times in the backtrace, which is very surprising:
frame #15: 0x0000000103820c80 libSampleLibrary.dylib`mkpath(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >) + 144
I debugged further and found that a static variable (homePath) is corrupted (it returns junk). The mkpath() function gets its value from the following function:
string GetSettingFilePath()
{
    static string homePath = "";
    if(!homePath.empty()){
        // sometimes homePath variable returns junk at the program exit
        return homePath;
    }
    struct passwd* pwd = getpwuid(getuid());
    if (pwd)
    {
        homePath = pwd->pw_dir;
    }
    else
    {
        // try the $HOME environment variable
        homePath = getenv("HOME");
    }
    if (homePath.empty())
    {
        homePath = "./";
    }
    return homePath;
}
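Although the homePath variable is set correctly, it sometimes returns junk at program exit, and that leads to infinite recursion. But why would this static variable inside the function return junk when it is accessed at program exit?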
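The unbounded recursion can be reproduced without threads. For an empty path, mkdir fails with ENOENT, find_last_of('/') returns string::npos, and substr(0, npos) yields the same string again, so the recursive call never gets a shorter argument. A minimal sketch of a guarded variant is shown below. The names parentOf and mkpathGuarded are hypothetical, and this is one possible way to bound the recursion, not the original author's fix.

```cpp
#include <cerrno>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical helper: returns the parent directory of `path`, or an empty
// string when there is no shorter prefix left to try.
std::string parentOf(const std::string& path)
{
    const std::string::size_type pos = path.find_last_of('/');
    if (pos == std::string::npos || pos == 0)
        return std::string();   // no '/' at all, or only the root remains
    return path.substr(0, pos);
}

// Hypothetical guarded variant of mkpath(): every recursive call receives
// a strictly shorter path, so the recursion depth is bounded.
bool mkpathGuarded(const std::string& path)
{
    if (path.empty())
        return false;
    if (::mkdir(path.c_str(), 0775) == 0 || errno == EEXIST)
        return true;
    if (errno != ENOENT)
        return false;

    const std::string parent = parentOf(path);
    if (parent.empty())
        return false;           // nothing shorter to try: stop recursing
    if (!mkpathGuarded(parent))
        return false;

    // The parent exists now, so try to create the directory again.
    return ::mkdir(path.c_str(), 0775) == 0 || errno == EEXIST;
}
```

With this guard a junk or empty path makes the function return false instead of overflowing the thread's stack. It does not explain why the path was junk in the first place.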
【Discussion】:
-
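A crash inside malloc/new usually indicates heap corruption. We need a minimal reproducible example to help further; the corruption may have happened before the code that crashes.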
-
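The problem is not in the substr call. In path.substr(0, path.find_last_of('/')), what may be happening is that no / is found, find_last_of returns string::npos, and substr(0, npos) is just the original string. The root cause is elsewhere.
-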
I have made an edit in the answer at the bottom; maybe it will help.
-
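Something elsewhere in the program may have gone out of bounds of an array or other memory and corrupted the variable. Use a tool such as Valgrind to help you find memory corruption and similar errors.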
-
If your program is exiting, then homePath may already have been destroyed. Please provide a minimal reproducible example.
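To illustrate the last comment: a function-local static with a non-trivial destructor is destroyed during normal program termination, so a thread that is still running at that point can read an already destroyed std::string. One common workaround, sketched here under the assumption of a C++11 or later compiler, is to allocate the string once and never delete it. The names computeHomePath and GetSettingFilePathLeaky are hypothetical, and whether destruction order is really the cause in the asker's program is not confirmed by the post.

```cpp
#include <cstdlib>
#include <string>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: computes the home path once, following the lookup
// order of the original GetSettingFilePath().
static std::string computeHomePath()
{
    std::string result;
    const struct passwd* pwd = getpwuid(getuid());
    if (pwd)
    {
        if (pwd->pw_dir)
            result = pwd->pw_dir;
    }
    else
    {
        // try the $HOME environment variable; getenv may return a null
        // pointer, which must not be assigned to a std::string
        const char* env = std::getenv("HOME");
        if (env)
            result = env;
    }
    if (result.empty())
        result = "./";
    return result;
}

// Hypothetical variant of GetSettingFilePath(): the string lives on the
// heap and is intentionally never deleted, so no destructor runs for it
// at program exit. Since C++11 the initialization of a function-local
// static is also thread-safe.
const std::string& GetSettingFilePathLeaky()
{
    static const std::string* const homePath =
        new std::string(computeHomePath());
    return *homePath;
}
```

The trade-off is a single small allocation that is never freed, which the operating system reclaims when the process ends. Joining or stopping worker threads before main returns avoids the problem without leaking.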