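【Posted】: 2018-04-19 04:57:58
【Question】:
I'm trying to write a C++ "wrapper" for the POSIX timer system on Linux, so that my C++ program can set timeouts using the system clock (for example, while waiting for a message to arrive over the network) without dealing with POSIX's ugly C interface. It seems to work most of the time, but occasionally my program segfaults after running successfully for several minutes. The problem seems to be that the memory of my LinuxTimerManager object (or one of its member objects) gets corrupted. Unfortunately, the problem does not appear when I run the program under Valgrind, so I have been stuck staring at my code trying to figure out what is wrong.
Here is the core of my timer wrapper implementation: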
LinuxTimerManager.h:
namespace util {
using timer_id_t = int;
class LinuxTimerManager {
private:
    timer_id_t next_id;
    std::map<timer_id_t, timer_t> timer_handles;
    std::map<timer_id_t, std::function<void(void)>> timer_callbacks;
    std::set<timer_id_t> cancelled_timers;
    friend void timer_signal_handler(int signum, siginfo_t* info, void* ucontext);
public:
    LinuxTimerManager();
    timer_id_t register_timer(const int delay_ms, std::function<void(void)> callback);
    void cancel_timer(const timer_id_t timer_id);
};
void timer_signal_handler(int signum, siginfo_t* info, void* ucontext);
}
LinuxTimerManager.cpp:
namespace util {
LinuxTimerManager* tm_instance;
LinuxTimerManager::LinuxTimerManager() : next_id(0) {
    tm_instance = this;
    struct sigaction sa = {0};
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = timer_signal_handler;
    sigemptyset(&sa.sa_mask);
    int success_flag = sigaction(SIGRTMIN, &sa, NULL);
    assert(success_flag == 0);
}
void timer_signal_handler(int signum, siginfo_t* info, void* ucontext) {
    timer_id_t timer_id = info->si_value.sival_int;
    auto cancelled_location = tm_instance->cancelled_timers.find(timer_id);
    //Only fire the callback if the timer is not in the cancelled set
    if(cancelled_location == tm_instance->cancelled_timers.end()) {
        tm_instance->timer_callbacks.at(timer_id)();
    } else {
        tm_instance->cancelled_timers.erase(cancelled_location);
    }
    tm_instance->timer_callbacks.erase(timer_id);
    timer_delete(tm_instance->timer_handles.at(timer_id));
    tm_instance->timer_handles.erase(timer_id);
}
timer_id_t LinuxTimerManager::register_timer(const int delay_ms, std::function<void(void)> callback) {
    struct sigevent timer_event = {0};
    timer_event.sigev_notify = SIGEV_SIGNAL;
    timer_event.sigev_signo = SIGRTMIN;
    timer_event.sigev_value.sival_int = next_id;
    timer_t timer_handle;
    int success_flag = timer_create(CLOCK_REALTIME, &timer_event, &timer_handle);
    assert(success_flag == 0);
    timer_handles[next_id] = timer_handle;
    timer_callbacks[next_id] = callback;
    struct itimerspec timer_spec = {0};
    timer_spec.it_interval.tv_sec = 0;
    timer_spec.it_interval.tv_nsec = 0;
    timer_spec.it_value.tv_sec = 0;
    timer_spec.it_value.tv_nsec = delay_ms * 1000000;
    timer_settime(timer_handle, 0, &timer_spec, NULL);
    return next_id++;
}
void LinuxTimerManager::cancel_timer(const timer_id_t timer_id) {
    if(timer_handles.find(timer_id) != timer_handles.end()) {
        cancelled_timers.emplace(timer_id);
    }
}
}
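When my program crashes, the segfault always comes from timer_signal_handler(), usually at the line tm_instance->timer_callbacks.erase(timer_id) or tm_instance->timer_handles.erase(timer_id). The actual segfault is raised somewhere deep inside the std::map implementation (i.e. stl_tree.h).
Is my memory corruption caused by a race condition between different timer signals modifying the same LinuxTimerManager? I thought only one timer signal is delivered at a time, but maybe I misunderstood the man page. Is it generally unsafe to have a Linux signal handler modify a complex C++ object like std::map?
【Comments】: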
-
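The only things that can safely be called from a signal handler are async-signal-safe functions. The objects your signal handler accesses are not necessarily in a consistent state.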
-
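Signal handlers may only call async-signal-safe functions. Unless a function is explicitly listed as async-signal-safe, and STL member functions are not (especially if they allocate memory), you should assume that it is not. What you can do is set a volatile flag, then check it in regular program context and do your work there.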
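The flag approach suggested in the comment above could look roughly like the sketch below. The names `sketch`, `timer_fired`, `on_timer_signal`, `install_handler` and `consume_timer_event` are made up for illustration and do not come from the original code. The sketch also assumes a single flag is enough, so it does not carry the timer ID the way the question's handler does.

```cpp
#include <signal.h>  // sigaction, sigemptyset, sig_atomic_t (POSIX)

// Hypothetical names, for illustration only.
namespace sketch {

// volatile sig_atomic_t can be written from a signal handler and read
// from normal code without tearing.
volatile sig_atomic_t timer_fired = 0;

// The handler only sets the flag: no std::map access, no allocation,
// no std::function call.
extern "C" void on_timer_signal(int) {
    timer_fired = 1;
}

// Installs the flag-setting handler for signum; returns true on success.
bool install_handler(int signum) {
    struct sigaction sa {};
    sa.sa_handler = on_timer_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signum, &sa, nullptr) == 0;
}

// Called from the regular program loop. Returns true if the signal was
// seen since the last call and clears the flag. Several signals that
// arrive between two calls are reported as a single event.
bool consume_timer_event() {
    if (timer_fired) {
        timer_fired = 0;
        return true;
    }
    return false;
}

}  // namespace sketch
```

In the question's setup, `install_handler(SIGRTMIN)` would presumably replace the `sigaction` call in the constructor, and the main loop would call `consume_timer_event()` and only then touch `timer_callbacks` and `timer_handles`, outside of signal context. Since one flag cannot tell which timer expired, the main loop would have to find the expired timers some other way, for example by comparing stored deadlines with the current time.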