What happens if ctx.rip and ctx.rsp are modified in a signal handler?

【Question title】: What happens if ctx.rip and ctx.rsp in a signal handler are modified?
【Posted】: 2019-12-08 14:57:10
【Question】:

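As we know, a program interrupted by a signal traps into kernel space, and the kernel then switches to the user-space signal handler. When the handler finishes, control re-enters the kernel, which then switches back to the point of interruption.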

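I have recently been reading about the asynchronous preemption newly implemented in Go 1.14, which uses OS signals to interrupt "non-preemptible" user goroutines. I am debugging a very simple program: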

package main

import (
    "runtime"
    "time"
)

func tightloop() {
    for {
    }
}

func main() {
    runtime.GOMAXPROCS(1)
    go tightloop()

    time.Sleep(time.Millisecond)
    println("OK")
    runtime.Gosched()
}

In Go 1.14, when the preemption signal arrives, tightloop is interrupted by the OS and control enters the preconfigured signal handler runtime·sigtramp:

TEXT runtime·sigtramp(SB),NOSPLIT,$72
    MOVQ    DX, ctx-56(SP)
    MOVQ    SI, info-64(SP)
    MOVQ    DI, signum-72(SP)
    MOVQ    $runtime·sigtrampgo(SB), AX
    CALL AX
    RET

where sigtrampgo eventually calls sighandler.

//go:nosplit
//go:nowritebarrierrec
func sigtrampgo(sig uint32, info *siginfo, ctx unsafe.Pointer) {
    (...)
    setg(g.m.gsignal)
    (...)
    sighandler(sig, info, ctx, g)
    setg(g)
    (...)
}

As far as I can tell from reading sighandler, it calls doSigPreempt, which modifies the ctx passed in by the kernel and sets rip to the prologue of runtime.asyncPreempt.

//go:nowritebarrierrec
func sighandler(sig uint32, info *siginfo, ctxt unsafe.Pointer, gp *g) {
    _g_ := getg()
    c := &sigctxt{info, ctxt}

    (...)
    if sig == sigPreempt {
        doSigPreempt(gp, c)
    }
}
func doSigPreempt(gp *g, ctxt *sigctxt) {
    if canPreempt {
        // this call modifies rip and rsp in the saved context
        ctxt.pushCall(funcPC(asyncPreempt))
    }

    (...)
}
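
To make the effect of pushCall concrete, here is a minimal sketch that models it outside the runtime. The fakeContext and fakeStack types and the free-standing pushCall function are hypothetical stand-ins, not the runtime's own definitions, and the sketch assumes that pushCall behaves like an injected CALL instruction (push the interrupted rip, then point rip at the target). The register values are the ones from the print log in this question.

```go
package main

import "fmt"

// fakeContext is a hypothetical stand-in for the saved register state that
// the kernel hands to a signal handler. Only rip and rsp are modeled.
type fakeContext struct {
	rip uint64
	rsp uint64
}

// fakeStack models user stack memory as a map from address to 8-byte word.
type fakeStack map[uint64]uint64

const ptrSize = 8 // pointer size on amd64

// pushCall makes the saved context look as if the interrupted instruction
// had just executed "CALL target": the old rip is pushed as the return
// address and rip is redirected to target.
func pushCall(ctx *fakeContext, stack fakeStack, target uint64) {
	ctx.rsp -= ptrSize
	stack[ctx.rsp] = ctx.rip
	ctx.rip = target
}

func main() {
	// Values taken from the "rip: ... eip: ..." lines of the log.
	ctx := &fakeContext{rip: 17149264, rsp: 824634034136}
	stack := fakeStack{}
	const asyncPreemptPC = 17124704

	pushCall(ctx, stack, asyncPreemptPC)

	fmt.Println("resume at rip:", ctx.rip)
	fmt.Println("new rsp:", ctx.rsp)
	fmt.Println("return address on stack:", stack[ctx.rsp])
}
```

Under this model, when the handler returns and the kernel restores the context, execution continues at the target, and a later RET from the target pops the original rip, so the interrupted code resumes where it left off.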

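However, I noticed that asyncPreempt is not executed immediately after the signal handler finishes. Instead:

morestack or morestack_noctxt is called after sighandler returns (without entering an epilogue or prologue). It calls newstack, which checks the preemption flag and enters the scheduling loop, so the main goroutine gets scheduled and the asynchronous preemption completes.
OK is printed before asyncPreempt executes.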

Here are the print logs I inserted into the runtime:

mstart1 call schedule()
enter schedule()
park_m call schedule()
enter schedule()
mstart1 call schedule()
enter schedule()
mstart1 call schedule()
enter schedule()
park_m call schedule()
enter schedule()
park_m call schedule()
enter schedule()
park_m call schedule()
enter schedule()
mstart1 call schedule()
enter schedule()
park_m call schedule()
enter schedule()
rip: 17149264 eip: 824634034136
before pushCall asyncPreempt
after pushCall asyncPreempt
rip: 17124704 eip: 824634034128      // rip points to asyncPreempt
calling newstack: m0, g0             // how can newstack be called here?
newstack call gopreempt_m
gopreempt_m call goschedImpl
goschedImpl call schedule()
enter schedule()
OK
gosched_m call goschedImpl
goschedImpl call schedule()
enter schedule()
asyncPreempt2
asyncPreempt2
asyncPreempt2
asyncPreempt2
preemptPark
gopreempt_m call goschedImpl
goschedImpl call schedule()
enter schedule()

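When I inspect the dumped assembly, there is no stack-split check in either asyncPreempt or sigtramp.

Sorry for the long story. My questions are:

After sighandler, when, by whom, and how is morestack called in the runtime? What am I missing?
Does modifying ctx cause the program to jump to the modified rip once the signal handler finishes?

Thank you very much for reading the question, and thanks to the Go team for building such a great feature.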

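【Comments】:

I haven't looked at the Go 1.14 code, but in general the Linux/Unix signal handling system works by delivering the signal on the signal stack that the process set up in advance, or on the current stack if it has not set one up. To interrupt the running program, instead of letting the kernel's own interrupt/fault handler return to the interrupted/faulting user instruction, the kernel writes the saved context into a signal stack frame, then sets the program counter appropriately and "returns" to the user signal trampoline. This trampoline is responsible for (continued)
... handling the signal and eventually using the sigreturn system call (or similar) to restore the register values and return to user code. In general, to reschedule in a user-level scheduler, you change the PC (and any other special registers) saved in the sigreturn data structure, after first saving the interrupted PC/registers so that the interrupted thing can be resumed later. In this case that corresponds to setting ctx rip, so it is more or less what you suggest.
@torek If I understand correctly, the kernel really does restore the modified registers, so execution can resume somewhere other than the point of interruption. If so, it gets really interesting: the signal handler in the Go runtime sets rip to asyncPreempt, but asyncPreempt is not executed after the signal handler returns. Instead, morestack is called. Any ideas?
I'd bet there is some other boilerplate somewhere that makes sure a newly started goroutine calls morestack first. Exactly where, well, that tends to be tricky.
@torek, that seems to be wrong. I checked the generated assembly: asyncPreempt has no prologue, which means that if the kernel resumes at asyncPreempt, it should start executing it immediately.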

Tags: go signals

【Solution 1】:

I have figured it out, many thanks to Ian for the hint:

https://groups.google.com/forum/#!topic/golang-nuts/BA7Dqp_zcwk

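The root cause seems to be something like the "uncertainty principle".

As the observer, I changed the actual behavior after signal handling by adding println calls to asyncPreempt and asyncPreempt2. println involves a stack-split check, which calls morestack.

It took me a while to realize that morestack stores its caller's PC in g.m.morebuf.pc, because getcallerpc in newstack always returns a PC inside morestack, which is not very informative.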

//go:nosplit
func asyncPreempt2() {
    // println("asyncPreempt2 is called") // with this println commented out, morestack is no longer called here.
    gp := getg()
    gp.asyncSafePoint = true
    if gp.preemptStop {
        mcall(preemptPark)
    } else {
        mcall(gopreempt_m)
    }
    println("asyncPreempt2 finished")
    gp.asyncSafePoint = false
}
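
The point above about getcallerpc in newstack can be illustrated by analogy in ordinary Go code. This is only a sketch: fakePrintln, fakeMorestack, fakeNewstack, and morebufCaller are hypothetical names that mimic the call chain, and runtime.Callers is used in place of the runtime-internal getcallerpc. It shows that the innermost function only ever sees the trampoline as its caller, while the trampoline itself has to record who really triggered it.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// morebufCaller is a hypothetical analogue of g.m.morebuf.pc: it records
// who called fakeMorestack.
var morebufCaller string

// callerOfCaller returns the short name of the function that called the
// function invoking callerOfCaller.
//
//go:noinline
func callerOfCaller() string {
	pcs := make([]uintptr, 1)
	// skip 0 = runtime.Callers, 1 = callerOfCaller,
	// 2 = the function invoking it, 3 = that function's caller.
	if runtime.Callers(3, pcs) == 0 {
		return ""
	}
	frame, _ := runtime.CallersFrames(pcs).Next()
	name := frame.Function
	return name[strings.LastIndex(name, ".")+1:]
}

// fakeNewstack asks for its caller, which is always fakeMorestack.
//
//go:noinline
func fakeNewstack() string {
	return callerOfCaller()
}

// fakeMorestack saves its own caller before handing off to fakeNewstack.
//
//go:noinline
func fakeMorestack() string {
	morebufCaller = callerOfCaller()
	return fakeNewstack()
}

// fakePrintln stands in for a function whose prologue triggers morestack.
//
//go:noinline
func fakePrintln() string {
	return fakeMorestack()
}

func main() {
	seen := fakePrintln()
	fmt.Println("caller seen from fakeNewstack:", seen)
	fmt.Println("caller saved by fakeMorestack:", morebufCaller)
}
```

In the same way, a print placed in newstack that uses getcallerpc cannot tell you which function needed more stack; the saved morebuf value is the place to look.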

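TL;DR: after the signal handler returns, the kernel restores rip, which now points to asyncPreempt, and switches directly to it. Nothing happens between runtime.sigtramp and runtime.asyncPreempt.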

【Comments】: