What instructions does qemu trace? Answers

【Question title】: What instructions does qemu trace?
【Posted】: 2020-11-15 17:01:30
【Question description】:

I wrote the following piece of code, which single-steps through /bin/ls and counts its instructions:

#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>
#include <sys/reg.h>    
#include <sys/syscall.h>

int main()
{   
    pid_t child;
    child = fork(); //create child
    
    if(child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        char* child_argv[] = {"/bin/ls", NULL};
        execv("/bin/ls", child_argv);
    }
    else {
        int status;
        long long ins_count = 0;
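        while(1)
        {
            // stop tracing once the child has exited or was killed by a signal
            wait(&status);
            if(WIFEXITED(status) || WIFSIGNALED(status))
                break;

            // the child is stopped: count one step and let it run one more instruction
            ins_count++;
            ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);
        }

        printf("\n%lld Instructions executed.\n", ins_count);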

    }
    
    return 0;
}

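Running this code gives me about 500,000 executed instructions. As far as I know, most of these instructions should come from the dynamic linker. When I trace /bin/ls with QEMU using qemu-x86_64 -singlestep -D log -d in_asm /bin/ls, I get about 17,000 instructions. What do I have to adjust to start and stop counting at the same points QEMU does (i.e. count the same instructions)?

I traced a "return null" program with QEMU and it resulted in 7840 instructions, while my code gave me 109025, so QEMU seems to trace more than just main but less than my code.

My goal is to compare these instructions later, which is why I want to iterate over the same instructions as QEMU does.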

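【Question comments】:

It looks as if ptrace counts the cycles spent in the kernel while qemu does not. Either way, 17,000 cycles seems rather few for all the kernel work that has to be done.

Tags: c linux qemu instructions ptrace

【Solution 1】: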

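QEMU's "in_asm" logging is not a log of executed instructions. It logs an instruction every time that instruction is translated (i.e. when QEMU generates the bit of host code corresponding to it). That translation is then cached, and if the guest loops around and executes the same instruction again, QEMU simply re-uses the same translation, so nothing is logged by in_asm. "in_asm reports many fewer instructions" is therefore expected.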

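Logging every executed instruction via the -d options is a bit tricky. You would need to look at the "cpu" and "exec" traces, use the "nochain" suboption of -d to disable a QEMU optimisation that would otherwise result in some blocks not being logged, use "-singlestep" to force one instruction per block, and also account for a few corner cases where we print the execution trace and then don't actually execute the instruction. This is because the -d option is not intended as a way for users to introspect the behaviour of their program: it is a debugging option intended to allow debugging of what QEMU and the guest program are doing together, so the information it prints requires a little understanding of QEMU internals to interpret correctly.

You might find it simpler to write a QEMU "plugin" instead: https://qemu.readthedocs.io/en/latest/devel/tcg-plugins.html. This is an API designed to make it reasonably straightforward to write instrumentation such as "count instructions executed". If you're lucky, one of the sample plugins might even be sufficient for your purposes.

【Comments】:

Are you familiar with the qemu source code? I just looked through the source and found the function cpu_exec_nocache, which executes code without caching. Wouldn't that solve my problem with re-used instructions? Do you know where I would need to replace the normal exec function with the nocache one?
That is a piece of QEMU internals: it is called in certain odd corner cases where some code needs to be executed without caching the TB. Running code with no TB caching at all is not supported.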

【Solution 2】:

I modified your program to make it run on a dedicated CPU core (e.g. #7) by adding the following code before the fork():

#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
[...]
  cpu_set_t set;
  int rc;

  CPU_ZERO(&set);
  CPU_SET(7, &set);

  // Migrate the calling process to the target CPU
  rc = sched_setaffinity(0, sizeof(cpu_set_t), &set);
  if (0 != rc) {
    fprintf(stderr, "sched_setaffinity(): '%m' (%d)\n", errno);
    return -1;
  }

  // Dummy system call to trigger the migration. Actually, the online
  // manual says that the previous call already makes the current process
  // migrate, but I saw in cpuid's source code that the author calls sleep(0)
  // to make sure that the migration is done. In my opinion, it may
  // be safer to call sched_yield()
  rc = sched_yield();
  if (0 != rc) {
    fprintf(stderr, "sched_yield(): '%m' (%d)\n", errno);
    return -1;
  }

  // Create child
  child = fork();
[...]

My PC runs Ubuntu/Linux 5.4.0:

# Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
# Code name     : Ivy Bridge
# cpu family    : 6
# model     : 58
# microcode : 0x21
# Number of physical cores: 4
# Number of hardware threads: 8
# Base frequency: 3,50 GHz
# Turbo frequency: 3,90 GHz
# cpu MHz: 1604.615
# cache size    : 8192 KB
# cache_alignment: 64
# Address sizes: 36 bits physical, 48 bits virtual
#
# PMU version: 3
# Maximum number of fixed counters: 3
# Fixed counter bit width: 48
# Maximum number of programmable counters: 4
# Programmable counter bit width: 48

If I launch the modified program with ptrace() activated, I get almost the same number as you:

$ test/progexec
[...]
548765 Instructions executed.

I designed a tool that reads the Intel PMU counters. Fixed counter #0 is:

# INST_RETIRED.ANY
#
# Number of instructions that retire execution. For instructions that consist of multiple
# uops, this event counts the retirement of the last uop of the instruction. The counter
# continues counting during hardware interrupts, traps, and inside interrupt handlers.
#

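Reading the above counter on CPU core #7, where the program runs, gives the following results:

1871879 user + kernel space instructions executed (rings 0-3)
546874 user space instructions executed (ring 3)
1324451 kernel space instructions executed (ring 0)

So, according to the above numbers, the program with ptrace(PTRACE_SINGLESTEP) counts the instructions that the traced program executes while running in user space (Intel protection ring #3).

N.B.: Linux uses ring 0 for kernel space and ring 3 for user space.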

【Comments】: