How are stateful CLR delegates marshaled to native C functions which only take a function pointer (answers)

【Question title】: How are stateful CLR delegates marshaled to native C functions which only take a function pointer
【Posted】: 2017-08-10 15:24:21
【Question description】:

(Alternative title: How to implement an equivalent of CLR delegates in C or C++)

Consider this C function:

int Test(int(*fn1)(double a));

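If I were to call this function from a C program, I could not pass an arbitrary state object along with my function pointer; effectively, I could only use global state. This is a common problem, which is why many C APIs offer something like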

int Test(int(*fn1)(double a, void *state), void *state);
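For illustration, here is a minimal sketch of that state-pointer pattern. The names test_with_state, offset_ctx and subtract_offset are hypothetical and not part of the original test code; the API simply hands the opaque pointer back to the callback, which casts it to whatever context it expects.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type that also receives the caller's opaque state pointer. */
typedef int (*callback_fn)(double a, void *state);

/* Hypothetical API in the style of the second Test() signature:
   it passes `state` through to the callback untouched. */
int test_with_state(callback_fn fn, void *state)
{
    assert(fn != NULL);
    return fn(5538867.0, state) + 9;
}

/* Hypothetical per-callback context; plays the role of `this`. */
struct offset_ctx {
    int offset;
};

/* The callback recovers its context from the void pointer. */
int subtract_offset(double a, void *state)
{
    const struct offset_ctx *ctx = state;
    return (int)a - ctx->offset;
}
```

Two different offset_ctx objects can be passed with the same subtract_offset pointer, so no global state is needed; the cost is that the API has to be designed with the extra parameter from the start.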

To my surprise, however, I noticed that this is not a problem when calling the first version of the function from a C# program.

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
delegate int CallbackType(double something);

[DllImport("TestLib.dll", CallingConvention = CallingConvention.Cdecl)]
extern static int Test(CallbackType fn);

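When the callback function is invoked (i.e. in the C# code), the this pointer and all of its members are preserved (which automatically means that additional features such as closures and multicast delegates work without extra effort).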

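I don't understand how the marshaler manages to squeeze the information of two pointers into one. I ran a number of tests and realized that when the C function is called with different instances of TestCallback (i.e. different invocation targets in C#), the function pointer that arrives in C is a different address each time. More precisely, there appears to be a direct 1:1 mapping between TestCallback instances and unique C function pointer addresses, and the address appears to be persistent. However, I could not find where this address is stored within the delegate instance.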

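My conclusion is that upon instantiation of TestCallback, i.e. while the program is running, the CLR must emit a block of executable native code into RAM. That block calls a dispatcher function with a hardcoded state-object pointer (the state object probably being the particular TestCallback instance for which the block was emitted).

However, so far I have found nothing that confirms or refutes this: either there is no in-depth information on the topic, or it is buried under superficial tutorials.

If this is true, how would it work on architectures where program memory and data memory are strictly separated, such that the CPU cannot load runtime-generated code from data memory? How does it work on platforms that require ahead-of-time compilation? And how could something similar be implemented in lower-level languages such as C or C++?

Some additional code I used for testing: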

C header file ===================

extern __declspec(dllexport) int Test(
    int(*fn1)(double a), int *address1,
    int(*fn2)(double a), int *address2,
    int(*fn3)(double a), int *address3
);

C file ===================

int Test(
    int(*fn1)(double a), int *address1,
    int(*fn2)(double a), int *address2,
    int(*fn3)(double a), int *address3
)
{
    *address1 = (int)fn1;
    *address2 = (int)fn2;
    *address3 = (int)fn3;
    int result = fn1(5538867.0);
    result += 9;
    return result;
}

C# file ===================

class Program
{
    static void Main(string[] args)
    {
        var sc1 = new SomeClass(5538867);
        var sc2 = new SomeClass(-999999);
        var callback1 = new CallbackType(sc1.TheCallback);
        var callback2 = new CallbackType(sc2.TheCallback);
        int called_address1 = 0, called_address2 = 0, called_address3 = 0;

        var result = Test(
            callback1, ref called_address1,
            callback2, ref called_address2,
            callback2, ref called_address3
            );
        // should be 9 or 8
        Console.WriteLine(result);
        Console.WriteLine(called_address1.ToString("x8"));
        Console.WriteLine(called_address2.ToString("x8"));
        Console.WriteLine(called_address3.ToString("x8"));

        result = Test(
            callback1, ref called_address1,
            callback2, ref called_address2,
            callback2, ref called_address3
            );

        Console.WriteLine(result);
        Console.WriteLine(called_address1.ToString("x8"));
        Console.WriteLine(called_address2.ToString("x8"));
        Console.WriteLine(called_address3.ToString("x8"));

        GC.KeepAlive(callback1);
        GC.KeepAlive(callback2);

        Console.ReadKey();
    }

    class SomeClass
    {
        public SomeClass(int i)
        {
            this.i = i;
            this.something_else = "sjdklfjksdf";
        }

        private readonly int i;
        private readonly string something_else;

        public int TheCallback(double something)
        {
            return (int)something - this.i + this.something_else.Length - 11;
        }
    }

    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    delegate int CallbackType(double something);

    [DllImport("TestLib.dll", CallingConvention = CallingConvention.Cdecl)]
    extern static int Test(
        CallbackType callback1, ref int called_address1,
        CallbackType callback2, ref int called_address2,
        CallbackType callback3, ref int called_address3
        );
}

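【Comments on the question】:

"such that the CPU cannot load runtime generated code from the data memory?" Which CPU do you have in mind, apart from AVR and other tiny µCs? I can't actually answer your question, but this sentence made me curious.
I know this from certain kinds of microcontrollers, but afaik there may be other architectures like that. The CPU can only load instructions from particular memory pages (which is then usually read-only flash memory), and the data bus can only access a different section. Admittedly, this no longer has anything to do with C#/CLR delegates, but it would make the technique I theorized impossible in plain C.
"The CPU can only load instructions from particular memory pages (which is then usually read-only flash memory), and the data bus can only access a different section." Only Harvard-architecture ones: AVR, PIC, and nowadays some specialized DSPs. But they don't have .NET anyway.

Tags: c# c delegates function-pointers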

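【Answer 1】:

I have found part of the answer.

My conclusion is that upon instantiation of TestCallback, while the program is running, the CLR must emit a block of executable native code into RAM. That block calls a dispatcher function with a hardcoded state-object pointer (the state object probably being the particular TestCallback instance for which the block was emitted).

This is correct; when a delegate is instantiated, a thunk may be generated. It contains instructions that load the context data as a constant and then jump to the actual target function. When the target function returns, it does not return to the thunk (it has no need to) but directly to the original caller.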

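How can this work on architectures where program memory and data memory are strictly separated, such that the CPU cannot load runtime-generated code from data memory?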

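It does not work if the target platform has no memory that is both writable and executable. This affects, for example, non-reprogrammable controllers whose instruction pointer can only point into ROM.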

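How does it work on platforms that require ahead-of-time compilation?

It may not work, at least not on all platforms. On Mono, for example, it requires MonoPInvokeCallbackAttribute, and only static methods can be used as callbacks; see the MSDN article on Xamarin on iOS.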

And how could something similar be implemented in lower-level languages such as C or C++?

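It requires manually writing the necessary x86 (or whatever the target platform is) load and jump instructions into memory as bytes. There are libraries that do this, and they have to know the target platform; there is no compiler support for it.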
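Where runtime code generation is not an option, a different technique can approximate the effect in portable C: a fixed pool of precompiled trampolines, each wired to one slot of a table holding a (target, state) pair. This is not what the CLR does and not the byte-emitting approach described above; it is only a sketch, and bind_callback, unbind_callback, subtract_from and SLOT_COUNT are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*plain_fn)(double a);               /* what the C API accepts */
typedef int (*bound_fn)(void *state, double a);  /* target that needs state */

#define SLOT_COUNT 4

/* One (target, state) pair per precompiled trampoline. */
static struct { bound_fn target; void *state; } slots[SLOT_COUNT];

/* Each trampoline has its own address and a slot index fixed at compile time. */
#define DEFINE_TRAMPOLINE(n) \
    static int trampoline_##n(double a) \
    { return slots[n].target(slots[n].state, a); }

DEFINE_TRAMPOLINE(0)
DEFINE_TRAMPOLINE(1)
DEFINE_TRAMPOLINE(2)
DEFINE_TRAMPOLINE(3)

static const plain_fn trampolines[SLOT_COUNT] = {
    trampoline_0, trampoline_1, trampoline_2, trampoline_3
};

/* Bind `state` to `target`; returns a plain function pointer,
   or NULL when every slot is already in use. */
plain_fn bind_callback(bound_fn target, void *state)
{
    assert(target != NULL);
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (slots[i].target == NULL) {
            slots[i].target = target;
            slots[i].state = state;
            return trampolines[i];
        }
    }
    return NULL;
}

/* Release the slot behind `fn` so it can be reused by a later bind. */
void unbind_callback(plain_fn fn)
{
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (trampolines[i] == fn) {
            slots[i].target = NULL;
            slots[i].state = NULL;
        }
    }
}

/* Example target: subtracts the int that `state` points to. */
int subtract_from(void *state, double a)
{
    return (int)a - *(const int *)state;
}
```

Each call to bind_callback yields a distinct int(*)(double) address that can be handed to a function like the first Test(), mirroring the 1:1 mapping observed in the question. The trade-offs are a hard upper limit on simultaneously bound callbacks and, as written, no thread safety.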

(Note: I could have sworn Hans Passant had posted an answer to this question before.)

【Discussion】: