Is there a way to do currying in C? Answers

【Question title】: Is there a way to do currying in C?
【Posted】: 2014-12-30 10:55:43
【Question description】:

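Suppose I have a pointer to a function _stack_push(stack* stk, void* el). I want to be able to call curry(_stack_push, my_stack) and get back a function that takes only void* el. I couldn't think of a way to do it, since C doesn't allow runtime function definition, but I know there are far cleverer people than me here :). Any ideas?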

【Question comments】:

Tags: c functional-programming currying

【Solution 1】:

"since C doesn't allow runtime function definition"

This is true in principle for standard C. Read n1570 for details.

In practice, however, it may be false. Consider:

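On POSIX systems (e.g. Linux), generate some C code at runtime into a temporary file such as /tmp/generated1234.c that defines some function void genfoo1234(void), compile that file (e.g. with a recent GCC, as gcc -O -fPIC -Wall -shared /tmp/generated1234.c -o /tmp/generated1234.so), then call dlopen(3) on /tmp/generated1234.so and dlsym(3) for genfoo1234 on the handle returned by dlopen to obtain a function pointer. In my experience, this approach is fast enough today (2021, on a Linux laptop) even for interactive use, provided each generated C file has no more than a few hundred lines of C code.
On x86, x86-64, or ARM processors, use a machine code generation library such as GNU lightning or libgccjit (or, in C++, asmjit).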

In practice, you would generate code for a closure (a function pointer grouped with its closed-over values) and use it as a callback.
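
The notion of a closure as a function pointer grouped with its closed-over values can be sketched in portable C without any code generation, provided the callback API passes a context pointer through. This is only an illustration of the grouping, not the runtime-generated variant described above; the stack type and the names closure_t, curry_stack_fn, closure_call, for_each and call_closure_cb are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity stack, standing in for the question's `stack`. */
typedef struct {
    void  *items[16];
    size_t count;
} stack;

/* The two-argument function we want to partially apply. */
int stack_push(stack *stk, void *el)
{
    if (stk->count == sizeof stk->items / sizeof stk->items[0])
        return -1;                       /* stack is full */
    stk->items[stk->count++] = el;
    return 0;
}

/* A closure: the function pointer grouped with the value it closes over. */
typedef struct {
    int   (*fn)(stack *, void *);
    stack  *bound;
} closure_t;

/* Bind the first argument of `fn` to `stk`. */
closure_t curry_stack_fn(int (*fn)(stack *, void *), stack *stk)
{
    assert(fn != NULL && stk != NULL);
    closure_t c = { fn, stk };
    return c;
}

/* Invoke the closure with the one remaining argument. */
int closure_call(const closure_t *c, void *el)
{
    return c->fn(c->bound, el);
}

/* A callback-taking API that hands an opaque context back to the callback. */
void for_each(void **els, size_t n, int (*cb)(void *ctx, void *el), void *ctx)
{
    for (size_t i = 0; i < n; i++)
        cb(ctx, els[i]);
}

/* Adapter with the callback signature expected by for_each. */
int call_closure_cb(void *ctx, void *el)
{
    return closure_call((const closure_t *)ctx, el);
}
```

A caller would write closure_t push = curry_stack_fn(stack_push, &my_stack); and then pass call_closure_cb together with &push to for_each. When the callback API offers no context parameter, this explicit form is not enough, which is what motivates the code-generating approaches in this thread.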

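A related topic is garbage collection; read the Garbage Collection Handbook.

Also consider embedding an existing interpreter in your application, such as Lua, GNU Guile, or Python.

Study the source code of those interpreters, at least for inspiration.

Queinnec's book Lisp in Small Pieces and the Dragon Book are worth reading. Both explain practical issues and implementation details.

See also __builtin_call_with_static_chain in recent GCC compilers (2021).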

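【Comments】:

【Solution 2】:

The good news: there is a way to do this in standard ANSI C, without using any compiler-specific features. (In particular, it does not require GCC's nested function support.)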

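The bad news: it requires creating a small piece of executable code at runtime to act as a trampoline function. This means the implementation will depend on:

The processor instruction set
The ABI (in particular, the function calling convention)
The operating system's ability to mark data as executable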

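The best news: if you just need to do this in real production code, you should use the closure API of libffi. It is permissively licensed and contains careful, flexible implementations for many platforms and ABIs.

If you're still here, you want to nerd out and understand how to implement this "from scratch".

The program below demonstrates how to curry a 2-parameter function into a 1-parameter function in C, given:

The x86-64 processor architecture
The System V ABI
The Linux operating system

It is based on the "Trampoline Illustration" from Infectious Executable Stacks, but with the trampoline structure stored on the heap (via malloc) rather than on the stack. This is much safer, because it means we don't have to disable the compiler's stack-execution protection (no gcc -Wl,-z,execstack).

It uses the Linux mprotect system call to make the heap object executable.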

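The essence of the program is that it takes a pointer to a two-parameter function (uint32_t (*fp2)(uint32_t a, uint32_t b)) and converts it into a pointer to a one-parameter function (uint32_t (*fp1)(uint32_t a)) that invokes fp2 with a preset value for the parameter b. It does this by creating a small 3-instruction trampoline function: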

movl $imm32, %esi  /* $imm32 filled in with the value of 'b' */
movq $imm64, %rax  /* $imm64 filled in with the value of 'fp2' */
jmpq *%rax

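With the values of b and fp2 spliced into it, a pointer to the block of memory containing these 3 instructions can be used as the one-parameter function pointer fp1 described above. This works because it obeys the x86-64 System V calling convention, in which a one-parameter function receives its first parameter in the %edi/%rdi register, and a two-parameter function receives its second parameter in the %esi/%rsi register. Here, the one-parameter trampoline receives its uint32_t parameter in %edi, fills in the value of the second uint32_t parameter in %esi, and then jumps directly to the "real" two-parameter function, which expects its two parameters in exactly those registers.

Here is the full working code, which I have also put on GitHub at dlenski/c-curry: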

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#include <unistd.h>
#include <sys/mman.h>
#include <stdint.h>

#define PAGE_START(P) ((uintptr_t)(P) & ~(pagesize-1))
#define PAGE_END(P)   (((uintptr_t)(P) + pagesize - 1) & ~(pagesize-1))

/* x86-64 ABI passes parameters in rdi, rsi, rdx, rcx, r8, r9
 * (https://wiki.osdev.org/System_V_ABI), and return value
 * goes in %rax.
 *
 * Binary format of useful opcodes:
 *
 *       0xbf, [le32] = movl $imm32, %edi (1st param)
 *       0xbe, [le32] = movl $imm32, %esi (2nd param)
 *       0xba, [le32] = movl $imm32, %edx (3rd param)
 *       0xb9, [le32] = movl $imm32, %ecx (4th param)
 *       0xb8, [le32] = movl $imm32, %eax
 * 0x48, 0x__, [le64] = movq $imm64, %r__
 *       0xff, 0xe0   = jmpq *%rax
 */

typedef uint32_t (*one_param_func_ptr)(uint32_t);
one_param_func_ptr curry_two_param_func(
    void *two_param_func, 
    uint32_t second_param)
{
    /* This is a template for calling a "curried" version of 
     * uint32_t (*two_param_func)(uint32_t a, uint32_t b),
     * using the Linux x86-64 ABI. The curried version can be
     * treated as uint32_t (*one_param_func)(uint32_t a).
     */
    uintptr_t fp = (uintptr_t)two_param_func;
    uint8_t template[] = {
        0xbe, 0, 0, 0, 0,                                   /* movl $imm32, %esi */
        0x48, 0xb8, fp >>  0, fp >>  8, fp >> 16, fp >> 24, /* movq fp, %rax */
                    fp >> 32, fp >> 40, fp >> 48, fp >> 56,
        0xff, 0xe0                                          /* jmpq *%rax */
    };
    
    /* Now we create a copy of this template on the HEAP, and
     * fill in the second param. */
    uint8_t *buf = malloc(sizeof(template));
    if (!buf)
        return NULL;
    
    memcpy(buf, template, sizeof(template));
    buf[1] = second_param >> 0;
    buf[2] = second_param >> 8;
    buf[3] = second_param >> 16;
    buf[4] = second_param >> 24;
    
    /* We do NOT want to make the stack executable,
     * but we NEED the heap-allocated buf to be executable.
     * Compiling with 'gcc -Wl,-z,execstack' would do BOTH.
     *
     * This appears to be the right way to only make a heap object executable:
     *   https://stackoverflow.com/a/23277109/20789
     */
    uintptr_t pagesize = sysconf(_SC_PAGE_SIZE);
    if (mprotect((void *)PAGE_START(buf),
                 PAGE_END(buf + sizeof(template)) - PAGE_START(buf),
                 PROT_READ|PROT_WRITE|PROT_EXEC) != 0) {
        free(buf);      /* could not make the trampoline executable */
        return NULL;
    }
    
    return (one_param_func_ptr)buf;
}

/********************************************/

int print_both_params(int a, int b)
{
    printf("Called with a=%d, b=%d\n", a, b);
    return a+b;
}

int main(int argc, char **argv)
{
    one_param_func_ptr print_both_params_b4 =
        curry_two_param_func(print_both_params, 4);
    one_param_func_ptr print_both_params_b256 = 
        curry_two_param_func(print_both_params, 256);
    
    print_both_params_b4(3);    // "Called with a=3, b=4"
    print_both_params_b256(6);  // "Called with a=6, b=256"

    return 0;
}
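
One caveat about the listing above: POSIX only specifies mprotect for memory obtained from mmap, and changing the protection of a malloc'd page also affects whatever else shares that page. A hedged variant, under the same x86-64 System V and Linux assumptions, maps a dedicated page, writes the trampoline into it, and then switches the page to read+execute so that it is never writable and executable at once. The names curry_two_param_func_mmap and add_u32 are hypothetical; on other platforms, or when the mapping is refused, this sketch simply returns NULL.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* asks glibc to expose MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

typedef uint32_t (*one_param_func_ptr)(uint32_t);
typedef uint32_t (*two_param_func_ptr)(uint32_t, uint32_t);

one_param_func_ptr curry_two_param_func_mmap(two_param_func_ptr f, uint32_t b)
{
#if defined(__x86_64__) && defined(__linux__) && defined(MAP_ANONYMOUS)
    assert(f != NULL);
    uint8_t code[] = {
        0xbe, 0, 0, 0, 0,                     /* movl $imm32, %esi */
        0x48, 0xb8, 0, 0, 0, 0, 0, 0, 0, 0,   /* movq $imm64, %rax */
        0xff, 0xe0                            /* jmpq *%rax */
    };
    uint64_t target = (uint64_t)(uintptr_t)f;
    memcpy(code + 1, &b, sizeof b);           /* x86-64 is little-endian */
    memcpy(code + 7, &target, sizeof target);

    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;
    memcpy(page, code, sizeof code);
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return NULL;
    }
    return (one_param_func_ptr)page;
#else
    (void)f; (void)b;
    return NULL;    /* the machine code above is x86-64 specific */
#endif
}

/* Example two-parameter function to curry. */
uint32_t add_u32(uint32_t a, uint32_t b) { return a + b; }
```

This spends a whole page per trampoline, which is wasteful but keeps the sketch short; a real implementation would pack many trampolines into one page and release it with munmap when they are no longer needed.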

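【Comments】:

Very nice. Out of curiosity, what was your motivation for building this?
I was trying to figure out whether there was a way to pass arbitrary user data into a callback function whose API makes no provision for it (a major design flaw). I went down a rabbit hole and ended up writing this. It turns out that libffi has an API for this, and it works on many platforms and ABIs. Still, it was fun to figure out and explore...

【Solution 3】:

Here is one approach to currying in C. Although this sample application uses C++ iostream output for convenience, it is all C-style coding.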

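The key to this approach is a struct containing an array of unsigned char, which is used to build up an argument list for a function. The function to be called is specified as one of the arguments pushed into the array. The resulting array is then handed to a proxy function, which actually performs the call of the function with the closed-over arguments.

In this example I provide a few type-specific helper functions for pushing arguments into the closure, as well as a generic pushMem() function for pushing a struct or other memory region.

This approach does require allocating a memory area that is then used for the closure data. It is best to use the stack for this memory area so that memory management does not become an issue. There is also the question of how large to make the closure storage area: large enough for the necessary arguments, yet not so large that memory or stack space is wasted on unused bytes.

I have also experimented with a slightly differently defined closure struct that contains an additional field recording the currently used size of the array that stores the closure data. This closure struct is then used with modified helper functions, which removes the need for the user of the helper functions to maintain their own unsigned char * pointer when adding arguments to the closure struct.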
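
The size-tracking variant just described might look like the following portable C sketch. It is not the author's code: the names Closure, closure_init, closure_push, closure_push_int, closure_call_int2 and sum2 are hypothetical. Unlike the macro-based program below, it unpacks the stored arguments with memcpy and calls through a correctly typed function pointer, so it does not rely on the __cdecl stack layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CLOSURE_CAPACITY 64

/* Closure storage: the target function, a byte buffer of packed
 * arguments, and the number of bytes of that buffer currently in use. */
typedef struct {
    void        (*fn)(void);             /* generic function pointer */
    size_t        used;
    unsigned char args[CLOSURE_CAPACITY];
} Closure;

void closure_init(Closure *c, void (*fn)(void))
{
    c->fn = fn;
    c->used = 0;
}

/* Append nbytes from p; returns 0 on success, -1 if the buffer is full. */
int closure_push(Closure *c, const void *p, size_t nbytes)
{
    if (nbytes > CLOSURE_CAPACITY - c->used)
        return -1;
    memcpy(c->args + c->used, p, nbytes);
    c->used += nbytes;
    return 0;
}

/* Type-specific helper, analogous to pushInt() in the program below. */
int closure_push_int(Closure *c, int i)
{
    return closure_push(c, &i, sizeof i);
}

/* Proxy for targets of type int(int, int): unpack two ints and call. */
int closure_call_int2(const Closure *c)
{
    int a, b;
    assert(c->used == 2 * sizeof(int));
    memcpy(&a, c->args, sizeof a);
    memcpy(&b, c->args + sizeof a, sizeof b);
    return ((int (*)(int, int))c->fn)(a, b);
}

/* Example target function. */
int sum2(int i, int j) { return i + j; }
```

The price of staying within defined behavior is one proxy function per target signature, instead of a single variadic cast that happens to work with one calling convention.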

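Notes and caveats

The following example program was compiled and tested with Visual Studio 2013. The output of this example is provided below. I am not sure how this example behaves with GCC or Clang, nor what problems a 64-bit compiler might cause, since my impression is that my testing was done with a 32-bit application. Also, this approach appears to work only with functions that use the standard C declaration, in which the calling function pops the arguments off the stack after the callee returns (__cdecl rather than __stdcall in the Windows API).

Since we build the argument list at runtime and then call a proxy function, this approach does not allow the compiler to check the arguments. This can lead to mysterious failures caused by mismatched argument types that the compiler cannot flag.

Example application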

// currytest.cpp : Defines the entry point for the console application.
//
// while this is C++ using the standard C++ I/O it is written in
// a C style so as to demonstrate use of currying with C.
//
// this example shows implementing a closure with C function pointers
// along with arguments of various kinds. the closure is then used
// to provide a saved state which is used with other functions.

#include "stdafx.h"
#include <iostream>
#include <cstring>   // memcpy, used by pushMem()

// notation is used in the following defines
//   - tname is used to represent type name for a type
//   - cname is used to represent the closure type name that was defined
//   - fname is used to represent the function name

#define CLOSURE_MEM(tname,size) \
    typedef struct { \
        union { \
            void *p; \
            unsigned char args[size + sizeof(void *)]; \
        }; \
    } tname;

#define CLOSURE_ARGS(x,cname) *(cname *)(((x).args) + sizeof(void *))
#define CLOSURE_FTYPE(tname,m) ((tname((*)(...)))(m).p)

// define a call function that calls specified function, fname,
// that returns a value of type tname using the specified closure
// type of cname.
#define CLOSURE_FUNC(fname, tname, cname) \
    tname fname (cname m) \
    { \
        return ((tname((*)(...)))m.p)(CLOSURE_ARGS(m,cname)); \
    }

// helper functions that are used to build the closure.
unsigned char * pushPtr(unsigned char *pDest, void *ptr) {
    *(void * *)pDest = ptr;
    return pDest + sizeof(void *);
}

unsigned char * pushInt(unsigned char *pDest, int i) {
    *(int *)pDest = i;
    return pDest + sizeof(int);
}

unsigned char * pushFloat(unsigned char *pDest, float f) {
    *(float *)pDest = f;
    return pDest + sizeof(float);
}

unsigned char * pushMem(unsigned char *pDest, void *p, size_t nBytes) {
    memcpy(pDest, p, nBytes);
    return pDest + nBytes;
}


// test functions that show they are called and have arguments.
int func1(int i, int j) {
    std::cout << " func1 " << i << " " << j;
    return i + 2;
}

int func2(int i) {
    std::cout << " func2 " << i;
    return i + 3;
}

float func3(float f) {
    std::cout << " func3 " << f;
    return f + 2.0;
}

float func4(float f) {
    std::cout << " func4 " << f;
    return f + 3.0;
}

typedef struct {
    int i;
    const char *xc;   // points at string literals, so it must be const
} XStruct;

int func21(XStruct m) {
    std::cout << " fun21 " << m.i << " " << m.xc << ";";
    return m.i + 10;
}

int func22(XStruct *m) {
    std::cout << " fun22 " << m->i << " " << m->xc << ";";
    return m->i + 10;
}

void func33(int i, int j) {
    std::cout << " func33 " << i << " " << j;
}

// define my closure memory type along with the function(s) using it.

CLOSURE_MEM(XClosure2, 256)           // closure memory
CLOSURE_FUNC(doit, int, XClosure2)    // closure execution for return int
CLOSURE_FUNC(doitf, float, XClosure2) // closure execution for return float
CLOSURE_FUNC(doitv, void, XClosure2)  // closure execution for void

// a function that accepts a closure, adds additional arguments and
// then calls the function that is saved as part of the closure.
int doitargs(XClosure2 *m, unsigned char *x, int a1, int a2) {
    x = pushInt(x, a1);
    x = pushInt(x, a2);
    return CLOSURE_FTYPE(int, *m)(CLOSURE_ARGS(*m, XClosure2));
}

int _tmain(int argc, _TCHAR* argv[])
{
    int k = func2(func1(3, 23));
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    XClosure2 myClosure;
    unsigned char *x;

    x = myClosure.args;
    x = pushPtr(x, func1);
    x = pushInt(x, 4);
    x = pushInt(x, 20);
    k = func2(doit(myClosure));
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    x = myClosure.args;
    x = pushPtr(x, func1);
    x = pushInt(x, 4);
    pushInt(x, 24);               // call with second arg 24
    k = func2(doit(myClosure));   // first call with closure
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;
    pushInt(x, 14);              // call with second arg now 14 not 24
    k = func2(doit(myClosure));  // second call with closure, different value
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    k = func2(doitargs(&myClosure, x, 16, 0));  // third call: doitargs() pushes the trailing arguments itself
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    // further explorations of other argument types

    XStruct xs;

    xs.i = 8;
    xs.xc = "take 1";
    x = myClosure.args;
    x = pushPtr(x, func21);
    x = pushMem(x, &xs, sizeof(xs));
    k = func2(doit(myClosure));
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    xs.i = 11;
    xs.xc = "take 2";
    x = myClosure.args;
    x = pushPtr(x, func22);
    x = pushPtr(x, &xs);
    k = func2(doit(myClosure));
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    x = myClosure.args;
    x = pushPtr(x, func3);
    x = pushFloat(x, 4.0);

    float dof = func4(doitf(myClosure));
    std::cout << " main (" << __LINE__ << ") " << dof << std::endl;

    x = myClosure.args;
    x = pushPtr(x, func33);
    x = pushInt(x, 6);
    x = pushInt(x, 26);
    doitv(myClosure);
    std::cout << " main (" << __LINE__ << ") " << std::endl;

    return 0;
}

Test output

The output of this example program. The number in parentheses is the line number in main where the function call is made.

 func1 3 23 func2 5 main (118) 8
 func1 4 20 func2 6 main (128) 9
 func1 4 24 func2 6 main (135) 9
 func1 4 14 func2 6 main (138) 9
 func1 4 16 func2 6 main (141) 9
 fun21 8 take 1; func2 18 main (153) 21
 fun22 11 take 2; func2 21 main (161) 24
 func3 4 func4 6 main (168) 9
 func33 6 26 main (175)

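【Comments】:

【Solution 4】:

GCC provides an extension for defining nested functions. Although this is not ISO standard C, it may be of some interest, since it makes it quite convenient to answer the question. In short, a nested function can access the local variables of its parent function, and the parent function can also return pointers to it.

Here is a short, self-explanatory example: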

#include <stdio.h>

typedef int (*two_var_func) (int, int);
typedef int (*one_var_func) (int);

int add_int (int a, int b) {
    return a+b;
}

one_var_func partial (two_var_func f, int a) {
    int g (int b) {
        return f (a, b);
    }
    return g;
}

int main (void) {
    int a = 1;
    int b = 2;
    printf ("%d\n", add_int (a, b));
    printf ("%d\n", partial (add_int, a) (b));
}

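This construct has a limitation, however. If you keep a pointer to the resulting function, as in

one_var_func u = partial (add_int, a);

the call u(0) may lead to unexpected behavior, because the variable a that u reads is destroyed once partial has returned. Note that the direct call partial (add_int, a) (b) in the example above also runs g after partial has returned, so it is subject to the same caveat and may only appear to work.

See this section of GCC's documentation.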

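【Comments】:

From the manual (under the link you provided): "If you try to call the nested function through its address after the containing function exits, all hell breaks loose."
If you are already restricting yourself to GCC, you can use statement expressions to postpone the hell until the calling function exits (i.e. it will work for everything except asynchronous callbacks): gist.github.com/a3f/2729c1248d0f2ee39b4a

【Solution 5】:

I found a paper by Laurent Dami that discusses currying in C/C++/Objective-C: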

More Functional Reusability in C/C++/Objective-c with Curried Functions

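Of interest is how it is implemented in C:

"Our current implementation uses existing C constructs to add the currying mechanism. This was much easier to do than modifying the compiler, and is sufficient to prove the interest of currying. This approach has two drawbacks, however. First, curried functions cannot be type-checked, and therefore require careful use in order to avoid errors. Second, the curry function cannot know the size of its arguments, and counts them as if they were all the size of an integer."

The paper does not include the implementation of curry(), but you can imagine how it could be implemented using function pointers and variadic functions.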
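
Since the paper's curry() is not reproduced here, the following is only a guess at the general shape: a variadic curry that records int-sized arguments next to a function pointer, plus a variadic curry_call that supplies the rest. The names curried_t, curry, curry_call, sub2 and mad3 are hypothetical, and the sketch handles at most three int arguments in total. It exhibits both drawbacks named in the quote: nothing is type-checked, and every argument is assumed to be an int.

```c
#include <assert.h>
#include <stdarg.h>

#define CURRY_MAX_ARGS 3

typedef struct {
    void (*fn)(void);             /* target, stored as a generic pointer */
    int    total;                 /* number of int arguments fn expects */
    int    bound;                 /* how many of them are already supplied */
    int    args[CURRY_MAX_ARGS];
} curried_t;

/* Bind the first `nbound` int arguments of a function taking `total` ints. */
curried_t curry(void (*fn)(void), int total, int nbound, ...)
{
    curried_t c = { fn, total, nbound, {0} };
    va_list ap;
    assert(total >= 0 && total <= CURRY_MAX_ARGS);
    assert(nbound >= 0 && nbound <= total);
    va_start(ap, nbound);
    for (int i = 0; i < nbound; i++)
        c.args[i] = va_arg(ap, int);
    va_end(ap);
    return c;
}

/* Supply the remaining (total - bound) ints and perform the call. */
int curry_call(const curried_t *c, ...)
{
    int a[CURRY_MAX_ARGS] = {0};
    va_list ap;
    for (int i = 0; i < c->bound; i++)
        a[i] = c->args[i];
    va_start(ap, c);
    for (int i = c->bound; i < c->total; i++)
        a[i] = va_arg(ap, int);
    va_end(ap);
    switch (c->total) {           /* cast back to the real type before calling */
    case 0:  return ((int (*)(void))c->fn)();
    case 1:  return ((int (*)(int))c->fn)(a[0]);
    case 2:  return ((int (*)(int, int))c->fn)(a[0], a[1]);
    default: return ((int (*)(int, int, int))c->fn)(a[0], a[1], a[2]);
    }
}

/* Example target functions. */
int sub2(int a, int b) { return a - b; }
int mad3(int a, int b, int c) { return a * b + c; }
```

For example, curried_t from10 = curry((void (*)(void))sub2, 2, 1, 10); followed by curry_call(&from10, 3) computes sub2(10, 3). Passing the wrong number or type of arguments compiles without complaint and misbehaves at runtime, which is exactly the type-checking drawback the paper concedes.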

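【Comments】:

+1 Great find. I like "Although we did not do extensive testing, we can estimate that a curried function call is about 60 times slower than a normal function call."
(I like it because sometimes you need something very badly, and a solution that runs only 60 times slower is far better than no solution at all.)
Thanks, I found the link on the Wikipedia page "Currying": en.wikipedia.org/wiki/Curried_function
When I read the question, before scrolling down, the first thing I thought of was variadic functions. Glad to see my hunch was a decent one.
Sadly, the paper could not be downloaded in 2021.

【Solution 6】:

Here is my first guess (probably not the best solution).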

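The curry function could allocate some memory from the heap and place the argument values into that heap-allocated memory. The trick is then for the returned function to know that it should read its arguments from that heap-allocated memory. If there is only one instance of the returned function, a pointer to those arguments can be stored in a singleton/global. Otherwise, if there is more than one instance of the returned function, then I think curry needs to create each instance of the returned function in heap-allocated memory (by writing opcodes such as "get pointer to arguments", "push arguments", and "call the other function" into the heap-allocated memory). In that case you need to make sure the allocated memory is executable, and perhaps (I don't know) even be wary of antivirus programs.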
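
The single-instance case described above can be sketched in standard C: the bound argument lives in heap memory, a global pointer records where, and one fixed wrapper function reads it back. The names binding_t, g_binding, curried_fn, curry_single and sub are hypothetical. The limitation is the one the answer points out: because there is only one wrapper, a second curry_single call replaces the first binding.

```c
#include <assert.h>
#include <stdlib.h>

typedef int (*two_arg_fn)(int, int);
typedef int (*one_arg_fn)(int);

/* Heap-allocated record of what has been bound. */
typedef struct {
    two_arg_fn fn;
    int        first;
} binding_t;

static binding_t *g_binding;      /* the singleton: at most one live binding */

/* The one and only "returned function". It reads its hidden argument
 * from the heap record rather than from its parameter list. */
static int curried_fn(int second)
{
    assert(g_binding != NULL);
    return g_binding->fn(g_binding->first, second);
}

/* Bind `first` to `fn`. Returns NULL if allocation fails; any earlier
 * binding is released and replaced. */
one_arg_fn curry_single(two_arg_fn fn, int first)
{
    binding_t *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->fn = fn;
    b->first = first;
    free(g_binding);
    g_binding = b;
    return curried_fn;
}

/* Example target function. */
int sub(int a, int b) { return a - b; }
```

Every pointer returned by curry_single is the same function, so this only suits callers that need one curried function at a time; supporting several independent instances is where the runtime-generated trampolines from the other answers come in.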

【Comments】: