Efficient math ops on small arrays in Python with Cython: answers

【Question title】: Efficient math ops on small arrays in python with cython
【Posted】: 2011-07-18 13:55:38
【Question】:

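I use numexpr for fast math on large arrays, but if the array is smaller than the CPU cache, writing my code in Cython with simple array math is much faster, especially when the function is called many times.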

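The question is: how do you work with arrays in Cython? Or, more explicitly: is there a direct interface to Python's array.array type in Cython? What I would like to do is something like this (simple example):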

cpdef array[double] running_sum(array[double] arr):
    cdef int i 
    cdef int n = len(arr)
    cdef array[double] out = new_array_zeros(1.0, n)
    ... # some error checks
    out[0] = arr[0]
    for i in range(1, n):  # upper bound n (not n-1) so the last element is included
        out[i] = out[i-1] + arr[i]

    return(out)

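I first tried the Cython numpy wrapper and worked with ndarrays, but for small 1D arrays, creating them seems very costly compared to creating a C array with malloc (although memory handling then becomes a pain).

Thanks!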

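【Comments】:

You should really split this into two separate questions, since each part is distinct. That ensures answers are clearly tied to a specific question and improves readability for users who refer to this question and its answers in the future.
Thanks. Split the question into two parts; the second part: stackoverflow.com/questions/5359880/…
Yes, creating a new numpy array is slower than malloc; but do you really need to create and delete many of them? Can't you create the numpy arrays once at the start and reuse them? Also, timeit results for the above against np.cumsum might be useful (what is "small": 10, 100?)
For posterity, I answered a very similar question here. Recent Cython (0.17+) has many good features for working with arrays, numpy.ndarrays, and anything else that supports the buffer interface.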
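
The reuse suggestion in the comments can be sketched in plain Python with the standard library only. This is an illustrative sketch, not code from the thread: the function name `running_sum_into` is hypothetical, and it uses `memoryview` so that any one-dimensional buffer of C doubles (such as `array.array('d')`) can serve as input or output. The output buffer is allocated once by the caller and overwritten on every call.

```python
from array import array


def running_sum_into(src, out):
    """Write the running sum of `src` into the preallocated buffer `out`.

    Hypothetical helper: both arguments must expose a 1-D buffer of
    C doubles (format 'd'), e.g. array.array('d').
    """
    src_view = memoryview(src)
    out_view = memoryview(out)
    for view in (src_view, out_view):
        if view.ndim != 1 or view.format != 'd':
            raise TypeError("expected a 1-D buffer of doubles")
    if len(src_view) != len(out_view):
        raise ValueError("src and out must have the same length")

    total = 0.0
    for i in range(len(src_view)):
        total += src_view[i]
        out_view[i] = total  # writes through to the caller's buffer
    return out


# Allocate the output once, then reuse it across calls.
buf = array('d', [0.0]) * 3
running_sum_into(array('d', [1.0, 2.0, 3.0]), buf)
first = buf.tolist()
running_sum_into(array('d', [10.0, 20.0, 30.0]), buf)
second = buf.tolist()
```

Whether reuse pays off for a given array size is best checked by measurement, for example with the standard `timeit` module.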

Tags: python arrays performance numpy cython

【Solution 1】:

You can roll your own simple class with basic functions and checks. Here is a mockup to start from:

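from libc.stdlib cimport malloc, free

cdef class SimpleArray:  # extension types are declared with cdef, not cpdef
    cdef double *handle
    cdef public int length

    def __cinit__(self, int n):
        # __cinit__ runs exactly once, before __init__, so it is the safe place to allocate
        if n <= 0:
            raise ValueError("n must be positive")
        # malloc does not initialize the memory: contents are undefined until written
        self.handle = <double*>malloc(n * sizeof(double))
        if self.handle == NULL:
            raise MemoryError()
        self.length = n

    def __getitem__(self, int idx):
        if 0 <= idx < self.length:
            return self.handle[idx]
        raise IndexError("index out of range")

    def __dealloc__(self):
        free(self.handle)  # free(NULL) is a no-op

cpdef SimpleArray running_sum(SimpleArray arr):
    cdef int i
    cdef SimpleArray out = SimpleArray(arr.length)

    out.handle[0] = arr.handle[0]
    # start at 1 and include the last element (the original loop skipped both ends)
    for i in range(1, arr.length):
        out.handle[i] = out.handle[i-1] + arr.handle[i]
    return out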

It can be used as:

>>> import test
>>> simple = test.SimpleArray(100)
>>> del simple
>>> test.running_sum(test.SimpleArray(100))
<test.SimpleArray object at 0x1002a90b0>
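
To check what results the Cython class should produce without a build step, a pure-Python stand-in can help. The sketch below is an assumption-laden model, not part of the answer: `SimpleArrayPy` and `running_sum_py` are hypothetical names, the storage is a `ctypes` array of doubles (which, unlike `malloc`, is zero-initialized), and a `__setitem__` is added so the array can be filled from Python.

```python
import ctypes


class SimpleArrayPy:
    """Hypothetical pure-Python model of a fixed-size array of C doubles."""

    def __init__(self, n):
        if n <= 0:
            raise ValueError("n must be positive")
        # ctypes arrays are zero-initialized on creation
        self._buf = (ctypes.c_double * n)()
        self.length = n

    def _check(self, idx):
        if not 0 <= idx < self.length:
            raise IndexError("index out of range")

    def __getitem__(self, idx):
        self._check(idx)
        return self._buf[idx]

    def __setitem__(self, idx, value):
        self._check(idx)
        self._buf[idx] = value


def running_sum_py(arr):
    """Return a new SimpleArrayPy holding the running sum of `arr`."""
    out = SimpleArrayPy(arr.length)
    out[0] = arr[0]
    for i in range(1, arr.length):  # covers indices 1 .. length-1
        out[i] = out[i - 1] + arr[i]
    return out


data = SimpleArrayPy(4)
for i, value in enumerate([1.0, 2.0, 3.0, 4.0]):
    data[i] = value
result = [running_sum_py(data)[i] for i in range(4)]
```

This model is far slower than the compiled version; it is only meant as a reference for expected values and bounds-checking behavior.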

【Comments】: