Using Cython to early type class attributes

【Question title】: Using cython to early type class attributes
【Posted】: 2014-01-15 07:56:28
【Question】:

I am writing a Python class and would like to use Cython early typing to speed up execution.
When I try to compile the following with Cython, I get the error "Syntax error in C variable declaration":

import numpy as np
cimport numpy as np

class MyClass:
    def __init__( self, np.ndarray[double, ndim=1] Redges ):
        self.Redges = Redges
        cdef double self.var1

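The error concerns the syntax of the last line, the one involving self.var1. Can I not type class attributes directly? Do I always have to split this into two steps, such as: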

cdef double var1
self.var1 = var1

The full error traceback is:

test.pyx:7:24: Syntax error in C variable declaration
Traceback (most recent call last):
  File "setup.py", line 9, in <module>
    ext_modules = cythonize('test.pyx'), # accepts a glob pattern
  File "/usr/lib/python2.7/dist-packages/Cython/Build/Dependencies.py", line 713, in cythonize
    cythonize_one(*args[1:])
  File "/usr/lib/python2.7/dist-packages/Cython/Build/Dependencies.py", line 780, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: calc_iliev_sphere.pyx

【Comments】:

That is not the full traceback.

Tags: python class cython

【Solution 1】:

What you want is to define an extension type. In particular, your code should look like this:

import numpy as np
cimport numpy as np

cdef class MyClass:
    cdef double var1
    cdef np.ndarray[double, ndim=1] Redges

    def __init__( self, np.ndarray[double, ndim=1] Redges ):
        self.Redges = Redges

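Note that you cannot impose types on instance attributes in a normal class, because Python allows anyone to change them and their types. If you try to put a cdef at class level in a normal Python class, you will get a compiler error from Cython.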
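To illustrate why a plain class cannot carry a fixed attribute type, here is a minimal pure-Python sketch. The attribute values and the extra attribute name `brand_new` are made up for this illustration and are not part of the original question.

```python
# A plain Python class: nothing pins the type of an instance attribute.
class MyClass:
    def __init__(self, redges):
        self.Redges = redges
        self.var1 = 0.0  # starts out as a float


obj = MyClass([0.0, 1.0, 2.0])
type_before = type(obj.var1)

# Rebinding the attribute to a value of a different type is perfectly legal,
# so a compiler cannot assume that var1 is always a C double.
obj.var1 = "not a double anymore"
type_after = type(obj.var1)

# New attributes can also be attached at any time (hypothetical name).
obj.brand_new = 42

print(type_before.__name__, type_after.__name__)
```

A cdef class gives up this flexibility: its cdef attributes are declared up front with fixed types, which is what makes typed access possible.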

Compiling the cdef class above raises the following error:

Error compiling Cython file:
------------------------------------------------------------                       
...                                                                                
import numpy as np                                                                 
cimport numpy as np                                                                

cdef class MyClass:                                                                
    cdef double var1                                                               
    cdef np.ndarray[double, ndim=1] Redges                                         
                                   ^                                               
------------------------------------------------------------                       

test_cython.pyx:6:36: Buffer types only allowed as function local variables

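Now, this is not a syntax error. The syntax is fine. The problem is simply that you cannot have an instance attribute declared with a buffer type such as np.ndarray[double, ndim=1]. This is a limitation of Cython. In fact, if you comment out the line cdef np.ndarray[double, ndim=1] Redges, the file compiles correctly:

Code: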

import numpy as np
cimport numpy as np

cdef class MyClass:
    cdef double var1
    #cdef np.ndarray[double, ndim=1] Redges

    def __init__( self, np.ndarray[double, ndim=1] Redges ):
        self.Redges = Redges

Output:

$cython test_cython.pyx 
$

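Note: cython prints nothing, which means the file compiled successfully.

The Attributes section of the documentation I linked above explains this limitation:

The attributes of an extension type are stored directly in the object's C struct. [omitted]

Note: You can only expose simple C types, such as ints, floats, and strings, for Python access. You can also expose Python-valued attributes.

You can only expose simple C data types because the attributes are members of the struct. Allowing buffers like np.ndarray would require variable-size structs.

If you want an instance attribute that holds an np.ndarray, the best you can do is declare the attribute with the generic type object and assign the array to it: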

import numpy as np
cimport numpy as np

cdef class MyClass:
    cdef double var1
    cdef object Redges

    def __init__( self, np.ndarray[double, ndim=1] Redges ):
        self.Redges = Redges
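As a rough analogy only (this is not how Cython generates code), the following sketch uses the standard ctypes module to show that a struct holding a double and a generic Python object reference has the same size no matter how large the referenced array is. The struct name `MyClassStruct` is hypothetical and chosen for this illustration.

```python
import ctypes

import numpy as np


# Hypothetical stand-in for the C struct behind the extension type:
# one C double plus one reference to an arbitrary Python object.
class MyClassStruct(ctypes.Structure):
    _fields_ = [
        ("var1", ctypes.c_double),
        ("Redges", ctypes.py_object),
    ]


small = MyClassStruct(1.0, np.zeros(3))
large = MyClassStruct(2.0, np.zeros(1_000_000))

# The struct only stores a pointer-sized reference, so its size is fixed.
same_size = ctypes.sizeof(small) == ctypes.sizeof(large)
pointer_sized = ctypes.sizeof(ctypes.py_object) == ctypes.sizeof(ctypes.c_void_p)

print(same_size, pointer_sized)
```

The array data lives outside the struct, which is why an object-typed attribute is always acceptable while the type information about the buffer is lost.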

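However, every access to self.Redges now goes through a generic Python object, so you lose Cython's speed. If you access it many times, you can assign it to a local variable with the correct type. Here is what I mean: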

import numpy as np
cimport numpy as np

cdef class MyClass:
    cdef double var1
    cdef object Redges

    def __init__( self, np.ndarray[double, ndim=1] Redges ):
        self.Redges = Redges

    def do_stuff(self):
        cdef np.ndarray[double, ndim=1] ar
        ar = self.Redges
        ar[0] += 1
        return ar[0]

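This way, inside the do_stuff function you get all of Cython's speed by working with ar.

【Comments】:

This solution produces another error message, which I have added to the post.
It is worth noting that the memoryview syntax (double[:]) does not need to keep converting between object and its real type, so it should be faster.
This is a confusing answer; it has nothing to do with arrays having variable length. I can have my own cpp class (created on the stack) as an attribute but not a buffer, so that is not the reason, although I don't know what the reason is :)
I have not yet received a reply to my Cython question about the compilation error. Perhaps you can help.
The quoted Cython reference seems to come from the older Pyrex documentation: "Note that you can only expose simple C types, such as ints, floats and strings, for Python access. You can also expose Python-valued attributes, although read-write exposure is only possible for generic Python attributes (of type object). If the attribute is declared to be of an extension type, it must be exposed readonly."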

【Solution 2】:

@bakuriu's answer is very good; I would like to add how to keep a memory view as a class member:

import numpy as np
cimport numpy as np

cdef class MyClass:
    cdef public double var1
    cdef public np.float64_t[:] Redges

    def __init__( self, np.ndarray[double, ndim=1] Redges ):
        self.Redges = Redges

With this approach, do_stuff becomes simpler:

def do_stuff(self):
    # Thanks to the buffer protocol, this just wraps the memory view
    # in a NumPy array without copying the data
    np_redges = np.asarray(self.Redges)

    # Now np_redges is a NumPy array. Although it is not a pure C array,
    # it lets you call NumPy functions (backed by MKL if NumPy was built with it), e.g.:
    np.add(np_redges, 1.0, np_redges)
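The no-copy behaviour can be checked from plain Python. The sketch below uses Python's built-in memoryview as a stand-in for the Cython typed memoryview (an assumption made so the example runs without compiling anything); both expose their data through the buffer protocol that np.asarray consumes.

```python
import numpy as np

redges = np.array([0.0, 1.0, 2.0])

# Built-in memoryview used here as a stand-in for the Cython double[:] member.
view = memoryview(redges)

# np.asarray wraps the exported buffer instead of copying it.
np_redges = np.asarray(view)
shares = np.shares_memory(np_redges, redges)

# An in-place NumPy operation on the wrapper therefore changes the original data.
np.add(np_redges, 1.0, out=np_redges)

print(shares, redges)
```

Because the wrapper shares memory with the original array, the in-place np.add is visible through redges as well.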

【Comments】: