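Passing a C++ std::vector to a numpy array in Python: answers

【Question title】: Passing a C++ std::Vector to numpy array in Python
【Posted】: 2013-09-13 07:23:34
【Question】:

I am trying to pass a vector of doubles that I generate in my C++ code to a Python numpy array. I want to do some downstream processing in Python and use some Python facilities once the numpy array is populated. One of the most important things I want to do is plot the data, and C++ is a bit clumsy for that. I also want to be able to leverage Python's statistical tools.

I am not very clear on how to go about it, though. I spent a lot of time going through the Python C API documentation and came across a function, PyArray_SimpleNewFromData, that apparently can do the trick. I am still unclear about the overall setup of the code, so I am building some very simple test cases to help me understand the process. I generated the following code as a standalone Empty project in Visual Studio Express 2012. I call this file Project1: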

#include <Python.h>
#include "C:/Python27/Lib/site-packages/numpy/core/include/numpy/arrayobject.h"

PyObject * testCreatArray()
{
    float fArray[5] = {0,1,2,3,4};
    npy_intp m = 5;
    PyObject * c = PyArray_SimpleNewFromData(1,&m,PyArray_FLOAT,fArray);
    return c; 
}

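My goal is to be able to read the PyObject in Python. I am stuck because I do not know how to reference this module from Python, in particular how to import this project: I tried `import Project1` from the project path in Python, but it failed. Once I understand this basic case, my goal is to find a way to pass the vector container that I compute in my main function to Python. I am not sure how to do that either.

I would appreciate it if any experts could help me with this, or post a simple, self-contained example of code that reads from a simple C++ vector and populates a numpy array. Many thanks.

【Comments】:

It is not too complicated; it mostly takes a lot of boilerplate code. You need to define a new Python module in C and register a new Python method with it, also in C, which calls your function above. The "Extending Python with C or C++" guide in the Python documentation is a good starting point.
Thanks a lot. I would appreciate it if you could post a simple self-contained example. It would help me a lot. Thanks.
I found this related question that may help you out, but the most straightforward approach is probably to use Cython...

Tags: c++ arrays vector numpy

【Solution 1】:

I am not a cpp-hero, but I wanted to provide my solution: two template functions, one for 1D and one for 2D vectors. Usage is a one-liner, and because both the 1D and the 2D case are templated, the compiler picks the correct version for the shape of your vector. In the 2D case, a string is thrown if the shape is irregular. The routine copies the data here, but it can easily be modified to take the address of the first element of the input vector so that the array is just a "view".

Usage looks like this: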

// Random data
vector<float> some_vector_1D(3,1.f); // 3 entries set to 1
vector< vector<float> > some_vector_2D(3,vector<float>(3,1.f)); // 3 subvectors with 1

// Convert vectors to numpy arrays
PyObject* np_vec_1D = (PyObject*) vector_to_nparray(some_vector_1D);
PyObject* np_vec_2D = (PyObject*) vector_to_nparray(some_vector_2D);

You can also change the type of the numpy array through the optional argument. The template functions are:

#include <Python.h>
#include <numpy/arrayobject.h>
#include <algorithm>  // std::copy
#include <string>
#include <vector>
using namespace std;

/** Convert a C++ 2D vector into a numpy array
 *
 * @param const vector< vector<T> >& vec : 2D vector data
 * @param int type_num : numpy type number; it must describe T
 *                       (e.g. NPY_FLOAT for float, NPY_DOUBLE for double)
 * @return PyArrayObject* array : converted numpy array
 *
 * Transforms an arbitrary 2D C++ vector into a numpy array. Throws if the
 * shape is irregular, i.e. if the rows do not all have the same length.
 *
 * Warning: this routine makes a copy of the memory!
 */
template<typename T>
static PyArrayObject* vector_to_nparray(const vector< vector<T> >& vec, int type_num = NPY_FLOAT){

   // rows not empty
   if( !vec.empty() ){

      // column not empty
      if( !vec[0].empty() ){

        size_t nRows = vec.size();
        size_t nCols = vec[0].size();
        npy_intp dims[2] = {(npy_intp) nRows, (npy_intp) nCols};
        PyArrayObject* vec_array = (PyArrayObject *) PyArray_SimpleNew(2, dims, type_num);

        T *vec_array_pointer = (T*) PyArray_DATA(vec_array);

        // copy the vector row by row into the row-major numpy buffer
        for (size_t iRow=0; iRow < vec.size(); ++iRow){

          if( vec[iRow].size() != nCols){
             Py_DECREF(vec_array); // release the partially filled array
             throw(string("Can not convert vector<vector<T>> to np.array, since c++ matrix shape is not uniform."));
          }

          copy(vec[iRow].begin(),vec[iRow].end(),vec_array_pointer+iRow*nCols);
        }

        return vec_array;

     // Empty columns
     } else {
        npy_intp dims[2] = {(npy_intp) vec.size(), 0};
        return (PyArrayObject*) PyArray_ZEROS(2, dims, type_num, 0);
     }


   // no data at all
   } else {
      npy_intp dims[2] = {0, 0};
      return (PyArrayObject*) PyArray_ZEROS(2, dims, type_num, 0);
   }

}


/** Convert a C++ vector into a numpy array
 *
 * @param const vector<T>& vec : 1D vector data
 * @param int type_num : numpy type number; it must describe T
 *                       (e.g. NPY_FLOAT for float, NPY_DOUBLE for double)
 * @return PyArrayObject* array : converted numpy array
 *
 * Transforms an arbitrary 1D C++ vector into a numpy array. An empty vector
 * yields an empty array.
 *
 * Warning: this routine makes a copy of the memory!
 */
template<typename T>
static PyArrayObject* vector_to_nparray(const vector<T>& vec, int type_num = NPY_FLOAT){

   // rows not empty
   if( !vec.empty() ){

       size_t nRows = vec.size();
       npy_intp dims[1] = {(npy_intp) nRows};

       PyArrayObject* vec_array = (PyArrayObject *) PyArray_SimpleNew(1, dims, type_num);
       T *vec_array_pointer = (T*) PyArray_DATA(vec_array);

       copy(vec.begin(),vec.end(),vec_array_pointer);
       return vec_array;

   // no data at all
   } else {
      npy_intp dims[1] = {0};
      return (PyArrayObject*) PyArray_ZEROS(1, dims, type_num, 0);
   }

}

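【Comments】:

You are my cpp hero xxx
Since numpy does not support jagged arrays, how would one convert a list of jagged 2D arrays/vectors to a Python list?

【Solution 2】:

Since there is no answer that is really helpful to people who might be looking for this sort of thing, I figured I would put up a simple solution.

First, you need to create a Python extension module in C++. This is easy enough to do and is all covered in the Python C-API documentation, so I am not going to go into it.

Converting a C++ std::vector to a numpy array is then very simple. First you need to include the numpy array header: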

#include <numpy/arrayobject.h>

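and in your initialization function you need to call import_array():

// Python 2 module initialization; Python 3 uses PyModule_Create and PyInit_<name> instead
PyMODINIT_FUNC
inittestFunction(void){
   (void) Py_InitModule("testFunction", testFunctionMethods);
   import_array();
}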

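Now you can use the array functions that numpy provides. The one you want, as the OP said a few years ago, is PyArray_SimpleNewFromData, and it is very simple to use. All you need is an array of type npy_intp that holds the shape of the array to be created. Make sure it matches your data: use testVector.size() for a 1D vector, and for several dimensions keep the data in one flat vector and list every dimension in dims. A std::vector stores its elements contiguously (unless it is std::vector<bool>), but a vector of vectors does not: each inner vector has its own allocation, so data() on the outer vector points at vector objects rather than at the numbers.

//create testVector as one flat, row-major buffer with data initialised to 0;
//element (i, j) lives at testVector[i * height + j]
std::vector<uint16_t> testVector(width * height, 0);
//create shape for numpy array
npy_intp dims[2] = {(npy_intp)width, (npy_intp)height};
//convert testVector to a numpy array; the data is NOT copied,
//so testVector must outlive the array
PyArrayObject* numpyArray = (PyArrayObject*)PyArray_SimpleNewFromData(2, dims, NPY_UINT16, (void*)testVector.data());

Going through the parameters: 2 is the number of dimensions in the array. dims is the shape of the array and must be of type npy_intp. NPY_UINT16 is the data type the array will have in Python. testVector.data() supplies the array's data; cast it to void* or to a pointer of the same data type as the vector. The function returns a PyObject*; cast it to PyArrayObject* if you want to use numpy's array macros on it.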
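If the data already lives in a nested std::vector<std::vector<T>>, one way to obtain a single contiguous block is to copy it row by row into a flat vector first. The following is only a minimal sketch under that assumption; flatten_row_major is a hypothetical helper name, not part of numpy or of the answer above, and it uses only the C++ standard library. The data() pointer of the returned vector is the kind of pointer that PyArray_SimpleNewFromData expects.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (an assumption for illustration): copy a nested vector
// into one contiguous, row-major buffer. Throws std::invalid_argument if the
// rows do not all have the same length.
template <typename T>
std::vector<T> flatten_row_major(const std::vector<std::vector<T>>& rows) {
    // The number of columns is taken from the first row (0 if there are no rows).
    const std::size_t n_cols = rows.empty() ? 0 : rows[0].size();

    std::vector<T> flat;
    flat.reserve(rows.size() * n_cols);

    for (const std::vector<T>& row : rows) {
        if (row.size() != n_cols) {
            throw std::invalid_argument("rows have different lengths");
        }
        // Append this row directly after the previous one.
        flat.insert(flat.end(), row.begin(), row.end());
    }

    // Every element has been copied exactly once.
    assert(flat.size() == rows.size() * n_cols);
    return flat;
}
```

With this layout, element (i, j) of the nested vector ends up at index i * n_cols + j of the flat vector, which matches the row-major order numpy uses by default. As with any buffer handed to PyArray_SimpleNewFromData, the flat vector would have to stay alive for as long as the array uses it.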

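I hope this helps anyone else who may need this functionality.

(Also, if you do not need pure speed, I would advise avoiding the C-API: it causes quite a few problems, and cython or swig are probably still your best choices. There is also ctypes, which can be quite helpful.)

【Comments】:

Hello, I wrote a simple test demo based on what you provided. The values of the numpyArray returned to Python are not equal to the values I assigned. Judging by the values returned in Python, I suspect there is an error in the data conversion. I checked and tried different data types, but ran into the same problem. Could you please give some advice on this?

【Solution 3】:

I came across your post while trying to do something very similar. I was able to cobble together a solution, the entirety of which is on my Github. It makes two C++ vectors, converts them to Python tuples, passes them to Python, converts them to NumPy arrays, and then plots them using Matplotlib.

Much of this code is from the Python documentation.

Here are some of the important bits from the .cpp file: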

 //Make some vectors containing the data
 static const double xarr[] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14};
 std::vector<double> xvec (xarr, xarr + sizeof(xarr) / sizeof(xarr[0]) );
 static const double yarr[] = {0,0,1,1,0,0,2,2,0,0,1,1,0,0};
 std::vector<double> yvec (yarr, yarr + sizeof(yarr) / sizeof(yarr[0]) );

 //Transfer the C++ vector to a python tuple
 pXVec = PyTuple_New(xvec.size()); 
 for (i = 0; i < xvec.size(); ++i) {
      pValue = PyFloat_FromDouble(xvec[i]);
      if (!pValue) {
           Py_DECREF(pXVec);
           Py_DECREF(pModule);
           fprintf(stderr, "Cannot convert array value\n");
           return 1;
      }
      PyTuple_SetItem(pXVec, i, pValue);
 }

 //Transfer the other C++ vector to a python tuple
 pYVec = PyTuple_New(yvec.size()); 
 for (i = 0; i < yvec.size(); ++i) {
      pValue = PyFloat_FromDouble(yvec[i]);
      if (!pValue) {
           Py_DECREF(pYVec);
           Py_DECREF(pModule);
           fprintf(stderr, "Cannot convert array value\n");
           return 1;
      }
      PyTuple_SetItem(pYVec, i, pValue); //
 }

 //Set the argument tuple to contain the two input tuples
 PyTuple_SetItem(pArgTuple, 0, pXVec);
 PyTuple_SetItem(pArgTuple, 1, pYVec);

 //Call the python function
 pValue = PyObject_CallObject(pFunc, pArgTuple);

And the Python code:

def plotStdVectors(x, y):
    import numpy as np
    import matplotlib.pyplot as plt
    print("Printing from Python in plotStdVectors()")
    print(x)
    print(y)
    # np.float was removed in NumPy 1.24; the builtin float means float64
    x = np.fromiter(x, dtype=float)
    y = np.fromiter(y, dtype=float)
    print(x)
    print(y)
    plt.plot(x, y)
    plt.show()
    return 0

This results in a plot that I cannot post here because of my reputation; it is posted on my blog post instead.

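【Comments】:

You should really consider using PyArray_SimpleNewFromData, as the OP proposed in the question. For larger vectors it is very important to avoid creating a Python list or tuple, because they require every element to be a Python object, which is very inefficient in both memory and CPU for large vectors.
Thanks for the advice; I will look into it.
@user4815162342 If you have a better solution, it would be best to post it as an answer.

【Solution 4】: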

  _import_array(); //this is required for numpy to create an array correctly

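Note: Numpy's extension guide uses import_array() to accomplish the same goal as my use of _import_array(). When I tried import_array(), I got an error on a Mac, so you may need to try both commands and see which one works.

By the way, you can use a C++ std::vector in the call to PyArray_SimpleNewFromData. If your std::vector is my_vector, replace fArray with &my_vector[0]. &my_vector[0] gives you a pointer to where the data in my_vector is stored.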
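The pointer expression above can be checked in isolation, without numpy. The following is a minimal sketch; contiguous_data is a hypothetical helper name introduced only for illustration. It shows that &my_vector[0] and my_vector.data() refer to the same storage, and it guards the empty case, because indexing element 0 of an empty vector is undefined behaviour.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (an assumption for illustration): return a pointer to
// the contiguous storage of a vector, or nullptr when the vector is empty.
// Not usable with std::vector<bool>, which does not store plain bool elements.
template <typename T>
T* contiguous_data(std::vector<T>& v) {
    if (v.empty()) {
        return nullptr;  // &v[0] would be undefined behaviour here
    }
    T* first = &v[0];
    // Both spellings point at the same first element.
    assert(first == v.data());
    return first;
}
```

The pointer aliases the vector's own memory: writes through it change the vector, and it becomes invalid when the vector is destroyed or reallocates (for example after push_back). An array created from it with PyArray_SimpleNewFromData would therefore only be valid while the vector stays alive and unchanged in size.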

【Comments】:

Why did you conclude that the goal is to embed the Python interpreter? It seems to me that the goal is to extend Python with a C++ module.