【Question title】: How do I convert a std::vector<thrust::device_vector<int>> to int**?
【Posted】: 2021-09-03 07:31:51
【Question】:

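I'm working on an application where earlier processing has produced a (short but variable-length) std::vector of (large) thrust::device_vectors, each of the same length (though that length is also variable). I need to convert this to a raw pointer on the device so that I can pass it to a CUDA kernel.

I followed the procedure below which, as far as I can tell, should leave rawNumberSquare as a pointer on the device, with rawNumberSquare[0] and rawNumberSquare[1] holding pointers to numberSquareDevice[0][0] and numberSquareDevice[1][0] respectively. So it seems to me that rawNumberSquare[i][j] (i, j = 0, 1) are all locations allocated by this program and legal to access.

However, when the kernel tries to access those locations, the values are wrong and the program crashes with an illegal memory access.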

#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
#include <iostream>
#include <vector>
#include <thrust/device_vector.h>

__global__ void talkKernel(  int ** in,  int dimension)
{
    int index = threadIdx.x;
    for (int coord = 0; coord < dimension; ++coord)
        printf("in[%d][%d] = %d\n", coord, index, in[coord][index]);       
}

int main()
{
    //print out name of GPU in case it is helpful
    int deviceNumber;
    cudaGetDevice(&deviceNumber);
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, deviceNumber);
    std::cout << prop.name << "\n";
    //make a std::vector of std::vectors of ints
    std::vector<std::vector<int>> numberSquareOnHost{ {1,2}, {3,4} };
    //copy the values of each vector to the device
    std::vector<thrust::device_vector<int>> numberSquareDevice;
    for (auto& vector : numberSquareOnHost)
        numberSquareDevice.push_back(thrust::device_vector<int>(vector));
    //copy the raw pointers to start of the device vectors to a std::vector
    std::vector<int*> halfRawNumberSquareOnHost(2);
    for ( int i = 0; i < 2 ; ++i)
        halfRawNumberSquareOnHost[i] = (thrust::raw_pointer_cast(numberSquareOnHost[i].data()));
    //copy the raw pointers to the device
    thrust::device_vector<int*> halfRawNumberSquareOnDevice(halfRawNumberSquareOnHost);
    //get raw pointer (on the device) to the raw pointers (on the device)
    int** rawNumberSquare = thrust::raw_pointer_cast(halfRawNumberSquareOnDevice.data());
    //call the kernel
    talkKernel <<<1,2 >>> ( rawNumberSquare, 2);
    cudaDeviceSynchronize();
    //ask what's up
    std::cout << cudaGetErrorString(cudaGetLastError()) << "\n";
    return 0;

   /*output:
   * Quadro M2200
    in[0][0] = 0
    in[0][1] = 0
    in[1][0] = 0
    in[1][1] = 0
    an illegal memory access was encountered

    ...\vectorOfVectors.exe (process 6428) exited with code -1073740791.
        */
}

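I also tried variations such as using new to allocate a host array of (raw device) int* pointers instead of using std::vector<int*> halfRawNumberSquareOnHost, and using cudaMalloc to allocate the device int** rawSquareOnDevice (and filling it with @987654335@). That made no difference.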

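【Comments】:

int** implies that your int data is organized in a somewhat C-style way. Specifically, somewhere there is an array of pointers, each pointing to the start of the data of one thrust::device_vector. You will probably need to create that array yourself when you need it.
@DrewDormann I am clearly misunderstanding something, but it seems to me that after the relevant copy steps, the arrays underlying both std::vector<int*> halfRawNumberSquareOnHost and thrust::device_vector<int*> halfRawNumberSquareOnDevice(halfRawNumberSquareOnHost); are arrays of pointers, each pointing to the start of one of the original thrust::device_vectors. At least, that is what they are intended to be.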

Tags: c++ stl cuda thrust

【Solution 1】:

Your mistake is here:

halfRawNumberSquareOnHost[i] = (thrust::raw_pointer_cast(numberSquareOnHost[i].data()));

It should be:

halfRawNumberSquareOnHost[i] = (thrust::raw_pointer_cast(numberSquareDevice[i].data()));

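The first line takes a host pointer (not what you want at this point). The second takes a device pointer. In other words, you built numberSquareDevice for a reason, but the code you posted never actually uses it.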
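
The pointer-table layout that the kernel expects can be illustrated without a GPU. The following is a minimal host-only C++ sketch, not the poster's code: makeRowPointers and readElement are hypothetical names introduced here, and readElement merely stands in for the in[coord][index] access inside talkKernel. It shows that an int** is just the address of an array whose entries each point at the first element of one row.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the post): collect a pointer to the first
// element of every row. The returned table is only valid while `rows`
// and its inner vectors are neither destroyed nor resized.
std::vector<int*> makeRowPointers(std::vector<std::vector<int>>& rows)
{
    std::vector<int*> table;
    table.reserve(rows.size());
    for (auto& row : rows)
        table.push_back(row.data());
    return table;
}

// Hypothetical stand-in for the kernel body: two dereferences, first into
// the pointer table, then into the row that the chosen entry points at.
int readElement(int** in, std::size_t coord, std::size_t index)
{
    return in[coord][index];
}
```

Calling makeRowPointers on a vector holding the rows {1,2} and {3,4} and passing table.data() to readElement reads the same four values the kernel was meant to print. On the device the pattern is the same, with one extra constraint: both the table and every pointer stored in it must refer to device memory, which is why each entry has to come from numberSquareDevice[i].data() rather than numberSquareOnHost[i].data().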

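【Comments】:

Ugh, thanks! It's a little annoying that this is the fix, because I didn't actually make the same mistake in my real code; this was supposed to be the minimal failing case. But I suppose I need to look for something similar there.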