OpenCL enqueueNDRangeKernel causes Access Violation error

【Question Title】: OpenCL enqueueNDRangeKernel causes Access Violation error
【Posted】: 2011-12-19 20:35:15
【Question】:

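Every kernel I try to build keeps failing with an Access Violation error. Other kernels that I took from the book seem to work fine.

https://github.com/ssarangi/VideoCL - this is where the code is located.

Something seems to be missing here. Could someone help me with this?

Thanks a lot.

[James] - Thanks for the suggestion, you are right. I am doing this on Win 7 with an AMD Redwood card. I have the Catalyst 11.7 driver with AMD APP SDK 2.5. I am posting the code below.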

#include <iostream>
#include "bmpfuncs.h"

#include "CLManager.h"

void main()
{
    float theta = 3.14159f/6.0f;
    int W ;
    int H ;

    const char* inputFile = "input.bmp";
    const char* outputFile = "output.bmp";

    float* ip = readImage(inputFile, &W, &H);
    float *op = new float[W*H];

    //We assume that the input image is the array “ip”
    //and the angle of rotation is theta
    float cos_theta = cos(theta);
    float sin_theta = sin(theta);

    try
    {
        CLManager* clMgr = new CLManager();

        // Build the Source
        unsigned int pgmID = clMgr->buildSource("rotation.cl");

        // Create the kernel
        cl::Kernel* kernel = clMgr->makeKernel(pgmID, "img_rotate");

        // Create the memory Buffers
        cl::Buffer* clIp = clMgr->createBuffer(CL_MEM_READ_ONLY, W*H*sizeof(float));
        cl::Buffer* clOp = clMgr->createBuffer(CL_MEM_READ_WRITE, W*H*sizeof(float));

        // Get the command Queue
        cl::CommandQueue* queue = clMgr->getCmdQueue();
        queue->enqueueWriteBuffer(*clIp, CL_TRUE, 0, W*H*sizeof(float), ip);

        // Set the arguments to the kernel
        kernel->setArg(0, clOp);
        kernel->setArg(1, clIp);
        kernel->setArg(2, W);
        kernel->setArg(3, H);
        kernel->setArg(4, sin_theta);
        kernel->setArg(5, cos_theta);

        // Run the kernel on specific NDRange
        cl::NDRange globalws(W, H);


        queue->enqueueNDRangeKernel(*kernel, cl::NullRange, globalws, cl::NullRange);

        queue->enqueueReadBuffer(*clOp, CL_TRUE, 0, W*H*sizeof(float), op);

        storeImage(op, outputFile, H, W, inputFile);
    }
    catch(cl::Error error)
    {
        std::cout << error.what() << "(" << error.err() << ")" << std::endl;
    }
}

I get the error on the queue->enqueueNDRangeKernel line. I store the queue and the kernel in a class.

CLManager::CLManager()
    : m_programIDs(-1)
{
    // Initialize the Platform
    cl::Platform::get(&m_platforms);

    // Create a Context
    cl_context_properties cps[3] = {
        CL_CONTEXT_PLATFORM,
        (cl_context_properties)(m_platforms[0])(),
        0
    };

    m_context = cl::Context(CL_DEVICE_TYPE_GPU, cps);

    // Get a list of devices on this platform
    m_devices = m_context.getInfo<CL_CONTEXT_DEVICES>();

    cl_int err;

    m_queue = new cl::CommandQueue(m_context, m_devices[0], 0, &err);
}


cl::Kernel* CLManager::makeKernel(unsigned int programID, std::string kernelName)
{
    cl::CommandQueue queue = cl::CommandQueue(m_context, m_devices[0]);

    cl::Kernel* kernel = new cl::Kernel(*(m_programs[programID]), kernelName.c_str());

    m_kernels.push_back(kernel);

    return kernel;
}

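【Comments】:

Hi Sarangi. You have not given enough information here to get serious help. You should tell us your platform and CL implementation, and post the offending code, including what you have already tried in order to get things working yourself. Linking to an external repository is a bad idea, because the link may stop working in the future.

Tags: opencl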

【Answer 1】:

I checked your code, although I am on Linux. At runtime I get error -38, which means CL_INVALID_MEM_OBJECT. So I went and checked your buffers.

cl::Buffer* clIp = clMgr->createBuffer(CL_MEM_READ_ONLY, W*H*sizeof(float));
cl::Buffer* clOp = clMgr->createBuffer(CL_MEM_READ_WRITE, W*H*sizeof(float));

Then you pass the buffers as pointers:

kernel->setArg(0, clOp);
kernel->setArg(1, clIp);

But setArg expects a value, so the buffer pointers should be dereferenced:

kernel->setArg(0, *clOp);
kernel->setArg(1, *clIp);
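
To illustrate why the pointer version compiles yet fails at runtime, here is a minimal sketch that does not use OpenCL at all. It assumes that a templated, by-value setter such as setArg forwards sizeof(T) bytes of whatever it is given. FakeMem, FakeBuffer, RecordedArg, recordArg and carriesHandle are hypothetical names invented for this illustration, not part of the OpenCL C++ wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for illustration only; these are NOT the real OpenCL types.
using FakeMem = void*;            // plays the role of the opaque cl_mem handle

struct FakeBuffer {               // plays the role of the cl::Buffer wrapper
    FakeMem handle;
};

// What a runtime would see for one argument: a byte count and the bytes themselves.
struct RecordedArg {
    std::size_t size;
    void* firstWord;              // first pointer-sized chunk of the forwarded bytes
};

// Mimics a by-value templated setter: forwards sizeof(T) bytes starting at &value.
template <typename T>
RecordedArg recordArg(T value) {
    static_assert(sizeof(T) >= sizeof(void*),
                  "this sketch only handles pointer-sized or larger arguments");
    RecordedArg arg{sizeof(T), nullptr};
    std::memcpy(&arg.firstWord, &value, sizeof(void*));
    return arg;
}

// True when the forwarded bytes are exactly the handle the runtime expects.
inline bool carriesHandle(const RecordedArg& arg, FakeMem expected) {
    return arg.size == sizeof(FakeMem) && arg.firstWord == expected;
}

// Usage: the dereferenced wrapper carries the handle, the pointer does not.
inline void demoSetArg() {
    int storage = 0;                        // something for the fake handle to point at
    FakeBuffer buf{&storage};
    FakeBuffer* ptr = &buf;
    assert(carriesHandle(recordArg(*ptr), buf.handle));   // like setArg(0, *clOp)
    assert(!carriesHandle(recordArg(ptr), buf.handle));   // like setArg(0, clOp)
}
```

In this sketch both calls forward the same number of bytes, so a size check alone would not catch the mistake. What differs is the content: the pointer version hands over the address of the wrapper object instead of the handle, which would be consistent with the runtime rejecting it as an invalid memory object.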

After these changes, the cat rotates ;)
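
As a side note, the catch block in the question prints only the numeric code from error.err(). A small lookup like the sketch below could make such output easier to read. clErrorName is a hypothetical helper, and it lists only a handful of codes; the numeric values are the ones defined in the standard OpenCL header.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps a few OpenCL error codes to their symbolic names.
// Values match the definitions in cl.h; this is a small subset, not the full list.
inline std::string clErrorName(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -1:  return "CL_DEVICE_NOT_FOUND";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -30: return "CL_INVALID_VALUE";
        case -36: return "CL_INVALID_COMMAND_QUEUE";
        case -38: return "CL_INVALID_MEM_OBJECT";
        case -48: return "CL_INVALID_KERNEL";
        case -50: return "CL_INVALID_ARG_VALUE";
        case -51: return "CL_INVALID_ARG_SIZE";
        case -52: return "CL_INVALID_KERNEL_ARGS";
        case -54: return "CL_INVALID_WORK_GROUP_SIZE";
        default:  return "unknown OpenCL error (" + std::to_string(code) + ")";
    }
}

// Quick self-check for the code discussed in this answer.
inline void checkKnownCodes() {
    assert(clErrorName(-38) == "CL_INVALID_MEM_OBJECT");
    assert(clErrorName(0) == "CL_SUCCESS");
}
```

In the question's catch block this would be used as clErrorName(error.err()) next to error.what().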

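【Comments】:

The array-addition example probably works fine with the code as shown, because there you pass arrays (which are really pointers to the first value) as kernel arguments.
Thanks a lot. It was silly of me to overlook this. That did indeed fix the problem. Thanks again.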