【Question title】: CL_INVALID_KERNEL_ARGS in JOCL (a Java Binding to OpenCL)
【Posted】: 2015-10-28 12:22:24
【Question】:

Has anyone run into this error when doing matrix multiplication in JOCL?

Exception in thread "main" org.jocl.CLException: CL_INVALID_KERNEL_ARGS
at org.jocl.CL.checkResult(CL.java:787)
at org.jocl.CL.clEnqueueNDRangeKernel(CL.java:20802)
at org.jocl.samples.JOCLSample.main(JOCLSample.java:147)

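I edited their HelloJOCL.java sample, together with matrixMul.cl (the kernel code), to compute a matrix multiplication. These are the kernel arguments that cause the error: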

    // Create the kernel
    cl_kernel kernel = clCreateKernel(program, "matrixMul", null);

    long time = nanoTime();
    // Set the arguments for the kernel
    clSetKernelArg(kernel, 0,
        Sizeof.cl_mem, Pointer.to(memObjects[0]));
    clSetKernelArg(kernel, 1,
        Sizeof.cl_mem, Pointer.to(memObjects[1]));
    clSetKernelArg(kernel, 2,
        Sizeof.cl_mem, Pointer.to(memObjects[2]));

The work-item dimension code:

    // Set the work-item dimensions
    long global_work_size[] = new long[]{n};
    long local_work_size[] = new long[]{1};

    // Execute the kernel
    clEnqueueNDRangeKernel(commandQueue, kernel, 1, null,
        global_work_size, null, 0, null, null);

And the kernel code:

private static String programSource =
        "__kernel void "+
                "matrixMul(__global float* C,"+ 
                "          __global float* A,"+ 
                "          __global float* B,"+ 
                "          int wA, int wB)"+
                "{"+
                   "int x = get_global_id(0);"+ 
                   "int y = get_global_id(1);"+

                   "float value = 0;"+
                   "for (int k = 0; k < wA; ++k)"+
                   "{"+
                   "   float elementA = A[y * wA + k];"+
                   "   float elementB = B[k * wB + x];"+
                   "   value += elementA * elementB;"+
                   "}"+
                  "C[y * wA + x] = value;"+
                "}";

【Comments】:

    Tags: java opencl gpgpu matrix-multiplication jocl


    【Answer 1】:

    The kernel function is defined as

    __kernel void matrixMul(__global float* C,
                            __global float* A,
                            __global float* B,
                            int wA, int wB)
    

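    It therefore expects five arguments. You only supply the first three, namely the memory objects that represent the float* values. To launch this kernel, you have to pass in values for all of the arguments. In your case, this could look roughly like this: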

    int a=0;
    clSetKernelArg(kernel, a++, 
        Sizeof.cl_mem, Pointer.to(memObjects[0]));
    clSetKernelArg(kernel, a++, 
        Sizeof.cl_mem, Pointer.to(memObjects[1]));
    clSetKernelArg(kernel, a++, 
        Sizeof.cl_mem, Pointer.to(memObjects[2]));
    
    // These have been missing:
    clSetKernelArg(kernel, a++, 
        Sizeof.cl_int, Pointer.to(new int[]{ wA }));
    clSetKernelArg(kernel, a++, 
        Sizeof.cl_int, Pointer.to(new int[]{ wB }));
    
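To make the role of the two missing arguments concrete, here is a minimal CPU-only sketch that mirrors the posted kernel in plain Java. It is not JOCL code and not part of the original answer. The class name `KernelArgsSketch`, the `run` helper, and the row count `hA` are hypothetical names introduced for illustration. It assumes row-major square matrices and a two-dimensional global range, since the kernel reads both get_global_id(0) and get_global_id(1).

```java
import java.util.Arrays;

class KernelArgsSketch {
    // CPU stand-in for the posted kernel: the same five parameters in the
    // same order, plus x and y in place of get_global_id(0) and get_global_id(1).
    static void matrixMul(float[] C, float[] A, float[] B,
                          int wA, int wB, int x, int y) {
        float value = 0;
        for (int k = 0; k < wA; ++k) {
            // wA is the row stride of A, wB is the row stride of B.
            value += A[y * wA + k] * B[k * wB + x];
        }
        C[y * wA + x] = value; // same index expression as the posted kernel
    }

    // Calls the stand-in once per (x, y), like a two-dimensional global range.
    // hA (number of rows of A) is a hypothetical parameter for this sketch.
    static float[] run(float[] A, float[] B, int hA, int wA, int wB) {
        float[] C = new float[hA * wB];
        for (int y = 0; y < hA; y++) {
            for (int x = 0; x < wB; x++) {
                matrixMul(C, A, B, wA, wB, x, y);
            }
        }
        return C;
    }

    public static void main(String[] args) {
        float[] A = {1, 2, 3, 4}; // 2x2, row-major
        float[] B = {5, 6, 7, 8}; // 2x2, row-major
        // prints [19.0, 22.0, 43.0, 50.0]
        System.out.println(Arrays.toString(run(A, B, 2, 2, 2)));
    }
}
```

Without wA and wB the loop bound and both index expressions are undefined, which is why the kernel cannot be enqueued until all five arguments have been set.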

    【Comments】:

    • Thanks @Marco13. The program runs now! :-)
    【Answer 2】:

    Your kernel code shows 5 input parameters: C, A, B, wA, and wB. But I only see 3 clSetKernelArg calls listed here.
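A mismatch like this can be spotted before the kernel is enqueued. The following is a rough sketch, using only the Java standard library, that counts the parameters declared for a kernel in its source string. The class `KernelParamCount` and the method `countParams` are hypothetical helpers, not part of JOCL, and the parsing is deliberately naive: it assumes the parameter list contains no nested parentheses and that commas appear only as separators.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KernelParamCount {
    // Counts the parameters declared for the named __kernel function.
    // Naive text matching, not a real OpenCL C parser.
    static int countParams(String source, String kernelName) {
        Pattern p = Pattern.compile(
            "__kernel\\s+void\\s+" + Pattern.quote(kernelName) + "\\s*\\(([^)]*)\\)");
        Matcher m = p.matcher(source);
        if (!m.find()) {
            throw new IllegalArgumentException("kernel not found: " + kernelName);
        }
        String params = m.group(1).trim();
        return params.isEmpty() ? 0 : params.split(",").length;
    }

    public static void main(String[] args) {
        String src = "__kernel void matrixMul(__global float* C,"
                   + " __global float* A, __global float* B,"
                   + " int wA, int wB) { }";
        int declared = countParams(src, "matrixMul");
        int set = 3; // number of clSetKernelArg calls in the question
        // prints declared=5, set=3
        System.out.println("declared=" + declared + ", set=" + set);
    }
}
```

Comparing the declared count with the number of clSetKernelArg calls makes the two missing arguments obvious.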

    【Comments】:
