【Question title】:Random segmentation fault in GPU/OpenCL/OpenGL code
【Posted】:2013-05-01 03:09:29
【Question】:

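I am writing a GPU/OpenCL NBody code. I use the AMD APP SDK, with OpenGL rendering of the particle positions. When I run the code, I randomly get a segmentation fault.

In short, I have a GLWidget in which I do the OpenGL rendering. After generating the initial positions, I render them in this GLWidget. I then run the simulation, computing the next positions at each step and displaying them in the GLWidget. My problem is that sometimes, if I click the "Generate Initial Conditions" button of the parameters GUI while the simulation is running, I get a segmentation fault.

Here is the backtrace: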

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff4a46cd7 in memcpy () from /lib/libc.so.6
(gdb) bt
#0  0x00007ffff4a46cd7 in memcpy () from /lib/libc.so.6
#1  0x00007fffeda2da64 in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#2  0x00007fffedbba74a in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#3  0x00007fffedbba9af in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#4  0x00007fffed9c56e4 in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#5  0x00007fffed17371d in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#6  0x000000000040b185 in GLWidget::createVBO() ()
#7  0x000000000040b3c9 in GLWidget::draw() ()
#8  0x000000000040c36d in GLWidget::processCurrent() ()
...

Here is the createVBO routine:

void GLWidget::createVBO()
{   
  GLuint vbo;
  int memSize = sizeof(cl_double4) * 4 * Galaxy->getNumParticles();
  glGenBuffers(1, &vbo);
  glBindBuffer(GL_ARRAY_BUFFER, vbo);
  glBufferData(GL_ARRAY_BUFFER, memSize, Galaxy->pos, GL_DYNAMIC_DRAW);
}

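The segfault occurs at glBufferData(GL_ARRAY_BUFFER, memSize, Galaxy->pos, GL_DYNAMIC_DRAW);

I don't understand why this happens. When I press the "Generate IC" button, I delete the allocated Galaxy->pos array and create a new one.

Here is what I do in the "Generate IC" routine: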

  //Clean Galaxy already existing 
  if (parent->widget_2->isGalaxyExist)
  { 
    if (parent->widget_2->animation)
      parent->resetSimu();
    parent->widget_2->Galaxy->cleanup();
  }

With the cleanup routine (where I delete the pos array):

int NBody::cleanup()
{
  if (glEvent)
    clReleaseEvent(glEvent);

  // Releases OpenCL resources (Context, Memory etc.)
  cl_int status;

  if (hasRunKernel)
  {
  status = clFinish(commandQueue);
  CHECK_OPENCL_ERROR(status, "clFinish failed.(commandQueue)");

  status = clReleaseKernel(kernel);
  CHECK_OPENCL_ERROR(status, "clReleaseKernel failed.(kernel)");

  status = clReleaseProgram(program);
  CHECK_OPENCL_ERROR(status, "clReleaseProgram failed.(program)");

  status = clReleaseMemObject(currPos);
  CHECK_OPENCL_ERROR(status, "clReleaseMemObject failed.(currPos)");

  status = clReleaseMemObject(currVel);
  CHECK_OPENCL_ERROR(status, "clReleaseMemObject failed.(currVel)");

  status = clReleaseMemObject(newPos);
  CHECK_OPENCL_ERROR(status, "clReleaseMemObject failed.(newPos)");

  status = clReleaseMemObject(newVel);
  CHECK_OPENCL_ERROR(status, "clReleaseMemObject failed.(newVel)");

  status = clReleaseCommandQueue(commandQueue);
  CHECK_OPENCL_ERROR(status, "clReleaseCommandQueue failed.(commandQueue)");

  status = clReleaseContext(context);
  CHECK_OPENCL_ERROR(status, "clReleaseContext failed.(context)");

  hasRunKernel = false;
  }

  // Release program resources 
  delete [] pos;
  delete [] vel;
  delete [] initPos;
  delete [] initVel;
  delete [] devices;
  // Delete current instance
  delete this;

  return NBODY_SUCCESS;
}

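At first sight, can you see what is wrong, or give me a clue about this segfault? The most annoying part is that the error occurs randomly, not on every run.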

【Comments】:

    Tags: opengl segmentation-fault opencl


    【Answer 1】:

    Is this calculation correct?

    int memSize = sizeof(cl_double4) * 4 * Galaxy->getNumParticles();
    

    In particular the " * 4": sizeof(cl_double4) already accounts for the four elements of the vector.
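To make the size arithmetic concrete, here is a minimal sketch. It assumes that Galaxy->pos holds exactly one cl_double4 per particle, which the question does not show. `Double4`, `positionBytes` and `positionBytesAsPosted` are hypothetical names introduced only for this illustration; `Double4` stands in for cl_double4 so the snippet compiles without the OpenCL headers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cl_double4 (assumption: four packed doubles,
// 32 bytes, like the real OpenCL vector type).
struct Double4 { double s[4]; };
static_assert(sizeof(Double4) == 4 * sizeof(double),
              "Double4 must hold exactly four doubles");

// Bytes needed when the host array stores one Double4 per particle.
inline std::size_t positionBytes(std::size_t numParticles) {
    return sizeof(Double4) * numParticles;
}

// The size as computed in the question, with the extra factor of 4.
inline std::size_t positionBytesAsPosted(std::size_t numParticles) {
    return sizeof(Double4) * 4 * numParticles;
}
```

Under that assumption, the posted formula asks glBufferData to read four times as many bytes as the array holds. Reading past the end of a heap block does not always hit an unmapped page, which would be consistent with a crash that only shows up on some runs.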

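      【Answer 2】:

      A crash like this indicates an out-of-bounds access in the driver code called through the glBufferData OpenGL API function. Check that the parameters passed to glBufferData are correct, i.e. that the length glBufferData is told to read lies within the memory range passed as the data parameter.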
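One way to make that check hard to get wrong is to derive the pointer and the byte count from the same container. The sketch below is only an illustration: it assumes the positions could be kept in a std::vector instead of a raw new[] array, and `Double4`, `UploadRange`, `uploadRange` and `fitsInside` are hypothetical names that do not appear in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cl_double4 so the sketch builds without OpenCL.
struct Double4 { double s[4]; };

// Pointer and byte count that would be handed to glBufferData together.
struct UploadRange {
    const void* data;
    std::size_t bytes;
};

// Both values come from the same vector, so the length cannot exceed
// the memory that the pointer refers to.
inline UploadRange uploadRange(const std::vector<Double4>& positions) {
    return { positions.empty() ? nullptr : positions.data(),
             positions.size() * sizeof(Double4) };
}

// True if reading `bytes` from the start of the vector stays in bounds.
inline bool fitsInside(const std::vector<Double4>& positions,
                       std::size_t bytes) {
    return bytes <= positions.size() * sizeof(Double4);
}
```

With an OpenGL context current, the result `r` of `uploadRange` would then be passed as glBufferData(GL_ARRAY_BUFFER, static_cast<GLsizeiptr>(r.bytes), r.data, GL_DYNAMIC_DRAW). A check like `fitsInside` only covers the length; it cannot detect a pointer whose array has already been deleted.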
