【Question title】: Errors when trying to offload to GTX-1050 with GCC 9.3 and OpenMP
【Posted】: 2021-07-21 18:02:22
【Question】:

Build log:

-------------- Clean: Release in OffloadTest (compiler: GNU GCC Compiler)---------------

Cleaned "OffloadTest - Release"

-------------- Build: Release in OffloadTest (compiler: GNU GCC Compiler)---------------

g++ -Wall -m64 -fopenmp -foffload=nvptx-none -fno-stack-protector -O2 -fopenmp -foffload=nvptx-none -fcf-protection=none -fno-stack-protector  -c /home/david/CBProjects/OffloadTest/main.cpp -o obj/Release/main.o
g++  -o bin/Release/OffloadTest obj/Release/main.o  -m64 -lgomp -s -lgomp  
/usr/bin/ld: /tmp/ccfvsLgk.crtoffloadtable.o:(.rodata+0x0): undefined reference to `__offload_func_table'
/usr/bin/ld: /tmp/ccfvsLgk.crtoffloadtable.o:(.rodata+0x8): undefined reference to `__offload_funcs_end'
/usr/bin/ld: /tmp/ccfvsLgk.crtoffloadtable.o:(.rodata+0x10): undefined reference to `__offload_var_table'
/usr/bin/ld: /tmp/ccfvsLgk.crtoffloadtable.o:(.rodata+0x18): undefined reference to `__offload_vars_end'
collect2: error: ld returned 1 exit status
Process terminated with status 1 (0 minute(s), 0 second(s))
5 error(s), 0 warning(s) (0 minute(s), 0 second(s))

I have installed the following packages (with their descriptions):

Gcc-9-offload-nvptx
    Description: The package provides offloading support for NVidia PTX. OpenMP and OpenACC programs linked with -fopenmp will by default add PTX code into the binaries, which can be offloaded to NVidia PTX capable devices if available.
Gcc-offload-nvptx
    Description: This package contains libgomp plugin for offloading to NVidia PTX. The plugin needs libcuda.so.1 shared library that has to be installed separately.
Nvptx-tools
    Description: This tool consists of nvptx-none-as: "assembler" for PTX, nvptx-none-ld: "linker" for PTX. Additionally, the following symlinks are installed: nvptx-none-ar: link to the GNU/Linux host system's ar, nvptx-none-ranlib: link to the GNU/Linux host system's ranlib

I have verified that libcuda.so.1 is in /lib/x86_64-linux-gnu

The program is simple, just a sample to help me get offloading working. It builds and runs if I remove the "target" keyword.

#include <iostream>
#include <omp.h>

using namespace std;
#define iSize 200000
long *A, *B;

int main()
{
   A = new long[iSize];
   B = new long[iSize];
   long sum = 0;
   double dStart, dEnd;
   int iNumberOfDevices = omp_get_num_devices();
   int iInitialDevice = omp_get_initial_device(); // device number for host computer
   int iDeviceNumber = omp_get_default_device();

   dStart = omp_get_wtime();
#pragma omp parallel for
   for (long i=0; i<iSize; i++)
   {
      A[i] = i;
      B[i] = i+1;
   }
#pragma omp target parallel for reduction(+:sum)
   for (long i=0; i<iSize; i++)
   {
      for (long j=0; j<iSize; j++)
      {
         sum += 3 * A[i] - B[j];
      }
   }
   dEnd = omp_get_wtime();
   double dtime = dEnd - dStart;
   cout << "Number of devices = " << iNumberOfDevices << endl;
   cout << "Device number = " << iDeviceNumber << endl;
   cout << "Initial Device number (host processor) = " << iInitialDevice << endl;
   cout << endl;
   cout << "Sum = " << sum << endl;
   cout << "Processing time = " << dtime << " Seconds" << endl;
}

Any help is appreciated.

  • David

【Comments】:

    Tags: c++ openmp linux-mint offloading


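    【Solution 1】:

    To resolve the undefined references, pass -fopenmp to the link step (and possibly -foffload=nvptx-none again, if that is not the default) instead of -lgomp (which, by the way, is duplicated on the link line).

    I also think some omp target data (or similar) directives are missing to set up the A and B arrays on the device?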
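As a rough sketch of what the missing mapping could look like, the function below takes the arrays as pointers and maps array sections of them onto the device. The name offloaded_pair_sum and its parameters are illustrative and not from the original post. The explicit map(tofrom: sum) is an assumption about the most portable way to get the reduction result back to the host.

```cpp
#include <cassert>

// Hypothetical helper (not from the original post): sums 3*a[i] - b[j] over
// all (i, j) pairs. When built with -fopenmp and offloading support, the loop
// nest is offloaded; otherwise the pragma is ignored and it runs on the host.
long offloaded_pair_sum(const long* a, const long* b, long n)
{
    assert(n > 0);
    long sum = 0;

    // map(to: ...) copies the array sections to the device before the region.
    // map(tofrom: sum) copies the scalar in and the reduced value back out.
#pragma omp target parallel for reduction(+:sum) \
    map(to: a[0:n], b[0:n]) map(tofrom: sum)
    for (long i = 0; i < n; i++)
    {
        for (long j = 0; j < n; j++)
        {
            sum += 3 * a[i] - b[j];
        }
    }
    return sum;
}
```

In the original program this would be called as offloaded_pair_sum(A, B, iSize) after the initialization loop. Because n is an ordinary function parameter, it does not need a map clause of its own.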

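    【Discussion】:

    • Thanks for your thoughtful reply, @tschwinge. I didn't realize that my IDE was automatically adding the -lgomp switch, so I removed the duplicate. I already have the -foffload=nvptx-none flag. I added map clauses to the second omp for directive: #pragma omp target parallel for map(to:iSize,A[0:iSize],B[0:iSize]) map(from:sum). That flagged the #define statement with the error "expected unqualified-id before numeric constant". I changed it to long iSize = 200000; and now I'm back to the same error messages I started with.
    • If iSize is a #define, you don't need to map it.
    • My primary suggestion was to also specify -fopenmp for the link invocation (the second invocation of g++), and possibly -foffload=nvptx-none again if that is not the default, instead of -lgomp.
    • Which IDE is adding -lgomp? That seems like a bug in the IDE.
    • I must have added -lgomp to the Release build settings myself at some point, so it was not the IDE. Using -fopenmp in both the compile and link passes solved the problem. Thanks very much.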
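Once the program links, a quick check along these lines may help confirm that the target region really runs on the GPU and does not silently fall back to the host. This is a sketch only: the helper names are hypothetical, and the build commands in the comment simply reuse the flags from this thread with shortened file names.

```cpp
// Build sketch (flags taken from the thread; -fopenmp appears on BOTH steps,
// and -lgomp is not passed by hand):
//   g++ -O2 -fopenmp -foffload=nvptx-none -c main.cpp -o main.o
//   g++ -fopenmp -foffload=nvptx-none main.o -o OffloadTest

#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: number of offload devices the OpenMP runtime can see.
// Returns 0 when the code is compiled without -fopenmp.
int visible_device_count()
{
#ifdef _OPENMP
    return omp_get_num_devices();
#else
    return 0;
#endif
}

// Hypothetical helper: true only if a target region executed on a device
// rather than falling back to the host.
bool target_region_ran_on_device()
{
    int on_host = 1;
#ifdef _OPENMP
    // omp_is_initial_device() returns nonzero when called on the host.
#pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }
#endif
    return on_host == 0;
}
```

If visible_device_count() reports 0, or target_region_ran_on_device() returns false, the loop is running on the host even though the build succeeded.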