【Question title】: Does OpenCL support vectors as kernel arguments?
【Posted】: 2020-10-01 18:54:22
【Question description】:

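I've been trying to figure out how to rewrite this code as an OpenCL kernel. The conversion itself shouldn't be especially hard (getting rid of the glm types and the bitmask), but the part I'm stuck on is how to pass _triangles, _uvs, _indices and _normals to the kernel. Does OpenCL have any built-in support for vectors?

If there is no vector support at all, the only option I see is to use float3 arrays for each of the 3 variables I need returned (_triangles, _uvs, _normals) and 2 float3 arrays for _indices, then convert the arrays back into vectors on the CPU and shrink them to fit. I'm not sure that passing this many memory buffers to a kernel is an efficient approach, though, since that would be 14 arrays passed to and returned from the kernel. The other solutions I've come up with don't work when parallelized. Is there a way to simplify this solution, or better yet, a better solution altogether?

The function I'm having trouble with is _addRectangle; _createMesh is the function it will be combined with in the kernel.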

void Chunk::_addRectangle(glm::vec3 center, glm::vec3 height, glm::vec3 width, unsigned tex_num, cl_uint LOD)
{
    glm::vec3 corner1 = center - (height / 2.0) - (width / 2.0);
    glm::vec3 corner2 = center - (height / 2.0) + (width / 2.0);
    glm::vec3 corner3 = center + (height / 2.0) + (width / 2.0);
    glm::vec3 corner4 = center + (height / 2.0) - (width / 2.0);

    glm::vec3 normal = glm::cross(height, width);

    glm::vec2 uv1;
    glm::vec2 uv2;
    glm::vec2 uv3;
    glm::vec2 uv4;

    if (fabs(normal[1]) == 1.0)
    {
        uv1 = glm::vec2(1.0 / _tex_atlas_width, 1);
        uv2 = glm::vec2(1.0 / _tex_atlas_width, 0);
        uv3 = glm::vec2(0, 0);
        uv4 = glm::vec2(0, 1);
    }
    else
    {
        uv1 = glm::vec2(1.0 / _tex_atlas_width, 1);
        uv2 = glm::vec2(1.0 / _tex_atlas_width, 0);
        uv3 = glm::vec2(0, 0);
        uv4 = glm::vec2(0, 1);
    }

    float add = (1.0 / double(_tex_atlas_width)) * tex_num;
    uv1.x += add;
    uv2.x += add;
    uv3.x += add;
    uv4.x += add;

    // triangle 1
    _triangles.push_back(corner3);
    _triangles.push_back(corner2);
    _triangles.push_back(corner1);

    _normals.push_back(normal);
    _normals.push_back(normal);
    _normals.push_back(normal);

    _uvs.push_back(uv1);
    _uvs.push_back(uv2);
    _uvs.push_back(uv3);

    _indices.push_back(glm::ivec3(nrOfIndices + 0, nrOfIndices + 1, nrOfIndices + 2));

    // triangle 2 

    _triangles.push_back(corner4);
    _normals.push_back(normal);
    _uvs.push_back(uv4);


    _indices.push_back(glm::ivec3(nrOfIndices + 2, nrOfIndices + 3, nrOfIndices + 0));
    nrOfIndices += 4;

}

void Chunk::_createMesh(glm::ivec3 pos, int landmap_flags[96 * 96 * 96], cl_int LOD)
{
    std::byte* faces = new std::byte[chunkSize / LOD * chunkSize / LOD * chunkSize / LOD];

    int index = 0;

    // index conversion from 3D coordinates to the flat array:
    // landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] is the voxel at (x, y, z)

    for (int x = LOD; x < chunkSize + LOD; x += LOD) {
        for (int y = LOD; y < chunkSize + LOD; y += LOD) {
            for (int z = LOD; z < chunkSize + LOD; z += LOD) {
                faces[index] = (std::byte)0;
                if (landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                {
                    index++;
                    continue;
                }
                if (landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] != BLOCK::AIR)
                {
                    if (landmap_flags[(x - LOD) + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::South;
                    if (landmap_flags[(x + LOD) + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::North;
                    if (landmap_flags[x + (y - LOD) * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::Down;
                    if (landmap_flags[x + (y + LOD) * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::Up;
                    if (landmap_flags[x + y * (chunkSize + (2 * LOD)) + (z - LOD) * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::West;
                    if (landmap_flags[x + y * (chunkSize + (2 * LOD)) + (z + LOD) * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::East;
                }

                if (faces[index] == (std::byte)0)
                    continue;

                if ((faces[index] & (std::byte)Direction::North) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2) + (float(LOD) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2)),
                        glm::vec3(0, LOD, 0),
                        glm::vec3(0, 0, -LOD),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::East) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2) + (float(LOD) / 2)),
                        glm::vec3(0, LOD, 0),
                        glm::vec3(LOD, 0, 0),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::South) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2) - (float(LOD) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2)),
                        glm::vec3(0, LOD, 0),
                        glm::vec3(0, 0, LOD),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::West) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2) - (float(LOD) / 2)),
                        glm::vec3(0, LOD, 0),
                        glm::vec3(-LOD, 0, 0),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::Up) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2) + (float(LOD) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2)),
                        glm::vec3(LOD, 0, 0),
                        glm::vec3(0, 0, LOD),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::Down) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2) - (float(LOD) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2)),
                        glm::vec3(LOD, 0, 0),
                        glm::vec3(0, 0, -LOD),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                index++;
            }
        }
    }

    delete[]faces;
}

Thanks!

Edit: a possibly more efficient way to store the data would be to use several float4 values. For example:

const uint n = get_global_id(0);

float4 triangles = (float4)(1.0f, 2.0f, 3.0f, 4.0f); // calculated values for each vertex

//(float4 list[size];) from constructor
list[n] = triangles;

【Question comments】:

    Tags: c++ kernel opencl opencl-c


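    【Solution 1】:

    OpenCL has vector types such as float4; more information is available in the OpenCL documentation on vector data types. It has no containers like C++'s std::vector, however, so the data has to be passed as C-style arrays.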
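
    As a minimal sketch of what "C-style arrays" means on the host side: OpenCL C's float4 is four packed floats, and a float3 occupies the same 16 bytes as a float4, so a host array that mirrors a float3 or float4 buffer needs 16-byte elements. The Float4 struct and the toPadded helper below are hypothetical names for illustration; with the OpenCL headers available, cl_float4 plays the same role.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical host-side mirror of OpenCL C's float4 (16 bytes, 16-byte aligned).
// With the OpenCL headers available, cl_float4 serves the same purpose.
struct alignas(16) Float4 {
    float x, y, z, w;
};
static_assert(sizeof(Float4) == 16, "must match the device-side float4 size");

// Pad tightly packed xyz triples (for example the contents of a
// std::vector<glm::vec3>) into 16-byte elements; w is unused padding.
std::vector<Float4> toPadded(const std::vector<std::array<float, 3>>& packed) {
    std::vector<Float4> out;
    out.reserve(packed.size());
    for (const auto& v : packed) {
        out.push_back(Float4{v[0], v[1], v[2], 0.0f});
    }
    return out;
}
```

    The padding matters in the other direction too: data read back from a float3 buffer cannot be copied element for element into tightly packed 12-byte vectors without dropping the fourth float.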

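    Looking at the code in the question, _triangles, _uvs, _indices and _normals are filled with results, so buffers of an appropriate size need to be allocated and passed to the kernel to store the results, and then read back once the kernel has finished its work.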
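
    One way to pick an "appropriate" size, sketched under the assumption that every voxel gets a fixed block of output slots: a voxel exposes at most 6 faces, and _addRectangle emits 4 vertices and 2 index triples per face, so a worst case is known before the kernel runs. MeshCapacity and worstCaseCapacity are hypothetical names, and the sketch assumes chunkSize is divisible by LOD.

```cpp
#include <cstddef>

// Hypothetical helper: upper bounds for the output arrays of one chunk.
struct MeshCapacity {
    std::size_t vertices;      // slots for _triangles, _normals and _uvs
    std::size_t indexTriples;  // slots for _indices
};

// chunkSize and lod play the roles of chunkSize and LOD in _createMesh.
constexpr MeshCapacity worstCaseCapacity(std::size_t chunkSize, std::size_t lod) {
    const std::size_t perAxis = chunkSize / lod;             // voxels visited along each axis
    const std::size_t voxels = perAxis * perAxis * perAxis;  // iterations of the triple loop
    const std::size_t faces = voxels * 6;                    // at most 6 exposed faces per voxel
    return MeshCapacity{faces * 4, faces * 2};               // 4 vertices, 2 triangles per face
}
```

    These counts, multiplied by the element size, are what would be passed as the size argument of clCreateBuffer. The bound is deliberately loose; real chunks will fill only a fraction of it.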

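    Passing 14 arrays should not be a problem as long as the kernel is compute-intensive enough, and looking at the code it probably is, since there are nested loops. That seems to depend heavily on the values of chunkSize and LOD, though. You will need to try it and see how it performs.

    Copying the data back into a std::vector should not be a problem either: just use memcpy.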
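
    For the "shrink to fit" step, a minimal host-side sketch: assume the kernel wrote each voxel's output into a fixed-size slot of a flat float array and recorded how many floats of that slot it actually filled (both conventions are assumptions, not something the question specifies). Walking the slots in voxel order and appending only the used part gives a tightly packed std::vector in the same order the serial loop would have produced. compactSlots and its parameters are hypothetical names; the range insert does the same job as one memcpy per slot.

```cpp
#include <cstddef>
#include <vector>

// slots:      flat readback buffer, slotFloats floats reserved per voxel
// usedFloats: per voxel, how many floats of its slot the kernel filled
std::vector<float> compactSlots(const std::vector<float>& slots,
                                const std::vector<std::size_t>& usedFloats,
                                std::size_t slotFloats) {
    // First pass: total output size, so the vector is allocated once.
    std::size_t total = 0;
    for (std::size_t n : usedFloats) total += n;

    std::vector<float> packed;
    packed.reserve(total);

    // Second pass: append the used prefix of every slot, in voxel order.
    for (std::size_t voxel = 0; voxel < usedFloats.size(); ++voxel) {
        const float* begin = slots.data() + voxel * slotFloats;
        packed.insert(packed.end(), begin, begin + usedFloats[voxel]);
    }
    return packed;
}
```

    Index values would likely need to be rebased during the same pass, since in the serial code they depend on the running nrOfIndices counter, which an individual work-item cannot know.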

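    【Comments】:

    • Note that the data() of a std::vector can be passed to OpenCL if CL_MEM_USE_HOST_PTR is used during buffer creation, but except on shared-memory devices this may perform poorly, or the data may be copied to device memory anyway. You also have to worry about the vector's lifetime and about avoiding host-side access while the device is using the memory.
    • Also, _triangles, _uvs, _indices and _normals must be in a specific order to be rendered correctly by OpenGL, so memcpy might not work: the arrays are all packed, whereas the vectors need to hold the data in the same order in which the function would have produced it.
    • @HeartUnder8lade If you are using the output data directly in OpenGL, why not use OpenCL-OpenGL interop to write directly from your OpenCL kernel into one or more OpenGL vertex/index array buffers?
    • @pmdj I didn't realize that existed. Do you have any links to documentation or code samples?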