How to convert an RGBA texture to Y and CbCr textures in Metal (answers)

【Question title】: How to convert an RGBA texture to Y and CbCr textures in Metal
【Posted】: 2019-09-30 22:02:41
【Question】:

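Apple has a useful tutorial called Displaying an AR Experience with Metal that shows how to extract the Y and CbCr textures from an ARFrame's capturedImage property and convert them to RGB for rendering. However, I've run into problems trying to take an RGBA texture and perform the inverse operation, i.e. converting back to Y and CbCr textures.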

I rewrote the fragment shader from the tutorial as a compute shader that writes to an RGBA texture I created from a Metal buffer:

// Same as capturedImageFragmentShader but it's a kernel function instead
kernel void yCbCrToRgbKernel(texture2d<float, access::sample> yTexture [[ texture(kTextureIndex_Y) ]],
                             texture2d<float, access::sample> cbCrTexture [[ texture(kTextureIndex_CbCr) ]],
                             texture2d<float, access::write> rgbaTexture [[ texture(kTextureIndex_RGBA) ]],
                             uint2 gid [[ thread_position_in_grid ]])
{
    constexpr sampler colorSampler(mip_filter::linear, mag_filter::linear, min_filter::linear);

    const float4x4 ycbcrToRGBTransform = float4x4(
        float4(+1.0000f, +1.0000f, +1.0000f, +0.0000f),
        float4(+0.0000f, -0.3441f, +1.7720f, +0.0000f),
        float4(+1.4020f, -0.7141f, +0.0000f, +0.0000f),
        float4(-0.7010f, +0.5291f, -0.8860f, +1.0000f)
    );

    float4 ycbcr = float4(yTexture.sample(colorSampler, float2(gid)).r, cbCrTexture.sample(colorSampler, float2(gid)).rg, 1.0);
    float4 result = ycbcrToRGBTransform * ycbcr;
    rgbaTexture.write(result, ushort2(gid));
}

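I tried to write a second compute shader to perform the reverse operation, computing the Y, Cb and Cr values with the conversion formulae from the Wikipedia page on YCbCr: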

kernel void rgbaToYCbCrKernel(texture2d<float, access::write> yTexture [[ texture(kTextureIndex_Y) ]],
                             texture2d<float, access::write> cbCrTexture [[ texture(kTextureIndex_CbCr) ]],
                             texture2d<float, access::sample> rgbaTexture [[ texture(kTextureIndex_RGBA) ]],
                             uint2 gid [[ thread_position_in_grid ]])
{
    constexpr sampler colorSampler(mip_filter::linear, mag_filter::linear, min_filter::linear);

    float4 rgba = rgbaTexture.sample(colorSampler, float2(gid)).rgba;

    // see https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.709_conversion for conversion formulae

    float Y = 16.0 + (65.481 * rgba.r + 128.553 * rgba.g + 24.966 * rgba.b);
    float Cb = 128 + (-37.797 * rgba.r + 74.203 * rgba.g + 112.0 * rgba.b);
    float Cr = 128 + (112.0 * rgba.r + 93.786 * rgba.g - 18.214 * rgba.b);

    yTexture.write(Y, gid);
    cbCrTexture.write(float4(Cb, Cr, 0, 0), gid); // this probably is not correct...
}

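My problem is how to write data to these textures correctly. I know the kernel above is wrong because the resulting display is solid pink. The expected result is obviously the original, unmodified display.

The pixel formats of the Y, CbCr and RGBA textures are .r8Unorm, .rg8Unorm and .rgba8Unorm respectively.

Here is my Swift code for setting up the textures and executing the shaders: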

private func createTexture(fromPixelBuffer pixelBuffer: CVPixelBuffer, pixelFormat: MTLPixelFormat, planeIndex: Int) -> MTLTexture? {
        guard CVMetalTextureCacheCreate(kCFAllocatorSystemDefault, nil, device, nil, &capturedImageTextureCache) == kCVReturnSuccess else { return nil }

        var mtlTexture: MTLTexture? = nil
        let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex)
        let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex)

        var texture: CVMetalTexture? = nil
        let status = CVMetalTextureCacheCreateTextureFromImage(nil, capturedImageTextureCache!, pixelBuffer, nil, pixelFormat, width, height, planeIndex, &texture)
        if status == kCVReturnSuccess {
            mtlTexture = CVMetalTextureGetTexture(texture!)
        }

        return mtlTexture
    }

    func arFrameToRGB(frame: ARFrame) {

        let frameBuffer = frame.capturedImage

        CVPixelBufferLockBaseAddress(frameBuffer, CVPixelBufferLockFlags(rawValue: 0))

        // Extract Y and CbCr textures
        let capturedImageTextureY = createTexture(fromPixelBuffer: frameBuffer, pixelFormat: .r8Unorm, planeIndex: 0)!
        let capturedImageTextureCbCr = createTexture(fromPixelBuffer: frameBuffer, pixelFormat: .rg8Unorm, planeIndex: 1)!

        // create the RGBA texture
        let rgbaBufferWidth = CVPixelBufferGetWidthOfPlane(frameBuffer, 0)
        let rgbaBufferHeight = CVPixelBufferGetHeightOfPlane(frameBuffer, 0)
        if rgbaBuffer == nil {
            rgbaBuffer = device.makeBuffer(length: 4 * rgbaBufferWidth * rgbaBufferHeight, options: [])
        }

        let rgbaTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm, width: rgbaBufferWidth, height: rgbaBufferHeight, mipmapped: false)
        rgbaTextureDescriptor.usage = [.shaderWrite, .shaderRead]
        let rgbaTexture = rgbaBuffer?.makeTexture(descriptor: rgbaTextureDescriptor, offset: 0, bytesPerRow: 4 * rgbaBufferWidth)

        threadGroupSize = MTLSizeMake(4, 4, 1)
        threadGroupCount = MTLSizeMake((rgbaTexture!.width + threadGroupSize!.width - 1) / threadGroupSize!.width, (rgbaTexture!.height + threadGroupSize!.height - 1) / threadGroupSize!.height, 1)

        let yCbCrToRGBACommandBuffer = commandQueue.makeCommandBuffer()!
        let yCbCrToRGBAComputeEncoder = yCbCrToRGBACommandBuffer.makeComputeCommandEncoder()!
        yCbCrToRGBAComputeEncoder.setComputePipelineState(yCbCrToRgbPso)
        yCbCrToRGBAComputeEncoder.setTexture(capturedImageTextureY, index: Int(kTextureIndex_Y.rawValue))
        yCbCrToRGBAComputeEncoder.setTexture(capturedImageTextureCbCr, index: Int(kTextureIndex_CbCr.rawValue))
        yCbCrToRGBAComputeEncoder.setTexture(rgbaTexture, index: Int(kTextureIndex_RGBA.rawValue))
        yCbCrToRGBAComputeEncoder.dispatchThreadgroups(threadGroupCount!, threadsPerThreadgroup: threadGroupSize!)
        yCbCrToRGBAComputeEncoder.endEncoding()

        let rgbaToYCbCrCommandBuffer = commandQueue.makeCommandBuffer()!
        let rgbaToYCbCrComputeEncoder = rgbaToYCbCrCommandBuffer.makeComputeCommandEncoder()!
        rgbaToYCbCrComputeEncoder.setComputePipelineState(rgbaToYCbCrPso)
        rgbaToYCbCrComputeEncoder.setTexture(capturedImageTextureY, index: Int(kTextureIndex_Y.rawValue))
        rgbaToYCbCrComputeEncoder.setTexture(capturedImageTextureCbCr, index: Int(kTextureIndex_CbCr.rawValue))
        rgbaToYCbCrComputeEncoder.setTexture(rgbaTexture, index: Int(kTextureIndex_RGBA.rawValue))
        rgbaToYCbCrComputeEncoder.dispatchThreadgroups(threadGroupCount!, threadsPerThreadgroup: threadGroupSize!)
        rgbaToYCbCrComputeEncoder.endEncoding()

        yCbCrToRGBACommandBuffer.commit()
        rgbaToYCbCrCommandBuffer.commit()

        yCbCrToRGBACommandBuffer.waitUntilCompleted()
        rgbaToYCbCrCommandBuffer.waitUntilCompleted()

        CVPixelBufferUnlockBaseAddress(frameBuffer, CVPixelBufferLockFlags(rawValue: 0))
    }

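The end goal is to use Metal shaders to do image processing on the RGBA texture and eventually write back to the Y and CbCr textures for display on screen.

Here are the parts I'm unsure about:

Given that the textures in the kernel function are typed texture2d<float, access::write> but have different pixel formats, how do I write data to them in the correct format?
Is my rewrite of capturedImageFragmentShader from Displaying an AR Experience with Metal as a compute shader as straightforward as I think it is, or am I missing something?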

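【Comments】:

A while ago I wrote a Metal texture viewer that does something very close to what you're looking for here, check it out: github.com/eldade/EEMetalTextureViewer (the shader in question is here: github.com/eldade/EEMetalTextureViewer/blob/master/…). There's also a sample program that grabs YCbCr data from the camera and converts it in real time.
That's not what I'm looking for; you only have shaders for the YCbCr -> RGB conversion.
Just from a glance, I noticed that you don't specify the sampler's coordinate space. What happens if you add coord::pixel to the front of the sampler constructor (e.g. constexpr sampler colorSampler(coord::pixel, ...))? I ask because the fragment function appears to use normalized coordinates supplied by the rasterizer, whereas the work-item indices you're using probably correspond 1:1 with pixel coordinates.
Try dividing all the values by 255 in the rgbaToYCbCrKernel kernel.
There are a lot of issues to handle in order to implement RGB -> BT.709 -> RGB correctly. For example, your code doesn't convert to linear light when moving to YCbCr. There are also very tricky scaling issues with linear light in the decode stage. If you're interested, here is an example project that does it correctly (though the RGB -> YCbCr step is not done in Metal): github.com/mdejong/MetalBT709Decoder

Tags: ios swift gpu arkit metal

【Solution 1】: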

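I just had to implement the same thing. Your first problem is a confusion between the values stored in the texture buffer and how those values are presented inside the Metal kernel. As is typical in GPU shaders, when integer values are accessed as floats, they are normalized to [0,1] on read and scaled back to [0,MaxIntValue] on write. For Metal, this conversion is documented on page 228 of https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf, section 7.7.1.1 "Converting Normalized Integer Pixel Data Types to Floating-Point Values".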

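For example, if the Y channel's texture format is .r8Unorm, the data is stored as 1 byte per pixel, with values from 0 to 255. But once it is accessed in the kernel through a texture2d<float>, the values are in [0,1]. When you write to such a texture, the values are automatically scaled back to [0,255]. So in your kernel you should treat the values you work with as lying in [0,1] rather than [0,255], and adjust your conversion accordingly.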
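To make the scaling concrete, here is a minimal CPU-side C++ sketch of what a read from, and a write to, an 8-bit unorm texel amounts to. The helper names unorm8ToFloat and floatToUnorm8 are hypothetical, and the clamp-then-round behaviour is my reading of the spec section cited above, not something verified on a device.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: the float a kernel sees when it reads an 8-bit
// unorm texel (for example one channel of an .r8Unorm texture).
inline float unorm8ToFloat(std::uint8_t stored) {
    return static_cast<float>(stored) / 255.0f;
}

// Hypothetical helper: the byte that ends up in memory when a kernel
// writes a float to an 8-bit unorm texel. Assumption: out-of-range
// values are clamped to [0,1] before scaling and rounding.
inline std::uint8_t floatToUnorm8(float value) {
    const float clamped = std::min(std::max(value, 0.0f), 1.0f);
    return static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
}
```

Under these assumptions floatToUnorm8(235.0f) stores 255, so a kernel that writes values such as 16 + 65.481 * r saturates every texel it touches.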

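The second problem is the RGBA-to-YCbCr conversion itself. Assuming the sample from Apple is correct, we can see that it follows the JPEG convention given at the end of the Wikipedia page. If you replace 128 with 128/255 ≈ 0.5 and put the formulae in matrix form, the coefficients match exactly. An extra subtlety is that in Metal code the matrix is initialized in column-major order, so the corresponding math is: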

       |+1.     +0.     +1.402  -0.701 |   |Y |
       |+1.     -0.3441 -0.7141 +0.5291|   |Cb|
RGBA = |+1.     +1.772  +0.     -0.886 | . |Cr|
       |+0.     +0.     +0.     +1.    |   |1 |
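The column-major point is easy to get wrong, so here is a small CPU-side C++ sketch of it. Vec4, Mat4 and mul are hypothetical stand-ins for Metal's float4, float4x4 and the * operator; the coefficients are copied from ycbcrToRGBTransform in the question.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;  // stored as four columns, like float4x4

// Column-major product M * v: component v[col] scales column col of M.
inline Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 out{{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            out[row] += m[col][row] * v[col];
        }
    }
    return out;
}

// Each inner Vec4 is one COLUMN, exactly as written in the Metal source.
inline Mat4 ycbcrToRgbTransform() {
    return Mat4{{Vec4{{+1.0000f, +1.0000f, +1.0000f, +0.0000f}},
                 Vec4{{+0.0000f, -0.3441f, +1.7720f, +0.0000f}},
                 Vec4{{+1.4020f, -0.7141f, +0.0000f, +0.0000f}},
                 Vec4{{-0.7010f, +0.5291f, -0.8860f, +1.0000f}}}};
}
```

With normalized inputs, mul(ycbcrToRgbTransform(), {1, 0.5, 0.5, 1}) comes out as white, (1, 1, 1, 1) up to rounding. If Y, Cb and Cr are all saturated at 1, the same product is roughly (1.70, 0.47, 1.89), which after clamping is a pink, consistent with the symptom described in the question.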

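The next thing you need is the inverse transform. You can find it in the same JPEG section of the Wikipedia page (again replacing 128 with 0.5), or, if you want to keep the same matrix form, you can simply compute the inverse of the 4x4 matrix and use that. This is what I did, and after putting it back in column-major order I got: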

const float4x4 rgbaToYcbcrTransform = float4x4(
   float4(+0.2990, -0.1687, +0.5000, +0.0000),
   float4(+0.5870, -0.3313, -0.4187, +0.0000),
   float4(+0.1140, +0.5000, -0.0813, +0.0000),
   float4(+0.0000, +0.5000, +0.5000, +1.0000)
);
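As a sanity check on these coefficients, the following CPU-side C++ sketch pushes a color through rgbaToYcbcrTransform and then through the question's ycbcrToRGBTransform. Vec4, Mat4, mul and roundTrip are hypothetical helpers standing in for the Metal types and operators; only the coefficients come from the thread.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;  // stored as four columns, like float4x4

// Column-major product M * v.
inline Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 out{{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            out[row] += m[col][row] * v[col];
    return out;
}

// Forward matrix from the question's kernel (one Vec4 per column).
inline Mat4 ycbcrToRgbTransform() {
    return Mat4{{Vec4{{+1.0000f, +1.0000f, +1.0000f, +0.0000f}},
                 Vec4{{+0.0000f, -0.3441f, +1.7720f, +0.0000f}},
                 Vec4{{+1.4020f, -0.7141f, +0.0000f, +0.0000f}},
                 Vec4{{-0.7010f, +0.5291f, -0.8860f, +1.0000f}}}};
}

// Inverse matrix from the answer (one Vec4 per column).
inline Mat4 rgbaToYcbcrTransform() {
    return Mat4{{Vec4{{+0.2990f, -0.1687f, +0.5000f, +0.0000f}},
                 Vec4{{+0.5870f, -0.3313f, -0.4187f, +0.0000f}},
                 Vec4{{+0.1140f, +0.5000f, -0.0813f, +0.0000f}},
                 Vec4{{+0.0000f, +0.5000f, +0.5000f, +1.0000f}}}};
}

// RGB -> YCbCr -> RGB, everything in normalized [0,1] units.
inline Vec4 roundTrip(float r, float g, float b) {
    const Vec4 ycbcr = mul(rgbaToYcbcrTransform(), Vec4{{r, g, b, 1.0f}});
    return mul(ycbcrToRgbTransform(), ycbcr);
}
```

Because the coefficients are rounded to four decimals, the round trip is only approximately the identity; the error stays well below one 8-bit step (1/255 ≈ 0.004) for the colors I tried.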

Adjusting the kernel code like this should then work (I haven't tested this exact code; my texture layout is slightly different):

// Ignore alpha as we can't convert it, just set it to 1.
float3 rgb = rgbaTexture.sample(colorSampler, float2(gid)).rgb;
float4 ycbcr = rgbaToYcbcrTransform * float4(rgb, 1.0);    
yTexture.write(ycbcr[0], gid);
cbCrTexture.write(float4(ycbcr[1], ycbcr[2], 0, 0), gid);

【Discussion】: