[Posted]: 2019-09-30 22:02:41
[Question]:
Apple has a helpful tutorial called Displaying an AR Experience with Metal that shows how to extract the Y and CbCr textures from an ARFrame's capturedImage property and convert them to RGB for rendering. However, I'm running into problems trying to take an RGBA texture and perform the inverse operation, i.e. convert it back into Y and CbCr textures.
I rewrote the fragment shader from the tutorial as a compute shader that writes into an RGBA texture I created from a Metal buffer:
// Same as capturedImageFragmentShader but it's a kernel function instead
kernel void yCbCrToRgbKernel(texture2d<float, access::sample> yTexture    [[ texture(kTextureIndex_Y) ]],
                             texture2d<float, access::sample> cbCrTexture [[ texture(kTextureIndex_CbCr) ]],
                             texture2d<float, access::write>  rgbaTexture [[ texture(kTextureIndex_RGBA) ]],
                             uint2 gid [[ thread_position_in_grid ]])
{
    constexpr sampler colorSampler(mip_filter::linear, mag_filter::linear, min_filter::linear);
    const float4x4 ycbcrToRGBTransform = float4x4(
        float4(+1.0000f, +1.0000f, +1.0000f, +0.0000f),
        float4(+0.0000f, -0.3441f, +1.7720f, +0.0000f),
        float4(+1.4020f, -0.7141f, +0.0000f, +0.0000f),
        float4(-0.7010f, +0.5291f, -0.8860f, +1.0000f)
    );
    float4 ycbcr = float4(yTexture.sample(colorSampler, float2(gid)).r,
                          cbCrTexture.sample(colorSampler, float2(gid)).rg, 1.0);
    float4 result = ycbcrToRGBTransform * ycbcr;
    rgbaTexture.write(result, ushort2(gid));
}
I tried writing a second compute shader to perform the reverse operation, computing the Y, Cb and Cr values using the conversion formulas from the wikipedia page on YCbCr:
kernel void rgbaToYCbCrKernel(texture2d<float, access::write>  yTexture    [[ texture(kTextureIndex_Y) ]],
                              texture2d<float, access::write>  cbCrTexture [[ texture(kTextureIndex_CbCr) ]],
                              texture2d<float, access::sample> rgbaTexture [[ texture(kTextureIndex_RGBA) ]],
                              uint2 gid [[ thread_position_in_grid ]])
{
    constexpr sampler colorSampler(mip_filter::linear, mag_filter::linear, min_filter::linear);
    float4 rgba = rgbaTexture.sample(colorSampler, float2(gid)).rgba;
    // see https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.709_conversion for conversion formulae
    float Y = 16.0 + (65.481 * rgba.r + 128.553 * rgba.g + 24.966 * rgba.b);
    float Cb = 128 + (-37.797 * rgba.r - 74.203 * rgba.g + 112.0 * rgba.b);
    float Cr = 128 + (112.0 * rgba.r - 93.786 * rgba.g - 18.214 * rgba.b);
    yTexture.write(Y, gid);
    cbCrTexture.write(float4(Cb, Cr, 0, 0), gid); // this probably is not correct...
}
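For reference, the cited equations can be sanity-checked on the CPU. This is a quick Swift sketch (not from the original post, and `rgbToYCbCr` is a hypothetical helper name); note that for inputs in [0, 1] the formulas produce 8-bit digital values in the 16–235 / 16–240 ranges, not the 0.0–1.0 range a UNorm texture write expects:

```swift
// BT.601 full-form RGB -> YCbCr per the Wikipedia page, with r/g/b in [0, 1].
func rgbToYCbCr(r: Double, g: Double, b: Double) -> (y: Double, cb: Double, cr: Double) {
    let y  = 16.0  +  65.481 * r + 128.553 * g +  24.966 * b
    let cb = 128.0 -  37.797 * r -  74.203 * g + 112.000 * b
    let cr = 128.0 + 112.000 * r -  93.786 * g -  18.214 * b
    return (y, cb, cr)
}

let white = rgbToYCbCr(r: 1, g: 1, b: 1)
// white == (y: 235.0, cb: 128.0, cr: 128.0) -- all far outside [0, 1], so a
// .r8Unorm / .rg8Unorm write of these values clamps every channel to 1.0.
```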
My question is how to correctly write the data into these textures. I know it's incorrect because the result displays as solid pink. The expected result is obviously the original, unmodified picture.
The pixel formats of the Y, CbCr and RGBA textures are .r8Unorm, .rg8Unorm and .rgba8Unorm respectively.
Here is my Swift code that sets up the textures and executes the shaders:
private func createTexture(fromPixelBuffer pixelBuffer: CVPixelBuffer, pixelFormat: MTLPixelFormat, planeIndex: Int) -> MTLTexture? {
    guard CVMetalTextureCacheCreate(kCFAllocatorSystemDefault, nil, device, nil, &capturedImageTextureCache) == kCVReturnSuccess else { return nil }
    var mtlTexture: MTLTexture? = nil
    let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex)
    let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex)
    var texture: CVMetalTexture? = nil
    let status = CVMetalTextureCacheCreateTextureFromImage(nil, capturedImageTextureCache!, pixelBuffer, nil, pixelFormat, width, height, planeIndex, &texture)
    if status == kCVReturnSuccess {
        mtlTexture = CVMetalTextureGetTexture(texture!)
    }
    return mtlTexture
}
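As an aside, CVMetalTextureCacheCreate runs on every call here, so a fresh cache is created per plane per frame. The cache is designed to be created once and reused; a minimal sketch of a one-time setup (assuming the same `device` and `capturedImageTextureCache` properties as the original code):

```swift
// Created once (e.g. in the renderer's init) and reused for every frame and plane.
guard CVMetalTextureCacheCreate(kCFAllocatorSystemDefault, nil, device, nil, &capturedImageTextureCache) == kCVReturnSuccess else {
    fatalError("Unable to create the Metal texture cache")
}
```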
func arFrameToRGB(frame: ARFrame) {
    let frameBuffer = frame.capturedImage
    CVPixelBufferLockBaseAddress(frameBuffer, CVPixelBufferLockFlags(rawValue: 0))

    // Extract Y and CbCr textures
    let capturedImageTextureY = createTexture(fromPixelBuffer: frameBuffer, pixelFormat: .r8Unorm, planeIndex: 0)!
    let capturedImageTextureCbCr = createTexture(fromPixelBuffer: frameBuffer, pixelFormat: .rg8Unorm, planeIndex: 1)!

    // create the RGBA texture
    let rgbaBufferWidth = CVPixelBufferGetWidthOfPlane(frameBuffer, 0)
    let rgbaBufferHeight = CVPixelBufferGetHeightOfPlane(frameBuffer, 0)
    if rgbaBuffer == nil {
        rgbaBuffer = device.makeBuffer(length: 4 * rgbaBufferWidth * rgbaBufferHeight, options: [])
    }
    let rgbaTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm, width: rgbaBufferWidth, height: rgbaBufferHeight, mipmapped: false)
    rgbaTextureDescriptor.usage = [.shaderWrite, .shaderRead]
    let rgbaTexture = rgbaBuffer?.makeTexture(descriptor: rgbaTextureDescriptor, offset: 0, bytesPerRow: 4 * rgbaBufferWidth)

    threadGroupSize = MTLSizeMake(4, 4, 1)
    threadGroupCount = MTLSizeMake((rgbaTexture!.width + threadGroupSize!.width - 1) / threadGroupSize!.width,
                                   (rgbaTexture!.height + threadGroupSize!.height - 1) / threadGroupSize!.height, 1)

    let yCbCrToRGBACommandBuffer = commandQueue.makeCommandBuffer()!
    let yCbCrToRGBAComputeEncoder = yCbCrToRGBACommandBuffer.makeComputeCommandEncoder()!
    yCbCrToRGBAComputeEncoder.setComputePipelineState(yCbCrToRgbPso)
    yCbCrToRGBAComputeEncoder.setTexture(capturedImageTextureY, index: Int(kTextureIndex_Y.rawValue))
    yCbCrToRGBAComputeEncoder.setTexture(capturedImageTextureCbCr, index: Int(kTextureIndex_CbCr.rawValue))
    yCbCrToRGBAComputeEncoder.setTexture(rgbaTexture, index: Int(kTextureIndex_RGBA.rawValue))
    yCbCrToRGBAComputeEncoder.dispatchThreadgroups(threadGroupCount!, threadsPerThreadgroup: threadGroupSize!)
    yCbCrToRGBAComputeEncoder.endEncoding()

    let rgbaToYCbCrCommandBuffer = commandQueue.makeCommandBuffer()!
    let rgbaToYCbCrComputeEncoder = rgbaToYCbCrCommandBuffer.makeComputeCommandEncoder()!
    rgbaToYCbCrComputeEncoder.setComputePipelineState(rgbaToYCbCrPso)
    rgbaToYCbCrComputeEncoder.setTexture(capturedImageTextureY, index: Int(kTextureIndex_Y.rawValue))
    rgbaToYCbCrComputeEncoder.setTexture(capturedImageTextureCbCr, index: Int(kTextureIndex_CbCr.rawValue))
    rgbaToYCbCrComputeEncoder.setTexture(rgbaTexture, index: Int(kTextureIndex_RGBA.rawValue))
    rgbaToYCbCrComputeEncoder.dispatchThreadgroups(threadGroupCount!, threadsPerThreadgroup: threadGroupSize!)
    rgbaToYCbCrComputeEncoder.endEncoding()

    yCbCrToRGBACommandBuffer.commit()
    rgbaToYCbCrCommandBuffer.commit()
    yCbCrToRGBACommandBuffer.waitUntilCompleted()
    rgbaToYCbCrCommandBuffer.waitUntilCompleted()

    CVPixelBufferUnlockBaseAddress(frameBuffer, CVPixelBufferLockFlags(rawValue: 0))
}
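The two passes above could also be encoded into a single command buffer, which keeps the hot path to one commit and one wait. A sketch reusing the same names as the function above (untested):

```swift
// One command buffer, two sequential compute passes over the same textures.
let commandBuffer = commandQueue.makeCommandBuffer()!
for pso in [yCbCrToRgbPso, rgbaToYCbCrPso] {
    let encoder = commandBuffer.makeComputeCommandEncoder()!
    encoder.setComputePipelineState(pso)
    encoder.setTexture(capturedImageTextureY, index: Int(kTextureIndex_Y.rawValue))
    encoder.setTexture(capturedImageTextureCbCr, index: Int(kTextureIndex_CbCr.rawValue))
    encoder.setTexture(rgbaTexture, index: Int(kTextureIndex_RGBA.rawValue))
    encoder.dispatchThreadgroups(threadGroupCount!, threadsPerThreadgroup: threadGroupSize!)
    encoder.endEncoding()
}
commandBuffer.commit()
commandBuffer.waitUntilCompleted()
```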
The end goal is to do image processing on the RGBA texture with Metal shaders and eventually write back into the Y and CbCr textures for display on screen.
Here are the parts I'm unsure about:
Given that the textures in the kernel functions are typed texture2d<float, access::write> but have different pixel formats, how do I write data to them in the correct format?
Is rewriting capturedImageFragmentShader from Displaying an AR Experience with Metal as a compute shader as straightforward as I assumed, or am I missing something?
[Comments]:
-
A while ago I wrote a Metal texture viewer that does something very close to what you're looking for here, check it out: github.com/eldade/EEMetalTextureViewer (the shaders in question are here: github.com/eldade/EEMetalTextureViewer/blob/master/…). There's also a sample program that grabs YCbCr data from the camera and converts it in real time.
-
That's not what I'm looking for, you only have shaders for the YCbCr -> RGB conversion.
-
Just glancing at this, I noticed you don't specify the coordinate space of your samplers. What happens if you add coord::pixel to the front of the sampler constructors (e.g. constexpr sampler colorSampler(coord::pixel, ...))? I ask because the fragment function appears to use normalized coordinates supplied by the rasterizer, but the work-item indices you're using probably correspond 1:1 with pixel coordinates.
-
Try dividing all the values by 255 in the rgbaToYCbCrKernel kernel.
-
There are a lot of issues to deal with to implement RGB -> BT.709 -> RGB correctly. For example, your code doesn't convert to linear light when moving to YCbCr. There are also very tricky scaling issues with linear light at the decode stage. If you're interested, here is an example project that does it properly (although the RGB -> YCbCr step isn't in Metal): github.com/mdejong/MetalBT709Decoder
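Putting the suggestions above together, a sketch of how the inverse kernel might look (untested, and glossing over the colorimetry issues from the last comment; it reads with integer pixel coordinates instead of a normalized sampler, and rescales the 8-bit equation outputs into UNorm range):

```metal
kernel void rgbaToYCbCrKernel(texture2d<float, access::write> yTexture    [[ texture(kTextureIndex_Y) ]],
                              texture2d<float, access::write> cbCrTexture [[ texture(kTextureIndex_CbCr) ]],
                              texture2d<float, access::read>  rgbaTexture [[ texture(kTextureIndex_RGBA) ]],
                              uint2 gid [[ thread_position_in_grid ]])
{
    // read() takes integer pixel coordinates, sidestepping the normalized-sampler mismatch.
    float4 rgba = rgbaTexture.read(gid);
    float Y  = 16.0  +  65.481 * rgba.r + 128.553 * rgba.g +  24.966 * rgba.b;
    float Cb = 128.0 -  37.797 * rgba.r -  74.203 * rgba.g + 112.0   * rgba.b;
    float Cr = 128.0 + 112.0   * rgba.r -  93.786 * rgba.g -  18.214 * rgba.b;
    // The equations yield 8-bit digital values; divide by 255 so the UNorm writes land in [0, 1].
    yTexture.write(float4(Y / 255.0), gid);
    // The CbCr plane is half resolution, so four threads land on each texel here;
    // a real implementation would average the 2x2 block instead of overwriting.
    cbCrTexture.write(float4(Cb / 255.0, Cr / 255.0, 0.0, 0.0), gid / 2);
}
```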