[Posted]: 2017-12-20 03:16:09
[Problem description]:
I'm confused about why my kernel shader isn't working.
I have a genuinely raw RGBA32 pixel buffer (inBuffer) that I send to the kernel shader. I also have a receiving MTLTexture whose usage I set to MTLTextureUsageRenderTarget in its RGBA8Unorm descriptor.
Then I encode and dispatch like this...
id<MTLLibrary> library = [_device newDefaultLibrary];
id<MTLFunction> kernelFunction = [library newFunctionWithName:@"stripe_Kernel"];
id<MTLComputePipelineState> pipeline = [_device newComputePipelineStateWithFunction:kernelFunction error:&error];
id<MTLCommandQueue> commandQueue = [_device newCommandQueue];

MTLTextureDescriptor *textureDescription =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatRGBA8Unorm
                                                       width:outputSize.width
                                                      height:outputSize.height
                                                   mipmapped:NO];
[textureDescription setUsage:MTLTextureUsageRenderTarget];
id<MTLTexture> metalTexture = [_device newTextureWithDescriptor:textureDescription];

MTLSize threadgroupCounts = MTLSizeMake(8, 8, 1);
MTLSize threadgroups = MTLSizeMake([metalTexture width] / threadgroupCounts.width,
                                   [metalTexture height] / threadgroupCounts.height,
                                   1);
...
id<MTLBuffer> metalBuffer = [_device newBufferWithBytesNoCopy:inBuffer
                                                       length:inputByteCount
                                                      options:MTLResourceStorageModeShared
                                                  deallocator:nil];
[commandEncoder setComputePipelineState:pipeline];
[commandEncoder setTexture:metalTexture atIndex:0];
[commandEncoder setBuffer:metalBuffer offset:0 atIndex:0];
[commandEncoder setBytes:&imageW length:sizeof(ushort) atIndex:1];
[commandEncoder setBytes:&imageH length:sizeof(ushort) atIndex:2];
[commandEncoder dispatchThreadgroups:threadgroups threadsPerThreadgroup:threadgroupCounts];
[commandEncoder endEncoding];
[commandBuffer commit];
[commandBuffer waitUntilCompleted];
The intent is to take an m-by-n raw image and pack it into a texture of, say, 2048x896. Here is my kernel shader:
kernel void stripe_Kernel(texture2d<float, access::write> outTexture [[ texture(0) ]],
                          device const float *inBuffer [[ buffer(0) ]],
                          device const ushort *imageWidth [[ buffer(1) ]],
                          device const ushort *imageHeight [[ buffer(2) ]],
                          uint2 gid [[ thread_position_in_grid ]])
{
    const ushort imageW = *imageWidth;
    const ushort imageH = *imageHeight;
    const uint32_t textureW = outTexture.get_width(); // e.g. 2048

    uint32_t posX = gid.x; // e.g. 0...2047
    uint32_t posY = gid.y; // e.g. 0...895

    uint32_t sourceX = ((posY / imageH) * textureW + posX) % imageW;
    uint32_t sourceY = posY % imageH;
    const uint32_t ptr = sourceX + sourceY * imageW;

    float pixel = inBuffer[ptr];
    outTexture.write(pixel, gid);
}
Later I grab that texture and copy it into a CVPixelBuffer:
MTLRegion region = MTLRegionMake2D(0, 0, (int)outputSize.width, (int)outputSize.height);

// Lock the buffer and copy the texture over
CVPixelBufferLockBaseAddress(outBuffer, 0);
void *pixelData = CVPixelBufferGetBaseAddress(outBuffer);
[metalTexture getBytes:pixelData
           bytesPerRow:CVPixelBufferGetBytesPerRow(outBuffer)
            fromRegion:region
           mipmapLevel:0];
CVPixelBufferUnlockBaseAddress(outBuffer, 0);
My problem is that my CVPixelBuffer always comes back empty (allocated, but all zeros). This is running on an iMac 17,1 with a Radeon M395 GPU.
I've even hard-coded opaque red pixels into the kernel shader's output texture. Still, I never see any red.
Update: my solution to this problem was to abandon MTLTextures entirely (I even tried synchronizing the texture with an MTLBlitCommandEncoder). No dice.
I ended up using MTLBuffers for both the input "texture" and the output "texture" and reworking the math in the kernel shader. My output buffer is now a pre-allocated, locked CVPixelBuffer, which is exactly what I wanted in the end.
[Discussion]:
-
Thanks for the info, Ken. Adding forced texture synchronization didn't help. I've now abandoned textures and got it working with a pair of input/output MTLBuffers holding RGBA32 pixel data. Now I have a problem if my source buffer is 24-bit... stackoverflow.com/questions/45130709/…
Tags: metal