ARKit Barcode tracking and Vision framework

【Question title】: ARKit Barcode tracking and Vision framework
【Posted】: 2019-02-02 01:20:33
【Question】:

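I have been trying to draw bounding boxes around QR codes detected during an ARSession. The result is: boundingbox 1, boundingbox 2

The barcode is being tracked, but the geometry of the bounding box is wrong.

How can I get the correct coordinates for the bounding box?

The source code is: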

public func session(_ session: ARSession, didUpdate frame: ARFrame) {

    // Only run one Vision request at a time
    if self.processing {
        return
    }

    self.processing = true

    let request = VNDetectBarcodesRequest { (request, error) in

        if let results = request.results, let result = results.first as? VNBarcodeObservation {

            DispatchQueue.main.async {

                let path = CGMutablePath()

                for result in results {
                    guard let barcode = result as? VNBarcodeObservation else { continue }
                    let topLeft = self.convert(point: barcode.topLeft)
                    path.move(to: topLeft)
                    let topRight = self.convert(point: barcode.topRight)
                    path.addLine(to: topRight)
                    let bottomRight = self.convert(point: barcode.bottomRight)
                    path.addLine(to: bottomRight)
                    let bottomLeft = self.convert(point: barcode.bottomLeft)
                    path.addLine(to: bottomLeft)
                    path.addLine(to: topLeft)
                }                   
                self.drawLayer.path = path
                self.processing = false
            }
        } else {
            self.processing = false
        }
    }

    DispatchQueue.global(qos: .userInitiated).async {
        do {
            request.symbologies = [.QR]
            let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .right, options: [:])                
            try imageRequestHandler.perform([request])
        } catch {               
        }
    }
}

private func convert(point: CGPoint) -> CGPoint {
    return CGPoint(x: point.x * view.bounds.size.width,
                   y: (1 - point.y) * view.bounds.size.height)
}

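【Comments】:

If you Command-click VNRectangleObservation, the documentation for the class says you should use CIPerspectiveTransform to straighten it out. I'm not sure that would solve the problem, though. It might be related to the delay between the current frame and the frame in which the code was found.
The only approach that has worked for me is taking a snapshot: let snapshot = self.sceneView.snapshot().rotate(radians: -.pi/2). But this is not a good approach, because I have to take a snapshot of a frame that was already captured during tracking, and the snapshot has a lower resolution. I think a proper way must exist.
Could it be related to 'orientation: .right'? Maybe it should be '.up'?
I tried different orientations; orientation is not the issue. When I grab the image from the ARFrame and from the snapshot, the two images have different dimensions and different content, as if they had been taken from different perspectives.
I will try saving some ARFrames to the file system as CGImages and running VNDetectBarcodesRequest on the saved images to see what is going on.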

Tags: swift barcode arkit apple-vision

【Solution 1】:

I have just migrated barcode recognition in my app from AVFoundation to Vision. Here is an outline of the logic that works for me:

import UIKit
import Vision

extension CVPixelBuffer {
    /// Pixel dimensions of the buffer.
    var size: CGSize {
        let width = CGFloat(CVPixelBufferGetWidth(self))
        let height = CGFloat(CVPixelBufferGetHeight(self))
        return CGSize(width: width, height: height)
    }
}
extension VNRectangleObservation {    
    func outline(in cvPixelBuffer: CVPixelBuffer, with color: UIColor) -> CALayer {
        let outline = CAShapeLayer()
        outline.path = self.path(in: cvPixelBuffer).cgPath
        outline.fillColor = UIColor.clear.cgColor
        outline.strokeColor = color.cgColor
        return outline
    }
    
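    func path(in cvPixelBuffer: CVPixelBuffer) -> UIBezierPath {
        let size = cvPixelBuffer.size
        // Vision reports normalized points (0...1) with the origin at the bottom left.
        // Chained calls apply in reverse order: scale to pixels, shift by -height, then flip y.
        // Net effect: (x, y) -> (x * width, (1 - y) * height).
        let transform = CGAffineTransform.identity
            .scaledBy(x: 1, y: -1)
            .translatedBy(x: 0, y: -size.height)
            .scaledBy(x: size.width, y: size.height)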
        
        let convertedTopLeft = self.topLeft.applying(transform)
        let convertedTopRight = self.topRight.applying(transform)
        let convertedBottomLeft = self.bottomLeft.applying(transform)
        let convertedBottomRight = self.bottomRight.applying(transform)
        
        let path = UIBezierPath()
        path.move(to: convertedTopLeft)
        path.addLine(to: convertedTopRight)
        path.addLine(to: convertedBottomRight)
        path.addLine(to: convertedBottomLeft)
        path.close()
        
        path.lineWidth = 2.0
        return path
    }
}
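
As a sanity check on the transform chain in path(in:), the same mapping can be written as plain arithmetic. This is only a minimal sketch: imagePoint(fromNormalized:imageSize:) is a hypothetical helper, not part of Vision, and the 1920x1440 size is just an example value.

```swift
import Foundation

// Hypothetical helper (not a Vision API): the same mapping as the
// scale / translate / flip chain above, written out by hand.
func imagePoint(fromNormalized p: CGPoint, imageSize: CGSize) -> CGPoint {
    // Vision: normalized 0...1, origin at the bottom left.
    // UIKit / Core Animation: pixels or points, origin at the top left.
    return CGPoint(x: p.x * imageSize.width,
                   y: (1 - p.y) * imageSize.height)
}

// Example value only: an assumed 1920x1440 pixel buffer.
let size = CGSize(width: 1920, height: 1440)

// A corner at 25% from the left and 75% up from the bottom
// ends up at x = 480, y = 360, measured from the top-left corner.
let topLeft = imagePoint(fromNormalized: CGPoint(x: 0.25, y: 0.75), imageSize: size)
print(topLeft)
```

If the hand-written result and the CGAffineTransform result disagree for the same input, the order of the chained calls is the first thing to check.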

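After building the path in pixel-buffer coordinates, I apply one more scale transform to fit the size of the view that displays the outline.

I am using the https://github.com/maxvol/RxVision library, which makes it easy to pass the processed image (a CVPixelBuffer in my case) along the chain.

【Comments】: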
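
The answer does not show that final scaling step. One way it might look is sketched below, assuming the view displays the camera image aspect-fill and centered; that display mode is an assumption and should be checked against the actual view. viewPoint(fromImagePoint:imageSize:viewSize:) is a hypothetical helper, and the sizes are illustrative values chosen so the arithmetic is easy to follow.

```swift
import Foundation

// Hypothetical helper: maps a point in image pixel coordinates (top-left origin)
// into a view that shows the image aspect-fill and centered (assumed display mode).
func viewPoint(fromImagePoint p: CGPoint, imageSize: CGSize, viewSize: CGSize) -> CGPoint {
    // Aspect-fill uses the larger of the two scale factors so the image covers the view.
    let scale = max(viewSize.width / imageSize.width,
                    viewSize.height / imageSize.height)
    // The scaled image overflows the view on one axis; centering it
    // gives an offset that is zero or negative on each axis.
    let offsetX = (viewSize.width - imageSize.width * scale) / 2
    let offsetY = (viewSize.height - imageSize.height * scale) / 2
    return CGPoint(x: p.x * scale + offsetX,
                   y: p.y * scale + offsetY)
}

// Illustrative values: a 1440x1920 portrait image shown in a 360x960 view.
let imageSize = CGSize(width: 1440, height: 1920)
let viewSize = CGSize(width: 360, height: 960)

// The image center maps to the view center: x = 180, y = 480.
let center = viewPoint(fromImagePoint: CGPoint(x: 720, y: 960),
                       imageSize: imageSize, viewSize: viewSize)

// The image's top-left corner maps to x = -180, y = 0, which is off-screen:
// 180 points are cropped on each side.
let corner = viewPoint(fromImagePoint: .zero, imageSize: imageSize, viewSize: viewSize)
print(center, corner)
```

Under that assumption, part of the image is cropped whenever the image and view aspect ratios differ, so multiplying normalized coordinates directly by the view bounds, as convert(point:) in the question does, would stretch and shift the outline.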