Improve body tracking performance of VNDetectHumanBodyPoseRequest

【Question title】: Improve body tracking performance of VNDetectHumanBodyPoseRequest
【Posted】: 2020-11-03 06:05:54
【Question】:

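I am trying to improve the performance of drawing a skeleton from VNDetectHumanBodyPoseRequest body tracking, even when the subject is more than 5 metres away, using a stable iPhone XS camera.

Tracking confidence is low for the lower right limbs of my body, there is noticeable lag, and the points jitter. I cannot reproduce the performance shown in this year's WWDC demo video.

Here is the relevant code, adapted from Apple's sample code: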

import AVFoundation
import Vision

class Predictor {
  func extractPoses(_ sampleBuffer: CMSampleBuffer) throws -> [VNRecognizedPointsObservation] {
    let requestHandler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer, orientation: .down)
    
    let request = VNDetectHumanBodyPoseRequest()
    
    do {
      // Perform the body pose-detection request.
      try requestHandler.perform([request])
    } catch {
      print("Unable to perform the request: \(error).\n")
    }
    
    return (request.results as? [VNRecognizedPointsObservation]) ?? [VNRecognizedPointsObservation]()
  }
}

I capture the video data and handle the sample buffers here:

class CameraViewController: AVCaptureVideoDataOutputSampleBufferDelegate {

  func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    let observations = try? predictor.extractPoses(sampleBuffer)
    observations?.forEach { processObservation($0) }
  }

  func processObservation(_ observation: VNRecognizedPointsObservation) {
    
    // Retrieve all recognized points (the .all group, not only the torso).
    guard let recognizedPoints =
            try? observation.recognizedPoints(forGroupKey: .all) else {
      return
    }
    
    let storedPoints = Dictionary(uniqueKeysWithValues: recognizedPoints.compactMap { (key, point) -> (String, CGPoint)? in
      return (key.rawValue, point.location)
    })
    
    DispatchQueue.main.sync {
      let mappedPoints = Dictionary(uniqueKeysWithValues: recognizedPoints.compactMap { (key, point) -> (String, CGPoint)? in
        guard point.confidence > 0.1 else { return nil }
        let norm = VNImagePointForNormalizedPoint(point.location,
                                                  Int(drawingView.bounds.width),
                                                  Int(drawingView.bounds.height))
        return (key.rawValue, norm)
      })
      
      let time = 1000 * observation.timeRange.start.seconds
      
      
      // Draw the points onscreen.
      DispatchQueue.main.async {
        self.drawingView.draw(points: mappedPoints)
      }
    }
  }
}

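The drawingView.draw function belongs to a custom UIView placed on top of the camera view, and it draws the points using CALayer sublayers. The AVCaptureSession code is exactly the same as in the sample code.

I tried the VNDetectHumanBodyPoseRequest(completionHandler:) variant, but it made no difference to performance. I could try smoothing with a moving average filter, but that still leaves the problem of outlier predictions, which are very inaccurate.

What am I missing?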
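For reference, the moving average smoothing mentioned above could be sketched roughly as follows. This is only an illustration, not code from the question: the `MovingAverageSmoother` type, its `windowSize` and `minimumConfidence` parameters, and the string joint keys are all assumptions. It averages the last few accepted locations per joint and skips low-confidence samples, which reduces jitter but does not by itself remove confident outliers.

```swift
import Foundation

/// Hypothetical helper (not part of Vision): smooths each joint with a
/// simple moving average over the last `windowSize` accepted samples.
struct MovingAverageSmoother {
    let windowSize: Int
    let minimumConfidence: Float
    private var history: [String: [CGPoint]] = [:]

    init(windowSize: Int = 5, minimumConfidence: Float = 0.1) {
        precondition(windowSize > 0, "windowSize must be positive")
        self.windowSize = windowSize
        self.minimumConfidence = minimumConfidence
    }

    /// Adds a sample for `joint` and returns the smoothed location.
    /// A sample below `minimumConfidence` is skipped, so the previous average
    /// is returned, or nil if no sample has been accepted for this joint yet.
    mutating func add(_ location: CGPoint, confidence: Float, for joint: String) -> CGPoint? {
        var samples = history[joint] ?? []
        if confidence >= minimumConfidence {
            samples.append(location)
            // Keep only the most recent `windowSize` samples.
            if samples.count > windowSize { samples.removeFirst() }
            history[joint] = samples
        }
        guard !samples.isEmpty else { return nil }
        let sum = samples.reduce(CGPoint.zero) { CGPoint(x: $0.x + $1.x, y: $0.y + $1.y) }
        let count = CGFloat(samples.count)
        return CGPoint(x: sum.x / count, y: sum.y / count)
    }
}
```

One smoother instance would be kept across frames and fed each joint's `location` and `confidence` before the points are mapped to view coordinates. A larger window gives steadier points at the cost of more lag.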

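【Comments】:

It looks like you were not able to solve it? Any updates?
Hi, I am running into a similar problem and am interested in this. Did you find a solution you can share? Thanks

Tags: ios swift avfoundation vision avkit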

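【Answer 1】:

I think this was a bug in iOS 14 beta v1-v3. After upgrading to v4 and later beta releases, tracking is much better. With the latest beta updates the API also became clearer, with more fine-grained type names.

Note that I did not get an official answer from Apple about this bug, but the problem will probably disappear completely in the official iOS 14 release.

【Comments】: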
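As an illustration of the finer-grained type names, the question's code might look roughly like this against the released iOS 14 SDK, where results are `VNHumanBodyPoseObservation` values keyed by `VNHumanBodyPoseObservation.JointName`. This is a sketch, not the answerer's code: the `rightLegJoints` constant, the `rightLegPoints` helper, and the 0.1 confidence threshold are assumptions, and the `.down` orientation is simply carried over from the question.

```swift
import Vision
import CoreMedia

// Joints of the right leg, the limb the question reports as low-confidence.
// (Hypothetical constant, used only for this example.)
let rightLegJoints: [VNHumanBodyPoseObservation.JointName] = [.rightHip, .rightKnee, .rightAnkle]

/// Runs body pose detection on one sample buffer and returns typed observations.
func extractPoses(from sampleBuffer: CMSampleBuffer) throws -> [VNHumanBodyPoseObservation] {
    // Orientation copied from the question; adjust it to the actual capture setup.
    let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer, orientation: .down)
    let request = VNDetectHumanBodyPoseRequest()
    try handler.perform([request])
    // The cast keeps this compiling on SDKs where `results` is untyped.
    return (request.results as? [VNHumanBodyPoseObservation]) ?? []
}

/// Returns the normalized locations of the right-leg joints whose confidence
/// exceeds `minimumConfidence` (hypothetical helper).
func rightLegPoints(in observation: VNHumanBodyPoseObservation,
                    minimumConfidence: Float = 0.1) -> [VNHumanBodyPoseObservation.JointName: CGPoint] {
    var points: [VNHumanBodyPoseObservation.JointName: CGPoint] = [:]
    for joint in rightLegJoints {
        guard let point = try? observation.recognizedPoint(joint),
              point.confidence > minimumConfidence else { continue }
        points[joint] = point.location
    }
    return points
}
```

Logging `point.confidence` for these joints on each frame is one way to check whether a given OS version actually tracks the lower limbs better.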

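I also implemented a moving average filter, but it made things worse because it removed or smoothed out fast movements. A weighted moving average filter might work better.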
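A weighted moving average of the kind suggested here could look something like the sketch below. It is an assumption about what the commenter had in mind, not their implementation: the `WeightedMovingAverage` type and its linear weights (newest sample weighted n, oldest weighted 1) are hypothetical choices. Because recent samples dominate, the output follows fast movements more closely than a plain average over the same window.

```swift
import Foundation

/// Hypothetical helper: linearly weighted moving average for one joint.
/// Within the window, the oldest sample has weight 1 and the newest has
/// weight equal to the number of samples currently held.
struct WeightedMovingAverage {
    let windowSize: Int
    private var samples: [CGPoint] = []

    init(windowSize: Int = 5) {
        precondition(windowSize > 0, "windowSize must be positive")
        self.windowSize = windowSize
    }

    /// Adds the newest location and returns the weighted average of the window.
    mutating func add(_ location: CGPoint) -> CGPoint {
        samples.append(location)
        // Keep only the most recent `windowSize` samples.
        if samples.count > windowSize { samples.removeFirst() }

        var weightedX: CGFloat = 0
        var weightedY: CGFloat = 0
        var totalWeight: CGFloat = 0
        for (index, sample) in samples.enumerated() {
            let weight = CGFloat(index + 1)  // later samples weigh more
            weightedX += weight * sample.x
            weightedY += weight * sample.y
            totalWeight += weight
        }
        return CGPoint(x: weightedX / totalWeight, y: weightedY / totalWeight)
    }
}
```

One filter instance per joint would be needed, for example stored in a dictionary keyed by joint name. Steeper weights or a shorter window trade smoothness for responsiveness.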