How to get object rect/coordinates from VNClassificationObservation (answers)

【Question title】: How to get object rect/coordinates from VNClassificationObservation
【Posted】: 2017-11-26 01:41:39
【Question description】:

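I'm having trouble getting an object's rect from VNClassificationObservation.

My goal is to recognize an object and display a popup with the object's name. I can get the name, but I can't get the object's coordinates or frame.

Here is the code: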

let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: requestOptions)
do {
    try handler.perform([classificationRequest, detectFaceRequest])
} catch {
    print(error)
}

Then I handle the results:

func handleClassification(request: VNRequest, error: Error?) {
    guard let observations = request.results as? [VNClassificationObservation] else {
        fatalError("unexpected result type from VNCoreMLRequest")
    }

    // Filter observations: keep at most the first 11 (prefix avoids an
    // out-of-range crash when fewer are returned), then drop low-confidence ones.
    let filteredObservations = observations.prefix(11).filter { $0.confidence > 0.1 }

    // Update UI
    DispatchQueue.main.async { [weak self] in
        for observation in filteredObservations {
            print("observation: ", observation.identifier)
            // HERE: I need to display popup with observation name
        }
    }
}

Update:

lazy var classificationRequest: VNCoreMLRequest = {

    // Load the ML model through its generated class and create a Vision request for it.
    do {
        let model = try VNCoreMLModel(for: Inceptionv3().model)
        let request = VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
        request.imageCropAndScaleOption = .centerCrop
        return request
    } catch {
        fatalError("can't load Vision ML model: \(error)")
    }
}()

【Question comments】:

Tags: ios swift image-recognition ios11 coreml

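【Solution 1】:

A pure classifier model can only answer "what is this a picture of?"; it cannot detect and locate objects in the picture. At the time of writing, all the free models on the Apple developer site (including Inception v3) are of this kind.

When Vision works with such a model, it identifies the model as a classifier based on the outputs declared in the MLModel file, and returns VNClassificationObservation objects as output.

If you find or create a model that is trained to both identify and locate objects, you can still use it with Vision. When you convert that model to Core ML format, the MLModel file will describe multiple outputs. When Vision works with a model that has multiple outputs, it returns an array of VNCoreMLFeatureValueObservation objects, one for each output of the model.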
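As a rough illustration of that last point, here is a minimal sketch of a completion handler for a non-classifier model. It has the same signature as handleClassification in the question; the names multiArrayOutputs and handleDetection are made up for this example, and it assumes the model's outputs are multi-arrays, which depends entirely on the model.

```swift
import CoreML
import Vision

/// Collects the multi-array outputs of a VNCoreMLRequest whose model is not
/// a classifier. Returns an empty array if the results have any other type.
/// (Hypothetical helper, not part of Vision.)
func multiArrayOutputs(from results: [Any]?) -> [MLMultiArray] {
    guard let observations = results as? [VNCoreMLFeatureValueObservation] else {
        return []
    }
    // Vision reports one observation per model output; keep the multi-array ones.
    return observations.compactMap { $0.featureValue.multiArrayValue }
}

/// Completion handler sketch with the same shape as handleClassification.
func handleDetection(request: VNRequest, error: Error?) {
    for array in multiArrayOutputs(from: request.results) {
        // Turning these raw numbers into labels and boxes is model-specific.
        print("output shape:", array.shape)
    }
}
```

A handler like this would be passed as the completionHandler of the VNCoreMLRequest, in place of handleClassification.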

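How the model declares its outputs determines which feature values represent what. A model that reports a classification and a bounding box could output a string and four doubles, or a string and a multi-array, and so on.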
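For the "string and four doubles" case, a sketch of turning such an output into a frame for the popup might look like the following. The Detection type and the frame(for:in:) function are hypothetical, and the box is assumed to be normalized to 0...1 with a top-left origin; a real model may use a different convention (pixel units, center-based boxes, a bottom-left origin), so check how its outputs are defined.

```swift
import Foundation

/// Hypothetical decoded detection: a label plus four doubles.
/// Assumption: the box is normalized to 0...1 with a top-left origin.
struct Detection {
    let label: String
    let x: Double
    let y: Double
    let width: Double
    let height: Double
}

/// Scales the normalized box into the coordinate space of a view of `size`.
func frame(for detection: Detection, in size: CGSize) -> CGRect {
    return CGRect(x: CGFloat(detection.x) * size.width,
                  y: CGFloat(detection.y) * size.height,
                  width: CGFloat(detection.width) * size.width,
                  height: CGFloat(detection.height) * size.height)
}

// Example values are invented for illustration.
let detection = Detection(label: "dog", x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let rect = frame(for: detection, in: CGSize(width: 400, height: 800))
print(detection.label, rect)
```

With these example values the box maps to a 200 × 200 rect whose origin is at (100, 400) in a 400 × 800 view.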

Addendum: here is a model that works on iOS 11 and returns VNCoreMLFeatureValueObservation: TinyYOLO

【Discussion】:

Can you recommend a specific model that gives VNCoreMLFeatureValueObservation results?

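【Solution 2】:

That's because classifiers do not return object coordinates or frames. A classifier only gives a probability distribution over a list of categories.

What model are you using here?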

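【Discussion】:

I'm using Inceptionv3().model; it looks like I can't get coordinates from it.
That's because Inception-v3 doesn't give you coordinates, only a dictionary of class names and the probabilities for those classes.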

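【Solution 3】:

To track and identify objects, you have to create your own model, for example with Darknet. I struggled with the same problem and used TuriCreate to train a model. Instead of just providing images to the framework, you also have to provide images annotated with bounding boxes. Apple has documented how to create these models here: Apple TuriCreate docs

【Discussion】: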