Yolov3 doesn't detect anything, but Yolov2 works fine

【Question title】: Yolov3 doesn't detect anything but Yolov2 works fine
【Posted】: 2019-01-24 10:07:46
【Question】:

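The goal of my application is to detect people. When I load the YOLOv2 weights and config, everything works. When I load the YOLOv3 weights and config, the net returns 0 for every class confidence on every bounding box. I tried both the regular YOLOv3-416 and YOLOv3-tiny from https://pjreddie.com/darknet/yolo/. As far as I know, the required input and output are the same for YOLOv2 and YOLOv3. Please help me find what I am doing wrong that keeps YOLOv3 from working. I am using OpenCV 4.01 with the Java wrapper, on CPU only. I searched for similar problems but did not find anything comparable.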

import java.util.ArrayList;
import java.util.List;

// Logger and LogManager are assumed to come from Log4j 2.
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.opencv.core.Mat;
import org.opencv.core.Rect;
import org.opencv.core.Scalar;
import org.opencv.core.Size;
import org.opencv.dnn.Dnn;
import org.opencv.dnn.Net;

// StopWatch is not shown in the question; presumably it is the asker's own timing helper.
public class YoloAnalizer {
    private Net net;
    private StopWatch stopWatch = new StopWatch();
    private Logger logger = LogManager.getLogger();

    // Minimum score (box confidence * best class confidence) for a box to be kept.
    private final double threshold = 0.5;
    // Scales 8-bit pixel values into [0, 1].
    private final double scaleFactor = 1.0 / 255.0;
    // Network input resolution.
    private final Size imageSize = new Size(416, 416);
    private final Scalar mean = new Scalar(0, 0, 0);
    // Swap the red and blue channels when building the blob.
    private final boolean swapRB = true;
    private final boolean crop = false;

    // The 80 class names, in the order of the class scores in each output row.
    private final String[] classes = new String[] {
        "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck",
        "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench",
        "bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra",
        "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
        "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove",
        "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup",
        "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange",
        "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
        "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse",
        "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink",
        "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
        "hair drier", "toothbrush"
    };

    public YoloAnalizer(String pathToYoloDarknetConfig, String pathToYoloDarknetWeights) {
        net = Dnn.readNetFromDarknet(pathToYoloDarknetConfig, pathToYoloDarknetWeights);
    }

    public List<Rect> AnalizeImage(Mat image) {
        logger.debug("Starting image analysis using yolo");
        stopWatch.StartTime();
        Mat blob = Dnn.blobFromImage(image, scaleFactor, imageSize, mean, swapRB, crop);
        net.setInput(blob);

        // forward() without arguments returns a single output blob.
        Mat prediction = net.forward();
        List<Rect> rects = ConvertPredictionToRoundingBox(prediction, image);
        logger.debug(String.format("Analysing frame took: %s", stopWatch.GetElapsedMiliseconds()));
        return rects;
    }

    private List<Rect> ConvertPredictionToRoundingBox(Mat prediction, Mat image) {
        List<Rect> listOfPredictedObjects = new ArrayList<>();
        for (int i = 0; i < prediction.size().height; i++) {
            // Row layout: x_center, y_center, width, height, box confidence, 80 class scores.
            float[] row = new float[85];
            prediction.get(i, 0, row);

            float confidenceOnBox = row[4];
            // Index into row (not into classes) of the highest class score.
            int predictedClassConfidence = getTableIndexWithMaxValue(row, 5);
            double score = confidenceOnBox * row[predictedClassConfidence];
            if (score > threshold) {
                // Box values are relative to the image size; convert to pixels.
                double x_center = row[0] * image.width();
                double y_center = row[1] * image.height();
                double width = row[2] * image.width();
                double height = row[3] * image.height();

                double left = x_center - width * 0.5;
                double top = y_center - height * 0.5;

                listOfPredictedObjects.add(new Rect((int) left, (int) top, (int) width, (int) height));
                logger.info(String.format("Found %s(%s) with confidence %s",
                        classes[predictedClassConfidence - 5], predictedClassConfidence, score));
            }
        }
        return listOfPredictedObjects;
    }

    // Returns the index of the largest element of array at or after startFrom.
    private int getTableIndexWithMaxValue(float[] array, int startFrom) {
        double maxValue = -1;
        int maxIndex = -1;
        for (int i = startFrom; i < array.length; i++) {
            if (maxValue < array[i]) {
                maxIndex = i;
                maxValue = array[i];
            }
        }
        return maxIndex;
    }
}
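
The row decoding done in ConvertPredictionToRoundingBox can be exercised without OpenCV, which may help separate "the decoding is wrong" from "the network output is not what I expect". Below is a minimal plain-Java sketch that follows the scoring rule from the question (box confidence times the best class score). `YoloRowDecoder`, `Detection` and `decode` are hypothetical names introduced here, not OpenCV API. It takes a list of output arrays instead of one. That is an assumption not confirmed anywhere in this thread: the YOLOv3 configs define more than one [yolo] layer, while `net.forward()` without arguments returns a single blob, so it may be worth looking at every output layer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of OpenCV. Rows are [cx, cy, w, h, boxConfidence, classScores...]. */
final class YoloRowDecoder {

    /** One decoded detection, in pixel coordinates of the original image. */
    static final class Detection {
        final int classId;
        final double score;
        final int left, top, width, height;

        Detection(int classId, double score, int left, int top, int width, int height) {
            this.classId = classId;
            this.score = score;
            this.left = left;
            this.top = top;
            this.width = width;
            this.height = height;
        }
    }

    /** Decodes the rows of every output array and keeps those scoring above threshold. */
    static List<Detection> decode(List<float[][]> outputs, int imageWidth, int imageHeight,
                                  double threshold) {
        List<Detection> detections = new ArrayList<>();
        for (float[][] rows : outputs) {          // one array per output layer (assumed)
            for (float[] row : rows) {
                // Find the index of the best class score; class scores start at index 5.
                int best = 5;
                for (int i = 6; i < row.length; i++) {
                    if (row[i] > row[best]) {
                        best = i;
                    }
                }
                // Same scoring rule as in the question.
                double score = row[4] * row[best];
                if (score <= threshold) {
                    continue;
                }
                // Box values are relative to the image size; convert to pixels.
                double w = row[2] * imageWidth;
                double h = row[3] * imageHeight;
                int left = (int) (row[0] * imageWidth - w / 2);
                int top = (int) (row[1] * imageHeight - h / 2);
                detections.add(new Detection(best - 5, score, left, top, (int) w, (int) h));
            }
        }
        return detections;
    }
}
```

To feed this from OpenCV you would copy each output Mat into a float[][] first. As far as I know, the Java bindings let you collect all outputs with `net.forward(outputs, net.getUnconnectedOutLayersNames())`, where `outputs` is a `List<Mat>`; please check that against the OpenCV version you actually use.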

【Question discussion】:

Tags: java opencv yolo

【Solution 1】:

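Here is what I found in v3:

In the function fill_truth_region:

The truth table is created in the format "1-classes-x-y-w-h", that is, each entry in the truth table holds 1 + number of classes + 4 values.

In the function forward_yolo_layer, however, reading the truth box seems to expect x, y, w, h at the start of an entry. With the layout above, x would be read as 1 whenever an entry exists, and part of the class values would be read as y, w, h.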

I guess that if you change this line in forward_yolo_layer:

box truth = float_to_box(net.truth + t * 5 + b * l.truths, 1);

to this:

box truth = float_to_box(net.truth + t * (5 + l.classes) + b * l.truths + l.classes + 1, 1);

then you will get a truth box with the correct x, y, w, h.
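
To make the index arithmetic concrete, here is a small Java sketch of the two offsets, taking the layout described in this answer at face value. It is only an illustration and not darknet code: `TruthOffsets` and its methods are hypothetical names, and the parameters stand in for `t`, `b`, `l.truths` and `l.classes` from the two lines above.

```java
/** Hypothetical illustration of the two offset formulas; not darknet code. */
final class TruthOffsets {

    /** Offset used by the original line: entries of 5 floats, box at the start of the entry. */
    static int originalOffset(int t, int b, int truths) {
        return t * 5 + b * truths;
    }

    /**
     * Offset used by the proposed line: entries of 5 + classes floats,
     * with the box placed after the leading 1 and the class slots.
     */
    static int proposedOffset(int t, int b, int truths, int classes) {
        return t * (5 + classes) + b * truths + classes + 1;
    }
}
```

With 80 classes and a made-up `truths` value of 7650, the first entry of the first image (t = 0, b = 0) is read from offset 0 by the original line and from offset 81 by the proposed one. For t = 1 the offsets are 5 and 166.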

【Discussion】: