How can I train the dlib shape predictor using a very large training set? (Answers)

【Question title】: How can I train dlib shape predictor using a very large training set
【Posted】: 2019-06-06 22:19:09
【Question description】:

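I'm trying to use the Python dlib.train_shape_predictor function to train on a very large set of images (about 50,000).

I've created an xml file containing the necessary data, but train_shape_predictor appears to load every referenced image into RAM before it starts training. The process gets killed because it uses more than 100 GB of RAM. Even a trimmed-down dataset uses more than 20 GB (the machine has only 16 GB of physical memory).

Is there a way to make train_shape_predictor load images on demand instead of all at once?

I'm using Python 3.7.2 and dlib 19.16.0, installed via pip, on macOS.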

【Question comments】:

【Solution 1】:

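I posted this as an issue on the dlib GitHub repository and got this reply from the author:

It's not reasonable to change the code to cycle back and forth between disk and RAM like that. It would make training very slow. Instead, you should buy more RAM or use smaller images.

By design, large training sets need a lot of RAM.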

【Comments】:

Have you tried this: medium.com/@Petuum/…
Or try setting the upsample_limit parameter to 0. From the documentation: "train_simple_object_detector() will upsample images if needed no more than upsample_limit times. Value 0 will forbid trainer to upsample any images. If trainer is unable to fit all boxes with required upsample_limit, exception will be thrown. Higher values of upsample_limit exponentially increases memory requirements. Values higher than 2 (default) are not recommended."
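@thachnb Thanks, but it looks like upsample_limit is only an option for train_simple_object_detector, not for train_shape_predictor. In my case, scaling down the images in the training set was quick and easy. It cut the memory requirement by about 70%, and I still got good results from training.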
70% is an impressive number! What were the image sizes before and after?
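@thachnb I used PIL's thumbnail method to scale every image whose width or height was greater than 1024 down to 1024 (and then scaled the landmark coordinates in the xml to match). Many of the images were at least twice that size. Since my application looks for facial landmarks in 720p video, the downscaling doesn't seem to have hurt accuracy.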
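
For anyone wanting to reproduce the downscaling step described in the comment above, here is a minimal sketch. It is not the commenter's actual script. It assumes the xml follows the usual imglab layout (image elements with a file attribute, containing box elements with top/left/width/height and part children with x/y), that image paths are relative to the xml file, and that Pillow is installed. The names scale_factor, scale_image_element and shrink_dataset are made up for illustration.

```python
import os
import xml.etree.ElementTree as ET

MAX_SIDE = 1024  # size limit mentioned in the comment above


def scale_factor(width, height, max_side=MAX_SIDE):
    """Shrink factor that fits the longer side into max_side (never above 1)."""
    return min(1.0, max_side / max(width, height))


def scale_image_element(image_el, factor):
    """Scale the box and landmark coordinates of one <image> element in place."""
    for box in image_el.iter("box"):
        for attr in ("top", "left", "width", "height"):
            box.set(attr, str(round(float(box.get(attr)) * factor)))
        for part in box.iter("part"):
            for attr in ("x", "y"):
                part.set(attr, str(round(float(part.get(attr)) * factor)))


def shrink_dataset(xml_in, xml_out, max_side=MAX_SIDE):
    """Shrink oversized images and write a matching xml file.

    WARNING: images are overwritten in place, so run this on a copy
    of the dataset. Assumes image paths are relative to xml_in.
    """
    from PIL import Image  # Pillow; imported lazily so the helpers above need only the stdlib

    tree = ET.parse(xml_in)
    base = os.path.dirname(os.path.abspath(xml_in))
    for image_el in tree.getroot().iter("image"):
        path = os.path.join(base, image_el.get("file"))
        with Image.open(path) as img:
            factor = scale_factor(img.width, img.height, max_side)
            if factor < 1.0:
                # thumbnail() keeps the aspect ratio and only ever shrinks
                img.thumbnail((max_side, max_side))
                img.save(path)
        if factor < 1.0:
            scale_image_element(image_el, factor)
    tree.write(xml_out)
```

Calling shrink_dataset("training.xml", "training_small.xml") on a copy of the data (both file names are placeholders) would then produce an xml file that can be passed to dlib.train_shape_predictor as before. Coordinates are rounded to whole pixels, so landmarks may shift by up to a pixel relative to the resized image.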