[Title]: Constructing a BatchNormalization layer
[Posted]: 2020-05-07 00:38:53
[Question]:

I am trying to set up a BatchNormalization layer in C++.

My code looks like this:

// ReadShape, ReadFloats, sizeOf, shape_, and ctx are helpers/state defined elsewhere.
mx::Symbol loadBatchNormalization(mx::Symbol previous, std::istream &file, const std::string &name, const Shape &inputShape, const Shape &outputShape, std::map<std::string, mx::NDArray> &args, bool tensorflow, bool debug)
{
    auto gammaShape_ = ReadShape(file);
    auto gamma_ = ReadFloats(file, sizeOf(gammaShape_));
    auto gammaShape = shape_(gammaShape_);
    mx::NDArray gamma { gamma_, gammaShape, ctx };

    auto betaShape_ = ReadShape(file);
    auto beta_ = ReadFloats(file, sizeOf(betaShape_));
    auto betaShape = shape_(betaShape_);
    mx::NDArray beta { beta_, betaShape, ctx };

    auto movingMeanShape_ = ReadShape(file);
    auto movingMean_ = ReadFloats(file, sizeOf(movingMeanShape_));
    auto movingMeanShape = shape_(movingMeanShape_);
    mx::NDArray movingMean { movingMean_, movingMeanShape, ctx };

    auto movingVarianceShape_ = ReadShape(file);
    auto movingVariance_ = ReadFloats(file, sizeOf(movingVarianceShape_));
    auto movingVarianceShape = shape_(movingVarianceShape_);
    mx::NDArray movingVariance { movingVariance_, movingVarianceShape, ctx };

    mx::Symbol gammaSymbol(name + "_gamma");
    mx::Symbol betaSymbol(name + "_beta");
    mx::Symbol movingMeanSymbol(name + "_movingMean");
    mx::Symbol movingVarianceSymbol(name + "_movingVariance");

    double eps = 0.001;
    mx_float momentum = 0.9; // should never be used?
    bool fix_gamma = false;
    bool use_global_stats = false;
    bool output_mean_var = false;
    int axis = 1;
    bool cudnn_off = false;

    mx::Symbol layer = mx::BatchNorm(
        name,
        previous,
        gammaSymbol,
        betaSymbol,
        movingMeanSymbol,
        movingVarianceSymbol,
        eps,
        momentum,
        fix_gamma,
        use_global_stats,
        output_mean_var,
        axis,
        cudnn_off
    );

    args[name + "_gamma"] = gamma;
    args[name + "_beta"] = beta;
    args[name + "_movingMean"] = movingMean;
    args[name + "_movingVariance"] = movingVariance;

    return layer;
}

In short, the function loads gamma, beta, movingMean, and movingVariance from file, creates a symbol for each, and builds a BatchNorm from those symbols.

However, the BatchNorm layer outputs all zeros, which makes me think something more is needed.

Can anyone give me a hint on how to construct a BatchNorm layer from previously trained weights?

[Discussion]:

    Tags: c++ mxnet batch-normalization batchnorm


    [Solution 1]:

    As of 2020-01-23, MXNet's BatchNorm does not appear to work correctly when built with the gamma, beta, movingMean, and movingVariance taken from a network trained with keras-mxnet.

    See the Keras source code for how its batchnorm performs prediction.

    A possible workaround is to apply the normalization manually:

    mx::Symbol generateBatchNormalization(const std::string &name, mx::Symbol &inputSymbol_, mx::Symbol &gammaSymbol, mx::Symbol &betaSymbol, mx::Symbol &movingMeanSymbol, mx::Symbol &movingVarianceSymbol)
    {
        // Inference-time batch normalization, computed explicitly:
        //   (input - movingMean) / sqrt(movingVariance + eps) * gamma + beta

        double epsilon = 0.0001;

        // Move the channel axis to the end so the 1-D per-channel
        // parameters broadcast against the feature map.
        auto inputSymbol = mx::SwapAxis(inputSymbol_, 1, 3);
        auto n0 = mx::broadcast_sub(inputSymbol, movingMeanSymbol);
        auto n1 = mx::sqrt(movingVarianceSymbol + epsilon);
        auto n2 = mx::broadcast_div(n0, n1);
        auto n3 = mx::broadcast_mul(n2, gammaSymbol);
        auto n4 = mx::broadcast_add(n3, betaSymbol);

        // Swap the channel axis back to its original position.
        auto normalization = mx::SwapAxis(n4, 1, 3);
        return normalization;
    }
    

    [Discussion]:
