【Title】: Custom Loss Function in Keras (R language)
【Posted】: 2018-06-13 07:29:16
【Question】:

I have to create a custom loss function in R with Keras. My loss function in plain R is trivial:

# na.trim() comes from the zoo package
lossTradingReturns = function(y_true, y_pred) {
  y_true_diff = na.trim(diff(log(y_true)))
  y_pred_diff = na.trim(diff(log(y_pred)))
  sum( -(sign(y_pred_diff) * y_true_diff) )
}
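A minimal NumPy sketch of the same computation (with hypothetical prices, just to illustrate the shapes) shows that the result is a single scalar:

```python
import numpy as np

# Hypothetical price series standing in for y_true / y_pred.
y_true = np.array([100.0, 101.0, 99.5, 102.0])
y_pred = np.array([100.5, 100.0, 100.5, 101.0])

# Log returns: np.diff(np.log(y)) drops one element,
# like na.trim(diff(log(y))) in R.
y_true_diff = np.diff(np.log(y_true))
y_pred_diff = np.diff(np.log(y_pred))

# Negative sum of sign-weighted true returns: a single scalar.
loss = np.sum(-(np.sign(y_pred_diff) * y_true_diff))
print(loss.shape)  # () -- scalar
```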

I translated it for Keras in R as follows:

lossTradingReturns = function(y_true, y_pred) {
   y_true_log = k_log(y_true)
   y_pred_log = k_log(y_pred)
   y_true_diff = y_true_log[2:batch_size] - y_true_log[1:(batch_size-1)]
   y_pred_diff = y_pred_log[2:batch_size] - y_pred_log[1:(batch_size-1)]
   y_true_diff = k_reshape(y_true_diff, (batch_size-1))
   y_pred_diff = k_reshape(y_pred_diff, (batch_size-1))
   return (k_sum( -(k_sign(y_pred_diff) * y_true_diff) ))
}

My function takes differences (y_t - y_t0), so I start with 1024 elements (batch_size) but in the end I only have 1023 elements to compute the returns from. The error message says it expects 1024, and I don't understand why: the function only has to return a scalar...

In any case, if I'm wrong and the output really does have to be a 1024-element tensor, how can I extend my 1023-element tensor by appending a zero value?
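If a full-length output were really required, the 1023-element difference could be padded back to batch_size elements; here is a NumPy sketch of the idea (in the Keras backend one would use the analogous concatenation of a zero tensor with the diff tensor):

```python
import numpy as np

batch_size = 1024
y = np.random.rand(batch_size) + 1.0   # hypothetical positive series
d = np.diff(np.log(y))                 # length batch_size - 1

# Prepend one zero so the result has batch_size elements again.
d_padded = np.concatenate([[0.0], d])
print(d_padded.shape)  # (1024,)
```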

Thanks in advance.

The error message at runtime:

Error in py_call_impl(callable, dots$args, dots$keywords) :
InvalidArgumentError: Input to reshape is a tensor with 1024 values, but the requested shape has 1023
[[Node: loss_19/dense_2_loss/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss_19/dense_2_loss/Sub, loss_19/dense_2_loss/Reshape_1/shape)]]

Caused by op u'loss_19/dense_2_loss/Reshape', defined at:
File "/home/peroni/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/models.py", line 863, in compile
**kwargs)
File "/home/peroni/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/engine/training.py", line 830, in compile
sample_weight, mask)

To clarify my strategy after the comments: the batch samples are contiguous by design, but I'm double-checking to make sure! (Thanks for the suggestion.) This is the function I use to select them (shuffle = FALSE). Perhaps you can confirm it for me.

# data — The original array of floating-point data, which you normalized in listing 6.32.
# lookback — How many timesteps back the input data should go.
# delay — How many timesteps in the future the target should be.
# min_index and max_index — Indices in the data array that delimit which timesteps to draw from. This is useful for keeping a segment of the data for validation and another for testing.
# shuffle — Whether to shuffle the samples or draw them in chronological order.
# batch_size — The number of samples per batch.
# step — The period, in timesteps, at which you sample data. You'll set it to 6 in order to draw one data point every hour.
generator = function(data, Y.column=1, lookback, delay, min_index, max_index, shuffle = FALSE, batch_size = 128, step = 6, is_test = FALSE) {
  if (is.null(max_index))
    max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {
    if (shuffle) {
      rows <- sample(c((min_index+lookback):max_index), size = batch_size)
    } else {
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i+batch_size-1, max_index))
      i <<- i + length(rows)
    }
    samples <- array(0, dim = c(length(rows), 
                                lookback / step,
                                dim(data)[[-1]]-1))
    targets <- array(0, dim = c(length(rows)))

    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback, rows[[j]]-1, 
                     length.out = dim(samples)[[2]])
      samples[j,,] <- data[indices, -Y.column]
      targets[[j]] <- data[rows[[j]] + delay, Y.column]
    }            

    if (!is_test)
      return (list(samples, targets))
    else
      return (list(samples))
  }
}
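The point of shuffle = FALSE is that the targets inside a batch come from consecutive timesteps; a small NumPy sketch of that indexing (simplified, with hypothetical values) illustrates why in-batch differencing is then meaningful:

```python
import numpy as np

data = np.arange(100.0)   # stand-in for the target column
lookback, delay, batch_size = 10, 0, 4
i = 0 + lookback          # min_index + lookback, as in the generator

# rows are consecutive indices, so the targets are consecutive timesteps.
rows = np.arange(i, i + batch_size)
targets = data[rows + delay]
print(targets)  # [10. 11. 12. 13.]
```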

I had previously evaluated the hypothesis of differencing the signals beforehand (to make them stationary), but that would radically change my NN's working strategy, and it led to very poor performance...

【Comments】:

    Tags: python r tensorflow keras finance


    【Solution 1】:

    The error occurs in k_reshape. However, your loss function doesn't need this reshape step, because k_sum sums over all the elements of the tensor if you leave axis = NULL.

    The following loss function works fine for me:

    lossTradingReturns = function(y_true, y_pred) {
       y_true_log = k_log(y_true)
       y_pred_log = k_log(y_pred)
       y_true_diff = y_true_log[2:batch_size] - y_true_log[1:(batch_size-1)]
       y_pred_diff = y_pred_log[2:batch_size] - y_pred_log[1:(batch_size-1)]
       return (k_sum( -(k_sign(y_pred_diff) * y_true_diff) ))
    }
    

    However, this loss function looks odd to me. The dataset is shuffled during training, so the mini-batches are not the same from epoch to epoch. Taking differences of y inside the loss function therefore makes no sense, because the observations being subtracted are completely random. Shouldn't you difference the y variable over the whole dataset once, right before training the model?
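    The objection can be illustrated with a NumPy sketch (hypothetical data): on a smooth series, ordered differences are small and meaningful, while differences taken after shuffling subtract arbitrary, unrelated pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.arange(10.0) * 0.1          # smooth hypothetical series: constant step 0.1

ordered = np.diff(y)               # meaningful: every difference is exactly 0.1
shuffled = np.diff(rng.permutation(y))  # differences of arbitrary, unrelated pairs

print(ordered)
print(shuffled)
```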

    【Comments】:

    • Pierre, thanks for your answer. Unfortunately, the function you suggest doesn't work. It returns the following error: Error in py_call_impl(callable, dots$args, dots$keywords): InvalidArgumentError: Incompatible shapes: [1023,1] vs. [1024,1] [[Node: loss/dense_1_loss/Mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss/dense_1_loss/Sign, loss/dense_1_loss/Sub)]] Caused by op u'loss/dense_1_loss/Mul', defined at: File "/home/peroni/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/models.py", line 863, in compile **kwargs)
    • To clarify my strategy: the batch samples are contiguous by design (see my edited original post), but I'm double-checking to make sure! (Thanks for the suggestion.) I had previously evaluated the hypothesis of differencing the signals (to make them stationary), but that radically changes my neural network's working strategy, and it led to very poor performance...