【Title】: Custom Loss Function in Keras (R language)
【Posted】: 2018-06-13 07:29:16
【Question】:

I have to create a custom loss function in R with Keras. My loss function in plain R is trivial:

# na.trim() comes from the zoo package
lossTradingReturns = function(y_true, y_pred) {
  y_true_diff = na.trim(diff(log(y_true)))
  y_pred_diff = na.trim(diff(log(y_pred)))
  sum( -(sign(y_pred_diff) * y_true_diff) )
}
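A minimal NumPy sketch of the same computation (with hypothetical prices, just to illustrate the shapes) shows that the result is a single scalar:

```python
import numpy as np

# Hypothetical price series standing in for y_true / y_pred.
y_true = np.array([100.0, 101.0, 99.5, 102.0])
y_pred = np.array([100.5, 100.0, 100.5, 101.0])

# Log returns: np.diff(np.log(y)) drops one element,
# like na.trim(diff(log(y))) in R.
y_true_diff = np.diff(np.log(y_true))
y_pred_diff = np.diff(np.log(y_pred))

# Negative sum of sign-weighted true returns: a single scalar.
loss = np.sum(-(np.sign(y_pred_diff) * y_true_diff))
print(loss.shape)  # () -- scalar
```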

I translated it for Keras in R as follows:

lossTradingReturns = function(y_true, y_pred) {
   y_true_log = k_log(y_true)
   y_pred_log = k_log(y_pred)
   y_true_diff = y_true_log[2:batch_size] - y_true_log[1:(batch_size-1)]
   y_pred_diff = y_pred_log[2:batch_size] - y_pred_log[1:(batch_size-1)]
   y_true_diff = k_reshape(y_true_diff, (batch_size-1))
   y_pred_diff = k_reshape(y_pred_diff, (batch_size-1))
   return (k_sum( -(k_sign(y_pred_diff) * y_true_diff) ))
}

My function takes differences (y_t - y_t0), so I start with 1024 elements (batch_size) but in the end I only have 1023 elements to compute the returns from. The error message says it expects 1024, and I don't understand why: the function only has to return a scalar...

In any case, if I'm wrong and the output really does have to be a 1024-element tensor, how can I extend my 1023-element tensor by appending a zero value?
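If a full-length output were really required, the 1023-element difference could be padded back to batch_size elements; here is a NumPy sketch of the idea (in the Keras backend one would use the analogous concatenation of a zero tensor with the diff tensor):

```python
import numpy as np

batch_size = 1024
y = np.random.rand(batch_size) + 1.0   # hypothetical positive series
d = np.diff(np.log(y))                 # length batch_size - 1

# Prepend one zero so the result has batch_size elements again.
d_padded = np.concatenate([[0.0], d])
print(d_padded.shape)  # (1024,)
```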

Thanks in advance.

The error message at runtime:

Error in py_call_impl(callable, dots$args, dots$keywords) :
InvalidArgumentError: Input to reshape is a tensor with 1024 values, but the requested shape has 1023
[[Node: loss_19/dense_2_loss/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss_19/dense_2_loss/Sub, loss_19/dense_2_loss/Reshape_1/shape)]]

Caused by op u'loss_19/dense_2_loss/Reshape', defined at:
File "/home/peroni/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/models.py", line 863, in compile
**kwargs)
File "/home/peroni/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/engine/training.py", line 830, in compile
sample_weight, mask)

To clarify my strategy after the comments: the batch samples are contiguous by design, but I'm double-checking to make sure! (Thanks for the suggestion.) This is the function I use to select them (shuffle = FALSE). Perhaps you can confirm it for me.

# data — The original array of floating-point data, which you normalized in listing 6.32.
# lookback — How many timesteps back the input data should go.
# delay — How many timesteps in the future the target should be.
# min_index and max_index — Indices in the data array that delimit which timesteps to draw from. This is useful for keeping a segment of the data for validation and another for testing.
# shuffle — Whether to shuffle the samples or draw them in chronological order.
# batch_size — The number of samples per batch.
# step — The period, in timesteps, at which you sample data. You'll set it to 6 in order to draw one data point every hour.
generator = function(data, Y.column=1, lookback, delay, min_index, max_index, shuffle = FALSE, batch_size = 128, step = 6, is_test = FALSE) {
  if (is.null(max_index))
    max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {
    if (shuffle) {
      rows <- sample(c((min_index+lookback):max_index), size = batch_size)
    } else {
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i+batch_size-1, max_index))
      i <<- i + length(rows)
    }
    samples <- array(0, dim = c(length(rows), 
                                lookback / step,
                                dim(data)[[-1]]-1))
    targets <- array(0, dim = c(length(rows)))

    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback, rows[[j]]-1, 
                     length.out = dim(samples)[[2]])
      samples[j,,] <- data[indices, -Y.column]
      targets[[j]] <- data[rows[[j]] + delay, Y.column]
    }            

    if (!is_test)
      return (list(samples, targets))
    else
      return (list(samples))
  }
}
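The point of shuffle = FALSE is that the targets inside a batch come from consecutive timesteps; a small NumPy sketch of that indexing (simplified, with hypothetical values) illustrates why in-batch differencing is then meaningful:

```python
import numpy as np

data = np.arange(100.0)   # stand-in for the target column
lookback, delay, batch_size = 10, 0, 4
i = 0 + lookback          # min_index + lookback, as in the generator

# rows are consecutive indices, so the targets are consecutive timesteps.
rows = np.arange(i, i + batch_size)
targets = data[rows + delay]
print(targets)  # [10. 11. 12. 13.]
```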

I had previously evaluated the hypothesis of differencing the signals beforehand (to make them stationary), but that would radically change my NN's working strategy, and it led to very poor performance...

【Comments】:

    Tags: python r tensorflow keras finance


    【Solution 1】:

    The error occurs in k_reshape. However, your loss function doesn't need this reshape step, because k_sum sums over all the elements of the tensor if you leave axis = NULL.

    The following loss function works fine for me:

    lossTradingReturns = function(y_true, y_pred) {
       y_true_log = k_log(y_true)
       y_pred_log = k_log(y_pred)
       y_true_diff = y_true_log[2:batch_size] - y_true_log[1:(batch_size-1)]
       y_pred_diff = y_pred_log[2:batch_size] - y_pred_log[1:(batch_size-1)]
       return (k_sum( -(k_sign(y_pred_diff) * y_true_diff) ))
    }
    

    However, this loss function looks odd to me. The dataset is shuffled during training, so the mini-batches are not the same from epoch to epoch. Taking differences of y inside the loss function therefore makes no sense, because the observations being subtracted are completely random. Shouldn't you difference the y variable over the whole dataset once, right before training the model?
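    The objection can be illustrated with a NumPy sketch (hypothetical data): on a smooth series, ordered differences are small and meaningful, while differences taken after shuffling subtract arbitrary, unrelated pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.arange(10.0) * 0.1          # smooth hypothetical series: constant step 0.1

ordered = np.diff(y)               # meaningful: every difference is exactly 0.1
shuffled = np.diff(rng.permutation(y))  # differences of arbitrary, unrelated pairs

print(ordered)
print(shuffled)
```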

    【Comments】:

    • Pierre, thanks for your answer. Unfortunately, the function you suggest doesn't work. It returns the following error: Error in py_call_impl(callable, dots$args, dots$keywords): InvalidArgumentError: Incompatible shapes: [1023,1] vs. [1024,1] [[Node: loss/dense_1_loss/Mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss/dense_1_loss/Sign, loss/dense_1_loss/Sub)]] Caused by op u'loss/dense_1_loss/Mul', defined at: File "/home/peroni/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/keras/models.py", line 863, in compile **kwargs)
    • To clarify my strategy: the batch samples are contiguous by design (see my edited original post), but I'm double-checking to make sure! (Thanks for the suggestion.) I had previously evaluated the hypothesis of differencing the signals (to make them stationary), but that radically changes my neural network's working strategy, and it led to very poor performance...