[Title]: R keras implementation: CNN multi-task outputs must sum to 100
[Posted]: 2018-09-27 01:36:10
[Question]:

I have built a model in R using Keras. I want to do multi-task regression with some shared layers, followed by a separate stack of fully connected layers for each task, ending in a dense layer of size 1 that produces that task's final prediction.

Now suppose I have three outputs Y1, Y2, Y3. I want outputs Y1 and Y2 to sum to 100, while each output keeps its own loss function (I want to apply per-observation weights).

I have built the model and it works fine without the constraint sum(Y1 + Y2) = 100, but I cannot make it work with the constraint. I tried using a softmax layer, but each output just returns 1.

I provide a diagram and some sample code. This is really an implementation question, since I think it should be possible (and probably easy with softmax).

# shared convolutional base (functional API; the sequential constructor is not needed)
input <- layer_input(shape = c(3, 6, 6))

base.model <- input %>%
   layer_conv_2d(filter = 64, kernel_size = c(3,3), input_shape = c(NULL, 3,6,6), padding='same',data_format='channels_first' ) %>%
   layer_activation("relu") %>%
   layer_max_pooling_2d(pool_size = c(2,2)) %>%
   layer_conv_2d(filter = 20, kernel_size = c(2,2), padding = "same", activation = "relu") %>%
   layer_dropout(0.4) %>%
   layer_flatten()

# add outputs
Y1 <- base.model %>% 
   layer_dense(units = 40) %>%
   layer_dropout(rate = 0.3) %>% 
   layer_dense(units = 50) %>%
   layer_dropout(rate = 0.3) %>% 
   layer_dense(units = 1, name="Y1")

# add outputs
Y2 <- base.model %>% 
   layer_dense(units = 40) %>%
   layer_dropout(rate = 0.3) %>% 
   layer_dense(units = 50) %>%
   layer_dropout(rate = 0.3) %>% 
   layer_dense(units = 1, name="Y2")

# add outputs
Y3 <- base.model %>% 
   layer_dense(units = 40) %>%
   layer_dropout(rate = 0.3) %>% 
   layer_dense(units = 50) %>%
   layer_dropout(rate = 0.3) %>% 
   layer_dense(units = 1, name="Y3")

base.model <- keras_model(input,list(Y1,Y2,Y3)) %>%
compile(
  loss = "mean_squared_error",
  optimizer = 'adam',
  loss_weights=list(Y1=1.0, Y2=1.0, Y3=1.0)
)

history <- base.model %>% fit(
x = Xtrain, 
y = list(Y1 = Ytrain.y1, Y2 = Ytrain.y2, Y3 = Ytrain.y3),
epochs = 500, batch_size = 500,
sample_weight = list(Y1 = data$weigth.y1[sp_new], Y2 = data$weigth.y2[sp_new], Y3 = data$weigth.y3[sp_new]),
validation_split = 0.2)

The general idea is summarized in this diagram: https://www.dropbox.com/s/ueclq42of46ifig/graph%20CNN.JPG?dl=0

Now, if I try to use a softmax layer, I do the following:

soft.l <- layer_dense(units = 1, activation = 'softmax')

Y11 <- Y1 %>% soft.l %>% layer_dense(units = 1, name="Y11", trainable = T)
Y22 <- Y2 %>% soft.l %>% layer_dense(units = 1, name="Y22", trainable = T)

Then the model becomes:

base.model <- keras_model(input,list(Y11,Y22,Y3)) %>%
compile(
  loss = "mean_squared_error",
  optimizer = 'adam',
  loss_weights=list(Y11=1.0, Y22=1.0, Y3=1.0)
)

history <- base.model %>% fit(
x = Xtrain, 
y = list(Y11 = Ytrain.y1, Y22 = Ytrain.y2, Y3 = Ytrain.y3),
epochs = 500, batch_size = 500,
sample_weight = list(Y11 = data$weigth.y1[sp_new], Y22 = data$weigth.y2[sp_new], Y3 = data$weigth.y3[sp_new]),
validation_split = 0.2)

(base.model %>% predict(Xtest))[[1]] + (base.model %>% predict(Xtest))[[2]] 

The problem is that the predicted sum(Y11 + Y22) does not equal 1. What am I doing wrong?

[Discussion]:

    Tags: r tensorflow keras deep-learning


    [Solution 1]:

    I am sharing the answer since it may genuinely help others. The solution is easy using a concatenate layer followed by a softmax activation, which makes the outputs of that combined layer sum to 1:

    # same first part as before (functional API; the sequential constructor is not needed)
    input <- layer_input(shape = c(3, 6, 6))
    
    base.model <- input %>%
        layer_conv_2d(filter = 64, kernel_size = c(3,3), input_shape = c(NULL, 3,6,6), padding='same', data_format='channels_first' ) %>%
        layer_activation("relu") %>%
        layer_max_pooling_2d(pool_size = c(2,2)) %>%
        layer_conv_2d(filter = 20, kernel_size = c(2,2), padding = "same", activation = "relu") %>%
        layer_dropout(0.4) %>%
        layer_flatten()
    
    # add outputs
    Y1 <- base.model %>% 
        layer_dense(units = 40) %>%
        layer_dropout(rate = 0.3) %>% 
        layer_dense(units = 50) %>%
        layer_dropout(rate = 0.3) %>% 
        layer_dense(units = 1, name="Y1")
    
    # add outputs
    Y2 <- base.model %>% 
        layer_dense(units = 40) %>%
        layer_dropout(rate = 0.3) %>% 
        layer_dense(units = 50) %>%
        layer_dropout(rate = 0.3) %>% 
        layer_dense(units = 1, name="Y2")
    
    # add outputs
    Y3 <- base.model %>% 
        layer_dense(units = 40) %>%
        layer_dropout(rate = 0.3) %>% 
        layer_dense(units = 50) %>%
        layer_dropout(rate = 0.3) %>% 
        layer_dense(units = 1, name="Y3")
    
    ## NEW
    # add a layer that brings together Y1 and Y2
    combined <- layer_concatenate(c(Y1, Y2)) %>% layer_activation_softmax(name= 'combined')
    
    
    base.model <- keras_model(input,list(combined,Y3)) %>% compile(
        loss = "mean_squared_error",
        optimizer = 'adam',
        loss_weights = list(combined = 1.0, Y3 = 1.0)  # one scalar weight per output
    )
    
    history <- base.model %>% fit(
        x = Xtrain, 
        y = list(combined = cbind(Ytrain.y1, Ytrain.y2), Y3 = Ytrain.y3),
        epochs = 500, batch_size = 500,
        # Keras expects one weight per sample for each output, so the combined
        # head gets a single weight vector (here the Y1 weights are used)
        sample_weight = list(combined = data$weigth.y1[sp_new],
                             Y3 = data$weigth.y3[sp_new]),
        validation_split = 0.2)
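    Note that the softmax head only guarantees that the two combined outputs sum to 1, not 100. A minimal base-R sketch of the rescaling idea (the `softmax2` helper below is illustrative, not part of Keras): train on proportions (e.g. `Ytrain.y1 / 100`) and multiply the predicted pair by 100 afterwards.

    ```r
    # The softmax head returns proportions that sum to 1; multiplying by 100
    # recovers the original constraint sum(Y1 + Y2) = 100.
    softmax2 <- function(x) exp(x) / sum(exp(x))  # plain-R softmax for a pair

    raw  <- c(2.3, -0.7)   # arbitrary pre-activation outputs for (Y1, Y2)
    prop <- softmax2(raw)  # proportions: sum(prop) == 1
    y12  <- prop * 100     # rescaled pair: sum(y12) == 100
    stopifnot(abs(sum(y12) - 100) < 1e-9)
    ```

    The same rescaling applies to the fitted model: take the first element of `predict()` (the softmax pair) and multiply it by 100.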
    

    [Discussion]:
