【Question title】: Reshaping a 2D matrix to a 3D matrix with lag for Keras
【Posted】: 2020-06-02 13:52:28
【Question】:

I am trying to build an LSTM in Keras, but I can't get the input data into the right shape.

Consider 25 observations of 3 features:

x <- 1:25
y <- seq(100, 2500, by = 100)
z <- seq(1000, 25000, by = 1000)

my.matrix <- data.matrix(data.frame(x, y, z))
str(my.matrix)

This gives:

> str(my.matrix)
 num [1:25, 1:3] 1 2 3 4 5 6 7 8 9 10 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:3] "x" "y" "z"

And:

> my.matrix
       x    y     z
 [1,]  1  100  1000
 [2,]  2  200  2000
 [3,]  3  300  3000
 [4,]  4  400  4000
 [5,]  5  500  5000
 [6,]  6  600  6000
 [7,]  7  700  7000
 [8,]  8  800  8000
 [9,]  9  900  9000
[10,] 10 1000 10000
[11,] 11 1100 11000
[12,] 12 1200 12000
[13,] 13 1300 13000
[14,] 14 1400 14000
[15,] 15 1500 15000
[16,] 16 1600 16000
[17,] 17 1700 17000
[18,] 18 1800 18000
[19,] 19 1900 19000
[20,] 20 2000 20000
[21,] 21 2100 21000
[22,] 22 2200 22000
[23,] 23 2300 23000
[24,] 24 2400 24000
[25,] 25 2500 25000

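Now I need to create a 3D array with dimensions [nb.observations, window.width, features]. In my case that would be [25, 5, 3], where window.width = 5 is the width of the rolling window over the observations.

Edit: Actually, the dimensions should be [21, 5, 3] because of the rolling window width: 25 - 5 + 1 = 21 windows (for example, the last sample for the x feature will be [21, 22, 23, 24, 25]).

Here is what I tried: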

window.width <- 5
tmp <- NULL
for(i in 1:(dim(my.matrix)[1] - window.width + 1)) {
  s <- i - 1 + window.width
  tmp <- rbind(tmp, my.matrix[i:s,])
}

This gives:

> head(tmp, 10)
      x   y    z
 [1,] 1 100 1000
 [2,] 2 200 2000
 [3,] 3 300 3000
 [4,] 4 400 4000
 [5,] 5 500 5000
 [6,] 2 200 2000
 [7,] 3 300 3000
 [8,] 4 400 4000
 [9,] 5 500 5000
[10,] 6 600 6000

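This is exactly what I want. Looking at the x feature, the first window runs from 1 to 5, the second from 2 to 6, and so on. The same holds for all features.

Now I need to reshape the tmp matrix: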

result <- array(tmp, dim=c(dim(my.matrix)[1] - window.width + 1, window.width, dim(my.matrix)[2]))

But this doesn't work:

> result[1, ,1]
[1]  1  6 11 16 21

I expected:

> result[1, ,1]
[1] 1 2 3 4 5


> result[2, ,1]
[1] 2 3 4 5 6
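The 1 6 11 16 21 output is consistent with how `array()` fills its result: values are laid out in column-major order, so the first dimension varies fastest. A minimal sketch on a toy vector (the names `v` and `a` are illustrative only, not from the question):

```r
# array() consumes its data in order and fills the FIRST dimension fastest,
# then the second, then the third (column-major order).
v <- 1:12
a <- array(v, dim = c(2, 3, 2))

a[, 1, 1]  # walks the first dimension: consecutive input values
a[1, , 1]  # walks the second dimension: every 2nd input value
a[1, 1, ]  # walks the third dimension: every 6th input value
```

Because consecutive rows of tmp belong to the same window, the position inside the window is the fastest-varying index of the data, so it cannot be mapped directly onto the second dimension of a [21, 5, 3] array.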

I also tried replacing the for loop with the lag function, but that didn't work either:

result <- array(data = lag(my.matrix, window.width)[-(1:window.width), ], dim = c(dim(my.matrix)[1] - window.width, window.width, 3))

> result[1, ,1]
[1]    1  100 1000    1  100

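1) What am I doing wrong, and how can I get the expected result?

2) Also, the for loop doesn't seem to scale well. It does what I want, but with more data it becomes very slow (I tried 150,000 observations and 23 features). Is there a faster alternative?

Edit: Actually, the for loop almost works if I use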

result <- array(tmp, dim=c(5, 21, 3))

The values are correct, but the dimensions are mixed up...

> result
, , 1

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21]
[1,]    1    2    3    4    5    6    7    8    9    10    11    12    13    14    15    16    17    18    19    20    21
[2,]    2    3    4    5    6    7    8    9   10    11    12    13    14    15    16    17    18    19    20    21    22
[3,]    3    4    5    6    7    8    9   10   11    12    13    14    15    16    17    18    19    20    21    22    23
[4,]    4    5    6    7    8    9   10   11   12    13    14    15    16    17    18    19    20    21    22    23    24
[5,]    5    6    7    8    9   10   11   12   13    14    15    16    17    18    19    20    21    22    23    24    25

, , 2

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21]
[1,]  100  200  300  400  500  600  700  800  900  1000  1100  1200  1300  1400  1500  1600  1700  1800  1900  2000  2100
[2,]  200  300  400  500  600  700  800  900 1000  1100  1200  1300  1400  1500  1600  1700  1800  1900  2000  2100  2200
[3,]  300  400  500  600  700  800  900 1000 1100  1200  1300  1400  1500  1600  1700  1800  1900  2000  2100  2200  2300
[4,]  400  500  600  700  800  900 1000 1100 1200  1300  1400  1500  1600  1700  1800  1900  2000  2100  2200  2300  2400
[5,]  500  600  700  800  900 1000 1100 1200 1300  1400  1500  1600  1700  1800  1900  2000  2100  2200  2300  2400  2500

, , 3

     [,1] [,2] [,3] [,4] [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21]
[1,] 1000 2000 3000 4000 5000  6000  7000  8000  9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000
[2,] 2000 3000 4000 5000 6000  7000  8000  9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000 22000
[3,] 3000 4000 5000 6000 7000  8000  9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000 22000 23000
[4,] 4000 5000 6000 7000 8000  9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000 22000 23000 24000
[5,] 5000 6000 7000 8000 9000 10000 11000 12000 13000 14000 15000 16000 17000 18000 19000 20000 21000 22000 23000 24000 25000

How can I swap the dimensions?

【Comments】:

    Tags: r keras lstm
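One way to swap dimensions in base R is `aperm()`, which permutes the dimensions of an array. The sketch below is a hedged illustration, not part of the original post: it rebuilds tmp with `lapply()` and a single `rbind()` call instead of growing it inside the loop, and the names `n.windows` and `by.position` are introduced here for readability.

```r
x <- 1:25
y <- seq(100, 2500, by = 100)
z <- seq(1000, 25000, by = 1000)
my.matrix <- cbind(x, y, z)

window.width <- 5
n.windows <- nrow(my.matrix) - window.width + 1  # 25 - 5 + 1 = 21

# Stack the windows on top of each other, as in the question,
# but collect them in a list and bind once.
tmp <- do.call(rbind, lapply(seq_len(n.windows), function(i) {
  my.matrix[i:(i + window.width - 1), ]
}))

# Fill as [position in window, window, feature], which matches the
# order in which the values appear in tmp ...
by.position <- array(tmp, dim = c(window.width, n.windows, ncol(my.matrix)))

# ... then swap the first two dimensions to get [window, position, feature].
result <- aperm(by.position, c(2, 1, 3))

dim(result)     # 21 5 3
result[1, , 1]  # 1 2 3 4 5
result[2, , 1]  # 2 3 4 5 6
```

The permutation vector `c(2, 1, 3)` says that the new first dimension is the old second one, the new second is the old first, and the third stays in place.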


    【Solution 1】:

    I managed to get it working, but this may not be the R way...

    x <- 1:25
    y <- seq(100, 2500, by = 100)
    z <- seq(1000, 25000, by = 1000)
    
    # cbind() already returns a numeric matrix with column names x, y, z
    my.matrix <- cbind(x, y, z)
    str(my.matrix)
    
    window.width <- 5
    
    result <- array(data = NA_real_, dim = c(dim(my.matrix)[1] - window.width + 1, window.width, dim(my.matrix)[2]))
    # Loop over features
    for (k in 1:dim(my.matrix)[2]) {
      # Loop over window
      for (j in 1:window.width) {
        # Loop over observations
        for(i in 1:(dim(my.matrix)[1] - window.width + 1)) {
          result[i, j, k] <- my.matrix[i - 1 + j, k]
        }
      }
    }
    

    If anyone finds a better, more efficient way, I would appreciate it. In the meantime, this works:

    > result
    , , 1
    
          [,1] [,2] [,3] [,4] [,5]
     [1,]    1    2    3    4    5
     [2,]    2    3    4    5    6
     [3,]    3    4    5    6    7
     [4,]    4    5    6    7    8
     [5,]    5    6    7    8    9
     [6,]    6    7    8    9   10
     [7,]    7    8    9   10   11
     [8,]    8    9   10   11   12
     [9,]    9   10   11   12   13
    [10,]   10   11   12   13   14
    [11,]   11   12   13   14   15
    [12,]   12   13   14   15   16
    [13,]   13   14   15   16   17
    [14,]   14   15   16   17   18
    [15,]   15   16   17   18   19
    [16,]   16   17   18   19   20
    [17,]   17   18   19   20   21
    [18,]   18   19   20   21   22
    [19,]   19   20   21   22   23
    [20,]   20   21   22   23   24
    [21,]   21   22   23   24   25
    
    , , 2
    
          [,1] [,2] [,3] [,4] [,5]
     [1,]  100  200  300  400  500
     [2,]  200  300  400  500  600
     [3,]  300  400  500  600  700
     [4,]  400  500  600  700  800
     [5,]  500  600  700  800  900
     [6,]  600  700  800  900 1000
     [7,]  700  800  900 1000 1100
     [8,]  800  900 1000 1100 1200
     [9,]  900 1000 1100 1200 1300
    [10,] 1000 1100 1200 1300 1400
    [11,] 1100 1200 1300 1400 1500
    [12,] 1200 1300 1400 1500 1600
    [13,] 1300 1400 1500 1600 1700
    [14,] 1400 1500 1600 1700 1800
    [15,] 1500 1600 1700 1800 1900
    [16,] 1600 1700 1800 1900 2000
    [17,] 1700 1800 1900 2000 2100
    [18,] 1800 1900 2000 2100 2200
    [19,] 1900 2000 2100 2200 2300
    [20,] 2000 2100 2200 2300 2400
    [21,] 2100 2200 2300 2400 2500
    
    , , 3
    
           [,1]  [,2]  [,3]  [,4]  [,5]
     [1,]  1000  2000  3000  4000  5000
     [2,]  2000  3000  4000  5000  6000
     [3,]  3000  4000  5000  6000  7000
     [4,]  4000  5000  6000  7000  8000
     [5,]  5000  6000  7000  8000  9000
     [6,]  6000  7000  8000  9000 10000
     [7,]  7000  8000  9000 10000 11000
     [8,]  8000  9000 10000 11000 12000
     [9,]  9000 10000 11000 12000 13000
    [10,] 10000 11000 12000 13000 14000
    [11,] 11000 12000 13000 14000 15000
    [12,] 12000 13000 14000 15000 16000
    [13,] 13000 14000 15000 16000 17000
    [14,] 14000 15000 16000 17000 18000
    [15,] 15000 16000 17000 18000 19000
    [16,] 16000 17000 18000 19000 20000
    [17,] 17000 18000 19000 20000 21000
    [18,] 18000 19000 20000 21000 22000
    [19,] 19000 20000 21000 22000 23000
    [20,] 20000 21000 22000 23000 24000
    [21,] 21000 22000 23000 24000 25000
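As a possible faster alternative to the triple loop, the same array can be built with one vectorized row lookup. This is a sketch under the assumption that the whole result fits in memory; `n.windows` and `idx` are names introduced here, and no timing is claimed for the 150,000 x 23 case.

```r
x <- 1:25
y <- seq(100, 2500, by = 100)
z <- seq(1000, 25000, by = 1000)
my.matrix <- cbind(x, y, z)

window.width <- 5
n.windows <- nrow(my.matrix) - window.width + 1

# idx[i, j] = i + j - 1 is the row of my.matrix that sits at
# position j of window i (the same index the triple loop uses).
idx <- outer(seq_len(n.windows), seq_len(window.width) - 1, `+`)

# as.vector(idx) walks idx column by column, so the selected rows arrive
# in exactly the order array() needs to fill [window, position, feature].
result <- array(my.matrix[as.vector(idx), ],
                dim = c(n.windows, window.width, ncol(my.matrix)))

result[1, , 1]   # 1 2 3 4 5
result[21, , 2]  # 2100 2200 2300 2400 2500
```

The design choice is to compute every source row index up front and let R's vectorized subsetting do the copying, so no R-level loop runs over observations.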
    

    【Comments】:
