【Question Title】: Connecting an embedding layer of size (3, 50) to an LSTM
【Posted】: 2021-12-27 22:46:11
【Question Description】:

How do I connect an embedding layer of size (3, 50) to an LSTM?

An array of shape (3, 50) is fed to the input "layer_i_emb"; it holds three time steps, each an array of length 50 storing product identifiers.

I tried reshaping it before connecting, but that didn't work either. The Embedding layer adds an extra dimension, and the LSTM can't accept that extra dimension. It seems you have to convert everything to raw tf tensors and manipulate them by hand, which is awful.

layer_i_inp = Input(shape = (3,50), name = 'item')
layer_i_emb = Embedding(output_dim = EMBEDDING_DIM*2,
                        input_dim = us_it_count[0]+1,
                        input_length = (3,50),
                        name = 'item_embedding')(layer_i_inp) 

layer_i_emb = Reshape([3,50, EMBEDDING_DIM*2])(layer_i_emb)

layer_i_emb = LSTM(MAX_FEATURES, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(layer_i_emb)
layer_i_emb = LSTM(MAX_FEATURES, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(layer_i_emb)
layer_i_emb = LSTM(MAX_FEATURES, dropout = 0.4, recurrent_dropout = 0.4)(layer_i_emb)

layer_i_emb = Flatten()(layer_i_emb)

【Question Comments】:

    Tags: python tensorflow keras recurrent-neural-network embedding


    【Solution 1】:

    The problem is that the Embedding layer outputs a 3D tensor, while the LSTM layer expects a 2D input of shape (timesteps, features) (both counts exclude the batch dimension). Here are a couple of options you can try:
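To see the mismatch concretely, here is a minimal sketch (toy sizes are assumed purely for illustration; they are not the question's real data):

```python
import tensorflow as tf

# Assumed toy sizes, for illustration only.
batch, orders, ids_per_order, emb_dim = 2, 3, 50, 8

ids = tf.random.uniform((batch, orders, ids_per_order), maxval=100, dtype=tf.int32)
emb = tf.keras.layers.Embedding(100, emb_dim)(ids)
print(emb.shape)  # (2, 3, 50, 8): Embedding appends an axis, giving a 4D tensor

# An LSTM expects (batch, timesteps, features), i.e. 3D including the batch axis,
# so feeding `emb` to it directly fails:
# tf.keras.layers.LSTM(32)(emb)  # raises a shape error (expected ndim=3, got ndim=4)
```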

    Option 1

    import tensorflow as tf
    
    samples = 100
    orders = 3
    product_ids_per_order = 50
    max_product_id = 120
    
    data = tf.random.uniform((samples, orders, product_ids_per_order), maxval=max_product_id, dtype=tf.int32)
    Y = tf.random.uniform((samples,), maxval=2, dtype=tf.int32)
    
    EMBEDDING_DIM = 5
    
    item_input = tf.keras.layers.Input(shape = (orders, product_ids_per_order), name = 'item')
    embedding_layer = tf.keras.layers.Embedding(
                            max_product_id + 1,
                            output_dim = EMBEDDING_DIM,
                            input_length = product_ids_per_order,
                            name = 'item_embedding')
    
    # Map each time step with 50 product ids to an embedding vector of size 5
    outputs = []
    for i in range(orders):
      tensor = embedding_layer(item_input[:, i, :])
      layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(tensor)
      layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(layer_i_emb)
      layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4)(layer_i_emb)
      outputs.append(layer_i_emb)
      
    output = tf.keras.layers.Concatenate(axis=1)(outputs)
    output = tf.keras.layers.Dense(1, activation='sigmoid')(output)
    model = tf.keras.Model(item_input, output)
    model.compile(optimizer='adam', loss=tf.keras.losses.BinaryCrossentropy())
    model.fit(data, Y)
    
    4/4 [==============================] - 15s 1s/step - loss: 0.6926
    

    Option 2

    import tensorflow as tf
    
    samples = 100
    orders = 3
    product_ids_per_order = 50
    max_product_id = 120
    
    EMBEDDING_DIM = 5
    
    item_input = tf.keras.layers.Input(shape = (orders, product_ids_per_order), name = 'item')
    embedding_layer = tf.keras.layers.Embedding(
                            max_product_id + 1,
                            output_dim = EMBEDDING_DIM,
                            input_length = product_ids_per_order,
                            name = 'item_embedding')
    
    # Map each time step with 50 product ids to an embedding vector of size 5
    inputs = []
    for i in range(orders):
      tensor = embedding_layer(item_input[:, i, :])
      tensor = tf.keras.layers.Reshape([product_ids_per_order*EMBEDDING_DIM])(tensor)
      tensor = tf.expand_dims(tensor, axis=1)
      inputs.append(tensor)
    
    embedding_inputs = tf.keras.layers.Concatenate(axis=1)(inputs)
    layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(embedding_inputs)
    layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(layer_i_emb)
    layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4)(layer_i_emb)
    output = tf.keras.layers.Dense(1, activation='sigmoid')(layer_i_emb)
    model = tf.keras.Model(item_input, output)
    print(model.summary())
    
    Model: "model_11"
    __________________________________________________________________________________________________
     Layer (type)                   Output Shape         Param #     Connected to                     
    ==================================================================================================
     item (InputLayer)              [(None, 3, 50)]      0           []                               
                                                                                                      
     tf.__operators__.getitem_41 (S  (None, 50)          0           ['item[0][0]']                   
     licingOpLambda)                                                                                  
                                                                                                      
     tf.__operators__.getitem_42 (S  (None, 50)          0           ['item[0][0]']                   
     licingOpLambda)                                                                                  
                                                                                                      
     tf.__operators__.getitem_43 (S  (None, 50)          0           ['item[0][0]']                   
     licingOpLambda)                                                                                  
                                                                                                      
     item_embedding (Embedding)     (None, 50, 5)        605         ['tf.__operators__.getitem_41[0][
                                                                     0]',                             
                                                                      'tf.__operators__.getitem_42[0][
                                                                     0]',                             
                                                                      'tf.__operators__.getitem_43[0][
                                                                     0]']                             
                                                                                                      
     reshape_10 (Reshape)           (None, 250)          0           ['item_embedding[0][0]']         
                                                                                                      
     reshape_11 (Reshape)           (None, 250)          0           ['item_embedding[1][0]']         
                                                                                                      
     reshape_12 (Reshape)           (None, 250)          0           ['item_embedding[2][0]']         
                                                                                                      
     tf.expand_dims_9 (TFOpLambda)  (None, 1, 250)       0           ['reshape_10[0][0]']             
                                                                                                      
     tf.expand_dims_10 (TFOpLambda)  (None, 1, 250)      0           ['reshape_11[0][0]']             
                                                                                                      
     tf.expand_dims_11 (TFOpLambda)  (None, 1, 250)      0           ['reshape_12[0][0]']             
                                                                                                      
     concatenate_13 (Concatenate)   (None, 3, 250)       0           ['tf.expand_dims_9[0][0]',       
                                                                      'tf.expand_dims_10[0][0]',      
                                                                      'tf.expand_dims_11[0][0]']      
                                                                                                      
     lstm_34 (LSTM)                 (None, 3, 32)        36224       ['concatenate_13[0][0]']         
                                                                                                      
     lstm_35 (LSTM)                 (None, 3, 32)        8320        ['lstm_34[0][0]']                
                                                                                                      
     lstm_36 (LSTM)                 (None, 32)           8320        ['lstm_35[0][0]']                
                                                                                                      
     dense_11 (Dense)               (None, 1)            33          ['lstm_36[0][0]']                
                                                                                                      
    ==================================================================================================
    Total params: 53,502
    Trainable params: 53,502
    Non-trainable params: 0
    __________________________________________________________________________________________________
    None
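
A more compact equivalent of Option 2's per-order reshape loop (a sketch under the same assumed shapes) wraps Flatten in TimeDistributed, which collapses each order's 50 embedding vectors into a single feature vector per time step, with no manual slicing of the input:

```python
import tensorflow as tf

orders, product_ids_per_order, max_product_id = 3, 50, 120
EMBEDDING_DIM = 5

item_input = tf.keras.layers.Input(shape=(orders, product_ids_per_order), name='item')
# (None, 3, 50) -> (None, 3, 50, 5)
x = tf.keras.layers.Embedding(max_product_id + 1, EMBEDDING_DIM)(item_input)
# Flatten the last two axes per time step: (None, 3, 50, 5) -> (None, 3, 250)
x = tf.keras.layers.TimeDistributed(tf.keras.layers.Flatten())(x)
x = tf.keras.layers.LSTM(32, dropout=0.4, recurrent_dropout=0.4)(x)
output = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(item_input, output)
```

This gives the LSTM the same (None, 3, 250) input as the Concatenate of reshaped slices in Option 2.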
    

    【Discussion】:

    • Some of this is already in the code in the colab, and it is still not clear how to handle it. The tensor has this shape because at each step it takes three time steps into account, in each of which we have a basket of goods (50 product ids), and the product ids need to be in embedded form because there are 20 thousand of them. In other words, I submit 50 product ids per time step. The embedding is there so that the network does not treat product id 3 as more similar to id 4 than to ids 1 or 50 just because the numbers are close.
    • I don't quite understand what you are trying to do... What kind of input do you want to feed to your first LSTM layer? What does your data look like? Could you add some examples to your question?
    • Yes, you've understood the input correctly. The input is [(None, 3, 50)]: three orders, each containing 50 products. How do I explain this to the network? If I feed it in without embeddings, then the network perceives 4…
    • We have three such arrays (three orders, i.e. three time steps, so that the network learns what the user bought before). So, going into the LSTM layer, it would first see the first 50 embeddings, then the next 50, then the next 50. So how do I implement that? So that the network understands these are product ids rather than numeric values, and so that the LSTM layer receives the data for each order separately (3 orders per pass).
    • Updated the answer.