【Question Title】: Deep learning AI for integers in a sequence
【Posted】: 2021-08-17 11:10:29
【Question】:

I'm new to ML, and I want to use Keras to classify each number in a sequence as 1 or 0 depending on whether it is greater than the previous number. That is, if I have:

sequence a = [1, 2, 6, 4, 5],

the solution should be: sequence b = [0, 1, 1, 0, 1].
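(For reference, the target sequence itself can be generated directly with NumPy; a minimal sketch of the labeling rule, with the first element labeled 0 since it has no predecessor:)

```python
import numpy as np

a = np.array([1, 2, 6, 4, 5])
# 1 where an element is greater than its predecessor; first element gets 0
b = np.concatenate(([0], (np.diff(a) > 0).astype(int)))
print(b)  # [0 1 1 0 1]
```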

So far I have written:

import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense
import numpy as np

model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1,1])])
model.add(tf.keras.layers.Dense(17))
model.add(tf.keras.layers.Dense(17))
model.compile(optimizer='sgd', loss='BinaryCrossentropy', metrics=['binary_accuracy'])

b = [1,6,8,3,5,8,90,5,432,3,5,6,8,8,4,234,0]
a = [0,1,1,0,1,1,1,0,1,0,1,1,1,0,0,1,0]
b = np.array(b, dtype=float)
a = np.array(a, dtype=float)
model.fit(b, a, epochs=500, batch_size=1)

# Generate predictions for samples
predictions = model.predict(b)
print(predictions)

When I run this, I get:

Epoch 500/500
17/17 [==============================] - 0s 499us/step - loss: 7.9229 - binary_accuracy: 0.4844
[[[-1.37064695e+01  4.70858345e+01 -4.67341652e+01 -1.94298875e+00
    5.75960045e+01  6.70146179e+01  6.34545479e+01 -4.86319550e+02
    2.26250134e+01 -8.60109329e+00 -4.03220863e+01 -1.67574768e+01
    3.36148148e+01 -4.55171967e+00 -1.39924898e+01  6.31023712e+01
   -9.14120102e+00]]

 [[-6.92644653e+01  2.40270264e+02 -2.37715302e+02 -9.42625141e+00
    2.93314209e+02  3.41092743e+02  3.23760315e+02 -2.49306396e+03
    1.15242020e+02 -4.38339310e+01 -2.05973328e+02 -8.48139114e+01
    1.70274872e+02 -2.48692398e+01 -7.15372696e+01  3.22131958e+02
   -4.57872620e+01]]

 [[-9.14876480e+01  3.17544006e+02 -3.14107819e+02 -1.24195509e+01
    3.87601562e+02  4.50723969e+02  4.27882660e+02 -3.29576172e+03
    1.52288818e+02 -5.79270554e+01 -2.72233856e+02 -1.12036469e+02
    2.24938889e+02 -3.29962883e+01 -9.45551834e+01  4.25743744e+02
   -6.04456978e+01]]

 [[-3.59296684e+01  1.24359612e+02 -1.23126640e+02 -4.93629456e+00
    1.51883270e+02  1.76645889e+02  1.67576874e+02 -1.28901733e+03
    5.96718216e+01 -2.26942272e+01 -1.06582588e+02 -4.39800491e+01
    8.82788391e+01 -1.26787395e+01 -3.70104065e+01  1.66714172e+02
   -2.37996235e+01]]

 [[-5.81528549e+01  2.01633392e+02 -1.99519104e+02 -7.92959309e+00
    2.46170563e+02  2.86277161e+02  2.71699158e+02 -2.09171509e+03
    9.67186279e+01 -3.67873497e+01 -1.72843094e+02 -7.12026062e+01
    1.42942856e+02 -2.08057709e+01 -6.00283318e+01  2.70326050e+02
   -3.84580460e+01]]

 [[-9.14876480e+01  3.17544006e+02 -3.14107819e+02 -1.24195509e+01
    3.87601562e+02  4.50723969e+02  4.27882660e+02 -3.29576172e+03
    1.52288818e+02 -5.79270554e+01 -2.72233856e+02 -1.12036469e+02
    2.24938889e+02 -3.29962883e+01 -9.45551834e+01  4.25743744e+02
   -6.04456978e+01]]

 [[-1.00263879e+03  3.48576855e+03 -3.44619800e+03 -1.35145050e+02
    4.25337939e+03  4.94560596e+03  4.69689697e+03 -3.62063594e+04
    1.67120789e+03 -6.35745117e+02 -2.98891406e+03 -1.22816174e+03
    2.46616406e+03 -3.66204163e+02 -1.03828992e+03  4.67382764e+03
   -6.61441223e+02]]

 [[-5.81528549e+01  2.01633392e+02 -1.99519104e+02 -7.92959309e+00
    2.46170563e+02  2.86277161e+02  2.71699158e+02 -2.09171509e+03
    9.67186279e+01 -3.67873497e+01 -1.72843094e+02 -7.12026062e+01
    1.42942856e+02 -2.08057709e+01 -6.00283318e+01  2.70326050e+02
   -3.84580460e+01]]

 [[-4.80280518e+03  1.66995840e+04 -1.65093086e+04 -6.47000305e+02
    2.03765059e+04  2.36925508e+04  2.25018145e+04 -1.73467625e+05
    8.00621289e+03 -3.04566919e+03 -1.43194590e+04 -5.88322070e+03
    1.18137129e+04 -1.75592432e+03 -4.97435352e+03  2.23914492e+04
   -3.16803076e+03]]

 [[-3.59296684e+01  1.24359612e+02 -1.23126640e+02 -4.93629456e+00
    1.51883270e+02  1.76645889e+02  1.67576874e+02 -1.28901733e+03
    5.96718216e+01 -2.26942272e+01 -1.06582588e+02 -4.39800491e+01
    8.82788391e+01 -1.26787395e+01 -3.70104065e+01  1.66714172e+02
   -2.37996235e+01]]

 [[-5.81528549e+01  2.01633392e+02 -1.99519104e+02 -7.92959309e+00
    2.46170563e+02  2.86277161e+02  2.71699158e+02 -2.09171509e+03
    9.67186279e+01 -3.67873497e+01 -1.72843094e+02 -7.12026062e+01
    1.42942856e+02 -2.08057709e+01 -6.00283318e+01  2.70326050e+02
   -3.84580460e+01]]

 [[-6.92644653e+01  2.40270264e+02 -2.37715302e+02 -9.42625141e+00
    2.93314209e+02  3.41092743e+02  3.23760315e+02 -2.49306396e+03
    1.15242020e+02 -4.38339310e+01 -2.05973328e+02 -8.48139114e+01
    1.70274872e+02 -2.48692398e+01 -7.15372696e+01  3.22131958e+02
   -4.57872620e+01]]

 [[-9.14876480e+01  3.17544006e+02 -3.14107819e+02 -1.24195509e+01
    3.87601562e+02  4.50723969e+02  4.27882660e+02 -3.29576172e+03
    1.52288818e+02 -5.79270554e+01 -2.72233856e+02 -1.12036469e+02
    2.24938889e+02 -3.29962883e+01 -9.45551834e+01  4.25743744e+02
   -6.04456978e+01]]

 [[-9.14876480e+01  3.17544006e+02 -3.14107819e+02 -1.24195509e+01
    3.87601562e+02  4.50723969e+02  4.27882660e+02 -3.29576172e+03
    1.52288818e+02 -5.79270554e+01 -2.72233856e+02 -1.12036469e+02
    2.24938889e+02 -3.29962883e+01 -9.45551834e+01  4.25743744e+02
   -6.04456978e+01]]

 [[-4.70412598e+01  1.62996490e+02 -1.61322891e+02 -6.43295908e+00
    1.99026932e+02  2.31461517e+02  2.19638016e+02 -1.69036609e+03
    7.81952209e+01 -2.97407875e+01 -1.39712814e+02 -5.75913391e+01
    1.15610855e+02 -1.67422562e+01 -4.85193672e+01  2.18520096e+02
   -3.11288433e+01]]

 [[-2.60270850e+03  9.04948047e+03 -8.94645508e+03 -3.50663330e+02
    1.10420654e+04  1.28390557e+04  1.21937041e+04 -9.40005859e+04
    4.33857861e+03 -1.65045227e+03 -7.75966846e+03 -3.18818774e+03
    6.40197412e+03 -9.51349304e+02 -2.69557886e+03  1.21338779e+04
   -1.71684766e+03]]

 [[-2.59487200e+00  8.44894505e+00 -8.53793907e+00 -4.46333081e-01
    1.04523640e+01  1.21989994e+01  1.13933916e+01 -8.49708328e+01
    4.10160637e+00 -1.55452514e+00 -7.19183874e+00 -3.14619255e+00
    6.28279734e+00 -4.88203079e-01 -2.48353434e+00  1.12964716e+01
   -1.81198704e+00]]]

【Question Comments】:

    Tags: python tensorflow machine-learning keras deep-learning


    【Solution 1】:

    There are a few problems with your approach -

    1. Your setup for this deep-learning problem is flawed. You want to use information from the previous element to infer the label of the current element, but for inference (and training) you only pass the current element. Imagine what happens if I deploy this model tomorrow. The only information I can give it is, say, "15", and the previous element it should be compared against simply isn't there. How is your model supposed to respond?

    2. Second, why does your output layer predict a 17-dimensional vector? Shouldn't the goal be to predict 0 or 1 (a probability)? In that case your output should be a single unit with a sigmoid activation. Refer to this diagram as a guide for your future neural network setups.

    3. Third, you are not using any activation functions, which are the core reason for using neural networks (non-linearity). Without activation functions you are just building a standard regression model. Here is a basic proof -
    #2-layer neural network without activations
    h = W1.X+B1
    o = W2.h+B2
    
    o = W2.(W1.X+B1)+B2
      = W2.W1.X + (W2.B1+B2)
      = W3.X + B3            #Same as linear regression!
    
    
    #2-layer neural network with activations
    h = activation(W1.X+B1)
    o = activation(W2.h+B2)
    
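    The collapse of stacked linear layers into a single linear map can also be checked numerically. A minimal NumPy sketch (random weights stand in for trained ones):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, B1 = rng.standard_normal((3, 2)), rng.standard_normal(3)
W2, B2 = rng.standard_normal((1, 3)), rng.standard_normal(1)
X = rng.standard_normal(2)

# two linear layers applied in sequence
o_two_layer = W2 @ (W1 @ X + B1) + B2

# collapsed into one equivalent linear layer
W3, B3 = W2 @ W1, W2 @ B1 + B2
o_one_layer = W3 @ X + B3

print(np.allclose(o_two_layer, o_one_layer))  # True
```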

    I suggest starting with the fundamentals of neural networks to build up best practices first, before moving on to your own problem statements. Fchollet, the author of Keras, has some excellent starter notebooks for you to explore.

    For your case, try these modifications -

    import tensorflow as tf
    from tensorflow import keras
    from keras.models import Sequential
    from keras.layers import Dense
    import numpy as np
    
    
    #Modify input shape and output shape + add activations
    model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=(2,))])       #<------
    model.add(tf.keras.layers.Dense(17, activation='relu'))     #<------
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))    #<------
    model.compile(optimizer='sgd', loss='BinaryCrossentropy', metrics=['binary_accuracy'])
    
    #Create 2 features: 1st is the previous element, 2nd is the current element
    b = [1,6,8,3,5,8,90,5,432,3,5,6,8,8,4,234,0]
    b = np.array([i for i in zip(b,b[1:])])         #<---- (16,2)
    
    #Start labels from the first pair of elements
    a = np.array([0,1,1,0,1,1,1,0,1,0,1,1,1,0,0,1,0])[1:] #<---- (16,)
    
    
    model.fit(b, a, epochs=20, batch_size=1)
    
    # Generate predictions for samples
    predictions = model.predict(b)
    print(np.round(predictions))
    
    Epoch 1/20
    16/16 [==============================] - 0s 1ms/step - loss: 3.0769 - binary_accuracy: 0.7086
    Epoch 2/20
    16/16 [==============================] - 0s 823us/step - loss: 252.6490 - binary_accuracy: 0.6153
    Epoch 3/20
    16/16 [==============================] - 0s 1ms/step - loss: 3.8109 - binary_accuracy: 0.9212
    Epoch 4/20
    16/16 [==============================] - 0s 787us/step - loss: 0.0131 - binary_accuracy: 0.9845
    Epoch 5/20
    16/16 [==============================] - 0s 2ms/step - loss: 0.0767 - binary_accuracy: 1.0000
    Epoch 6/20
    16/16 [==============================] - 0s 1ms/step - loss: 0.0143 - binary_accuracy: 0.9800
    Epoch 7/20
    16/16 [==============================] - 0s 2ms/step - loss: 0.0111 - binary_accuracy: 1.0000
    Epoch 8/20
    16/16 [==============================] - 0s 2ms/step - loss: 4.0658e-04 - binary_accuracy: 1.0000
    Epoch 9/20
    16/16 [==============================] - 0s 941us/step - loss: 6.3996e-04 - binary_accuracy: 1.0000
    Epoch 10/20
    16/16 [==============================] - 0s 1ms/step - loss: 1.1477e-04 - binary_accuracy: 1.0000
    Epoch 11/20
    16/16 [==============================] - 0s 837us/step - loss: 6.8807e-04 - binary_accuracy: 1.0000
    Epoch 12/20
    16/16 [==============================] - 0s 2ms/step - loss: 5.0521e-04 - binary_accuracy: 1.0000
    Epoch 13/20
    16/16 [==============================] - 0s 851us/step - loss: 0.0015 - binary_accuracy: 1.0000
    Epoch 14/20
    16/16 [==============================] - 0s 1ms/step - loss: 0.0012 - binary_accuracy: 1.0000
    Epoch 15/20
    16/16 [==============================] - 0s 765us/step - loss: 0.0014 - binary_accuracy: 1.0000
    Epoch 16/20
    16/16 [==============================] - 0s 906us/step - loss: 3.9230e-04 - binary_accuracy: 1.0000
    Epoch 17/20
    16/16 [==============================] - 0s 1ms/step - loss: 0.0022 - binary_accuracy: 1.0000
    Epoch 18/20
    16/16 [==============================] - 0s 1ms/step - loss: 2.2149e-04 - binary_accuracy: 1.0000
    Epoch 19/20
    16/16 [==============================] - 0s 2ms/step - loss: 1.7345e-04 - binary_accuracy: 1.0000
    Epoch 20/20
    16/16 [==============================] - 0s 1ms/step - loss: 7.7950e-05 - binary_accuracy: 1.0000
    
    
    [[1.]
     [1.]
     [0.]
     [1.]
     [1.]
     [1.]
     [0.]
     [1.]
     [0.]
     [1.]
     [1.]
     [1.]
     [0.]
     [0.]
     [1.]
     [0.]]
    

    The model above trains easily because the problem is not a complex one. You can see the accuracy quickly reaches 100%. Let's try making predictions on unseen data with this new model -

    np.round(model.predict([[5,1],     #<- Is 5 < 1
                            [5,500],   #<- Is 5 < 500
                            [5,6]]))   #<- Is 5 < 6
    
    array([[0.],                       #<- No
           [1.],                       #<- Yes
           [1.]], dtype=float32)       #<- Yes
    

    【Comments】:

    • Thank you very much. As a newcomer to deep learning, this is exactly the kind of feedback I was hoping for. I will definitely take your advice on the Fchollet notebooks.
    【Solution 2】:

    The problem is that your output layer has 17 neurons. That simply doesn't make sense. For a binary choice like this, you want 1 (or at most 2) neurons at the output.

    Change the last layer to:

    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    

    You will then get one output prediction per input. Since you get probabilities rather than 0/1 values, you will have to round them, e.g. with np.round.
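    A minimal sketch of that post-processing step, using made-up probability values in place of real `model.predict` output (thresholding at 0.5 is equivalent to np.round for sigmoid outputs):

```python
import numpy as np

# hypothetical sigmoid outputs from model.predict (values are made up)
probs = np.array([[0.93], [0.08], [0.51]])

# threshold at 0.5 to get hard 0/1 labels
labels = (probs >= 0.5).astype(int)
print(labels.ravel())  # [1 0 1]
```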

    Use a sigmoid activation function to get a probability between 0 and 1, and use a single output neuron because your output is a binary choice that can be represented by one value.

    However, this only fixes the problems in the code. I don't think a dense neural network is the right choice for your problem, and it may struggle to learn anything useful.

    【Comments】:
