【Question Title】: Deep learning AI for integers in a sequence
【Posted】: 2021-08-17 11:10:29
【Question】:

I'm new to ML, and I want to use Keras to classify each number in a sequence as 1 or 0 depending on whether it is greater than the previous number. That is, if I have:

sequence a = [1, 2, 6, 4, 5],

the solution should be: sequence b = [0, 1, 1, 0, 1].
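(For reference, the target sequence itself can be generated directly with NumPy; a minimal sketch of the labeling rule, with the first element labeled 0 since it has no predecessor:)

```python
import numpy as np

a = np.array([1, 2, 6, 4, 5])
# 1 where an element is greater than its predecessor; first element gets 0
b = np.concatenate(([0], (np.diff(a) > 0).astype(int)))
print(b)  # [0 1 1 0 1]
```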

So far I have written:

import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense
import numpy as np

model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1,1])])
model.add(tf.keras.layers.Dense(17))
model.add(tf.keras.layers.Dense(17))
model.compile(optimizer='sgd', loss='BinaryCrossentropy', metrics=['binary_accuracy'])

b = [1,6,8,3,5,8,90,5,432,3,5,6,8,8,4,234,0]
a = [0,1,1,0,1,1,1,0,1,0,1,1,1,0,0,1,0]
b = np.array(b, dtype=float)
a = np.array(a, dtype=float)
model.fit(b, a, epochs=500, batch_size=1)

# Generate predictions for samples
predictions = model.predict(b)
print(predictions)

When I run this, I get:

Epoch 500/500
17/17 [==============================] - 0s 499us/step - loss: 7.9229 - binary_accuracy: 0.4844
[[[-1.37064695e+01  4.70858345e+01 -4.67341652e+01 -1.94298875e+00
    5.75960045e+01  6.70146179e+01  6.34545479e+01 -4.86319550e+02
    2.26250134e+01 -8.60109329e+00 -4.03220863e+01 -1.67574768e+01
    3.36148148e+01 -4.55171967e+00 -1.39924898e+01  6.31023712e+01
   -9.14120102e+00]]

 [[-6.92644653e+01  2.40270264e+02 -2.37715302e+02 -9.42625141e+00
    2.93314209e+02  3.41092743e+02  3.23760315e+02 -2.49306396e+03
    1.15242020e+02 -4.38339310e+01 -2.05973328e+02 -8.48139114e+01
    1.70274872e+02 -2.48692398e+01 -7.15372696e+01  3.22131958e+02
   -4.57872620e+01]]

 [[-9.14876480e+01  3.17544006e+02 -3.14107819e+02 -1.24195509e+01
    3.87601562e+02  4.50723969e+02  4.27882660e+02 -3.29576172e+03
    1.52288818e+02 -5.79270554e+01 -2.72233856e+02 -1.12036469e+02
    2.24938889e+02 -3.29962883e+01 -9.45551834e+01  4.25743744e+02
   -6.04456978e+01]]

 [[-3.59296684e+01  1.24359612e+02 -1.23126640e+02 -4.93629456e+00
    1.51883270e+02  1.76645889e+02  1.67576874e+02 -1.28901733e+03
    5.96718216e+01 -2.26942272e+01 -1.06582588e+02 -4.39800491e+01
    8.82788391e+01 -1.26787395e+01 -3.70104065e+01  1.66714172e+02
   -2.37996235e+01]]

 [[-5.81528549e+01  2.01633392e+02 -1.99519104e+02 -7.92959309e+00
    2.46170563e+02  2.86277161e+02  2.71699158e+02 -2.09171509e+03
    9.67186279e+01 -3.67873497e+01 -1.72843094e+02 -7.12026062e+01
    1.42942856e+02 -2.08057709e+01 -6.00283318e+01  2.70326050e+02
   -3.84580460e+01]]

 [[-9.14876480e+01  3.17544006e+02 -3.14107819e+02 -1.24195509e+01
    3.87601562e+02  4.50723969e+02  4.27882660e+02 -3.29576172e+03
    1.52288818e+02 -5.79270554e+01 -2.72233856e+02 -1.12036469e+02
    2.24938889e+02 -3.29962883e+01 -9.45551834e+01  4.25743744e+02
   -6.04456978e+01]]

 [[-1.00263879e+03  3.48576855e+03 -3.44619800e+03 -1.35145050e+02
    4.25337939e+03  4.94560596e+03  4.69689697e+03 -3.62063594e+04
    1.67120789e+03 -6.35745117e+02 -2.98891406e+03 -1.22816174e+03
    2.46616406e+03 -3.66204163e+02 -1.03828992e+03  4.67382764e+03
   -6.61441223e+02]]

 [[-5.81528549e+01  2.01633392e+02 -1.99519104e+02 -7.92959309e+00
    2.46170563e+02  2.86277161e+02  2.71699158e+02 -2.09171509e+03
    9.67186279e+01 -3.67873497e+01 -1.72843094e+02 -7.12026062e+01
    1.42942856e+02 -2.08057709e+01 -6.00283318e+01  2.70326050e+02
   -3.84580460e+01]]

 [[-4.80280518e+03  1.66995840e+04 -1.65093086e+04 -6.47000305e+02
    2.03765059e+04  2.36925508e+04  2.25018145e+04 -1.73467625e+05
    8.00621289e+03 -3.04566919e+03 -1.43194590e+04 -5.88322070e+03
    1.18137129e+04 -1.75592432e+03 -4.97435352e+03  2.23914492e+04
   -3.16803076e+03]]

 [[-3.59296684e+01  1.24359612e+02 -1.23126640e+02 -4.93629456e+00
    1.51883270e+02  1.76645889e+02  1.67576874e+02 -1.28901733e+03
    5.96718216e+01 -2.26942272e+01 -1.06582588e+02 -4.39800491e+01
    8.82788391e+01 -1.26787395e+01 -3.70104065e+01  1.66714172e+02
   -2.37996235e+01]]

 [[-5.81528549e+01  2.01633392e+02 -1.99519104e+02 -7.92959309e+00
    2.46170563e+02  2.86277161e+02  2.71699158e+02 -2.09171509e+03
    9.67186279e+01 -3.67873497e+01 -1.72843094e+02 -7.12026062e+01
    1.42942856e+02 -2.08057709e+01 -6.00283318e+01  2.70326050e+02
   -3.84580460e+01]]

 [[-6.92644653e+01  2.40270264e+02 -2.37715302e+02 -9.42625141e+00
    2.93314209e+02  3.41092743e+02  3.23760315e+02 -2.49306396e+03
    1.15242020e+02 -4.38339310e+01 -2.05973328e+02 -8.48139114e+01
    1.70274872e+02 -2.48692398e+01 -7.15372696e+01  3.22131958e+02
   -4.57872620e+01]]

 [[-9.14876480e+01  3.17544006e+02 -3.14107819e+02 -1.24195509e+01
    3.87601562e+02  4.50723969e+02  4.27882660e+02 -3.29576172e+03
    1.52288818e+02 -5.79270554e+01 -2.72233856e+02 -1.12036469e+02
    2.24938889e+02 -3.29962883e+01 -9.45551834e+01  4.25743744e+02
   -6.04456978e+01]]

 [[-9.14876480e+01  3.17544006e+02 -3.14107819e+02 -1.24195509e+01
    3.87601562e+02  4.50723969e+02  4.27882660e+02 -3.29576172e+03
    1.52288818e+02 -5.79270554e+01 -2.72233856e+02 -1.12036469e+02
    2.24938889e+02 -3.29962883e+01 -9.45551834e+01  4.25743744e+02
   -6.04456978e+01]]

 [[-4.70412598e+01  1.62996490e+02 -1.61322891e+02 -6.43295908e+00
    1.99026932e+02  2.31461517e+02  2.19638016e+02 -1.69036609e+03
    7.81952209e+01 -2.97407875e+01 -1.39712814e+02 -5.75913391e+01
    1.15610855e+02 -1.67422562e+01 -4.85193672e+01  2.18520096e+02
   -3.11288433e+01]]

 [[-2.60270850e+03  9.04948047e+03 -8.94645508e+03 -3.50663330e+02
    1.10420654e+04  1.28390557e+04  1.21937041e+04 -9.40005859e+04
    4.33857861e+03 -1.65045227e+03 -7.75966846e+03 -3.18818774e+03
    6.40197412e+03 -9.51349304e+02 -2.69557886e+03  1.21338779e+04
   -1.71684766e+03]]

 [[-2.59487200e+00  8.44894505e+00 -8.53793907e+00 -4.46333081e-01
    1.04523640e+01  1.21989994e+01  1.13933916e+01 -8.49708328e+01
    4.10160637e+00 -1.55452514e+00 -7.19183874e+00 -3.14619255e+00
    6.28279734e+00 -4.88203079e-01 -2.48353434e+00  1.12964716e+01
   -1.81198704e+00]]]

【Question Comments】:

    Tags: python tensorflow machine-learning keras deep-learning


    【Solution 1】:

    There are a few problems with your approach -

    1. Your setup for this deep-learning problem is flawed. You want to use information from the previous element to infer the label of the current element, but for inference (and training) you only pass the current element. Imagine what happens if I deploy this model tomorrow. The only information I can give it is, say, "15", and the previous element it should be compared against simply isn't there. How is your model supposed to respond?

    2. Second, why does your output layer predict a 17-dimensional vector? Shouldn't the goal be to predict 0 or 1 (a probability)? In that case your output should be a single unit with a sigmoid activation. Refer to this diagram as a guide for your future neural network setups.

    3. Third, you are not using any activation functions, which are the core reason for using neural networks (non-linearity). Without activation functions you are just building a standard regression model. Here is a basic proof -
    #2-layer neural network without activations
    h = W1.X+B1
    o = W2.h+B2
    
    o = W2.(W1.X+B1)+B2
      = W2.W1.X + (W2.B1+B2)
      = W3.X + B3            #Same as linear regression!
    
    
    #2-layer neural network with activations
    h = activation(W1.X+B1)
    o = activation(W2.h+B2)
    
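    The collapse of stacked linear layers into a single linear map can also be checked numerically. A minimal NumPy sketch (random weights stand in for trained ones):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, B1 = rng.standard_normal((3, 2)), rng.standard_normal(3)
W2, B2 = rng.standard_normal((1, 3)), rng.standard_normal(1)
X = rng.standard_normal(2)

# two linear layers applied in sequence
o_two_layer = W2 @ (W1 @ X + B1) + B2

# collapsed into one equivalent linear layer
W3, B3 = W2 @ W1, W2 @ B1 + B2
o_one_layer = W3 @ X + B3

print(np.allclose(o_two_layer, o_one_layer))  # True
```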

    I suggest starting with the fundamentals of neural networks to build up best practices first, before moving on to your own problem statements. Fchollet, the author of Keras, has some excellent starter notebooks for you to explore.

    For your case, try these modifications -

    import tensorflow as tf
    from tensorflow import keras
    from keras.models import Sequential
    from keras.layers import Dense
    import numpy as np
    
    
    #Modify input shape and output shape + add activations
    model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=(2,))])       #<------
    model.add(tf.keras.layers.Dense(17, activation='relu'))     #<------
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))    #<------
    model.compile(optimizer='sgd', loss='BinaryCrossentropy', metrics=['binary_accuracy'])
    
    #Create 2 features: 1st is the previous element, 2nd is the current element
    b = [1,6,8,3,5,8,90,5,432,3,5,6,8,8,4,234,0]
    b = np.array([i for i in zip(b,b[1:])])         #<---- (16,2)
    
    #Start labels from the first pair of elements
    a = np.array([0,1,1,0,1,1,1,0,1,0,1,1,1,0,0,1,0])[1:] #<---- (16,)
    
    
    model.fit(b, a, epochs=20, batch_size=1)
    
    # Generate predictions for samples
    predictions = model.predict(b)
    print(np.round(predictions))
    
    Epoch 1/20
    16/16 [==============================] - 0s 1ms/step - loss: 3.0769 - binary_accuracy: 0.7086
    Epoch 2/20
    16/16 [==============================] - 0s 823us/step - loss: 252.6490 - binary_accuracy: 0.6153
    Epoch 3/20
    16/16 [==============================] - 0s 1ms/step - loss: 3.8109 - binary_accuracy: 0.9212
    Epoch 4/20
    16/16 [==============================] - 0s 787us/step - loss: 0.0131 - binary_accuracy: 0.9845
    Epoch 5/20
    16/16 [==============================] - 0s 2ms/step - loss: 0.0767 - binary_accuracy: 1.0000
    Epoch 6/20
    16/16 [==============================] - 0s 1ms/step - loss: 0.0143 - binary_accuracy: 0.9800
    Epoch 7/20
    16/16 [==============================] - 0s 2ms/step - loss: 0.0111 - binary_accuracy: 1.0000
    Epoch 8/20
    16/16 [==============================] - 0s 2ms/step - loss: 4.0658e-04 - binary_accuracy: 1.0000
    Epoch 9/20
    16/16 [==============================] - 0s 941us/step - loss: 6.3996e-04 - binary_accuracy: 1.0000
    Epoch 10/20
    16/16 [==============================] - 0s 1ms/step - loss: 1.1477e-04 - binary_accuracy: 1.0000
    Epoch 11/20
    16/16 [==============================] - 0s 837us/step - loss: 6.8807e-04 - binary_accuracy: 1.0000
    Epoch 12/20
    16/16 [==============================] - 0s 2ms/step - loss: 5.0521e-04 - binary_accuracy: 1.0000
    Epoch 13/20
    16/16 [==============================] - 0s 851us/step - loss: 0.0015 - binary_accuracy: 1.0000
    Epoch 14/20
    16/16 [==============================] - 0s 1ms/step - loss: 0.0012 - binary_accuracy: 1.0000
    Epoch 15/20
    16/16 [==============================] - 0s 765us/step - loss: 0.0014 - binary_accuracy: 1.0000
    Epoch 16/20
    16/16 [==============================] - 0s 906us/step - loss: 3.9230e-04 - binary_accuracy: 1.0000
    Epoch 17/20
    16/16 [==============================] - 0s 1ms/step - loss: 0.0022 - binary_accuracy: 1.0000
    Epoch 18/20
    16/16 [==============================] - 0s 1ms/step - loss: 2.2149e-04 - binary_accuracy: 1.0000
    Epoch 19/20
    16/16 [==============================] - 0s 2ms/step - loss: 1.7345e-04 - binary_accuracy: 1.0000
    Epoch 20/20
    16/16 [==============================] - 0s 1ms/step - loss: 7.7950e-05 - binary_accuracy: 1.0000
    
    
    [[1.]
     [1.]
     [0.]
     [1.]
     [1.]
     [1.]
     [0.]
     [1.]
     [0.]
     [1.]
     [1.]
     [1.]
     [0.]
     [0.]
     [1.]
     [0.]]
    

    The model above trains easily because the problem is not a complex one. You can see the accuracy quickly reaches 100%. Let's try making predictions on unseen data with this new model -

    np.round(model.predict([[5,1],     #<- Is 5 < 1
                            [5,500],   #<- Is 5 < 500
                            [5,6]]))   #<- Is 5 < 6
    
    array([[0.],                       #<- No
           [1.],                       #<- Yes
           [1.]], dtype=float32)       #<- Yes
    

    【Comments】:

    • Thank you very much. As a newcomer to deep learning, this is exactly the kind of feedback I was hoping for. I will definitely take your advice on the Fchollet notebooks.
    【Solution 2】:

    The problem is that your output layer has 17 neurons. That simply doesn't make sense. For a binary choice like this, you want 1 (or at most 2) neurons at the output.

    Change the last layer to:

    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    

    You will then get one output prediction per input. Since you get probabilities rather than 0/1 values, you will have to round them, e.g. with np.round.
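    A minimal sketch of that post-processing step, using made-up probability values in place of real `model.predict` output (thresholding at 0.5 is equivalent to np.round for sigmoid outputs):

```python
import numpy as np

# hypothetical sigmoid outputs from model.predict (values are made up)
probs = np.array([[0.93], [0.08], [0.51]])

# threshold at 0.5 to get hard 0/1 labels
labels = (probs >= 0.5).astype(int)
print(labels.ravel())  # [1 0 1]
```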

    Use a sigmoid activation function to get a probability between 0 and 1, and use a single output neuron because your output is a binary choice that can be represented by one value.

    However, this only fixes the problems in the code. I don't think a dense neural network is the right choice for your problem, and it may struggle to learn anything useful.

    【Comments】:
