[Posted]: 2021-03-14 19:17:29
[Problem description]:
I am trying to use NFL tracking data in a neural network to predict the yardage outcome of a play. To do this, I am trying to use a Keras LSTM model.
My data is formatted so that train_x is a list of numpy arrays, where each numpy array holds the tracking data for one play. train_y is a list of single-element lists containing each play's outcome:
train_x = [[[a11,b11,c11],[a12,b12,c12],...],[[a21,b21,c21],[a22,b22,c22],...], ...]
train_y = [[a1],[a2],...]
When I try to train the model with this data:
embedding_vecor_length = 32
model = Sequential()
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_x, train_y, validation_data=(val_x, val_y), epochs=3, batch_size=64)
print(model.summary())
I get this error:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
The wording of this error confuses me: why would converting a NumPy array to a tensor fail when the "unsupported object type" is numpy.ndarray itself?
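For reference, the situation can be reproduced in isolation: when the per-play arrays have different numbers of tracking rows, NumPy cannot stack them into one rectangular tensor and instead builds a 1-D array whose elements are themselves ndarrays (dtype object), which TensorFlow refuses to convert. A stripped-down sketch (the shapes here are made up for illustration):

```python
import numpy as np

# Two "plays" with different numbers of tracking rows (3 features each).
play_a = np.zeros((5, 3))   # 5 time steps
play_b = np.zeros((8, 3))   # 8 time steps

# Because the row counts differ, NumPy falls back to an object array:
# a 1-D container of ndarrays rather than a (2, time_steps, 3) tensor.
ragged = np.array([play_a, play_b], dtype=object)
print(ragged.dtype)   # object
print(ragged.shape)   # (2,)
```

This is the shape of data `model.fit` receives above, which is why the converter reports an "object type numpy.ndarray" inside the outer array.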
Full code for context:
lastWeek = pd.read_csv(r".\week1.csv", low_memory=False)
gameMax = lastWeek['gameId'].max()
weekPicklePath = "./week_1.pkl"
playPicklePath = "./play_all.pkl"
modPicklePath = "./modPlays_1.pk1"
trainXPicklePath = "./trainX1_1.pk1"
trainYPicklePath = "./trainY1_1.pk1"
testXPicklePath = "./testX1_1.pk1"
testYPicklePath = "./testY1_1.pk1"
valXPicklePath = "./valX1_1.pk1"
valYPicklePath = "./valY1_1.pk1"
pickle = True
try:
    foo = pd.read_pickle(weekPicklePath)
except (OSError, IOError, FileNotFoundError) as e:
    pickle = False
print("starting data preparation")
if(pickle):
    week = foo
else:
    print("week pickle not found")
    week1 = pd.read_csv(r".\week1.csv", low_memory=False)
    week = pd.concat([week1], ignore_index=True)
    week = week[week['gameId'] <= gameMax]
    week.to_pickle(weekPicklePath)
try:
    plays = pd.read_pickle(playPicklePath)
except (OSError, IOError, FileNotFoundError) as e:
    print("plays pickle not found")
    plays = pd.read_csv(r".\plays.csv", low_memory=False)
    plays.to_pickle(playPicklePath)
try:
    modPlays = pd.read_pickle(modPicklePath)
except (OSError, IOError, FileNotFoundError) as e:
    print("modPlays pickle not found")
    modPlays = plays[plays["gameId"] <= gameMax]
    modPlays.to_pickle(modPicklePath)
plays = modPlays
def isolatePlay(data, gameNum, playNum):
    MAX_X_YARDS = 120
    MAX_Y_YARDS = 53.3
    d = data[data['gameId'] == gameNum]
    d = d[d['playId'] == playNum].fillna(0)
    # normalize x, y...
    sub = d[["x", "y", "s", "a", "dis", "o", "dir"]].to_numpy()
    norm = Normalizer().fit(sub)
    return norm.transform(sub)
print("creating ML training, test, and validation datasets")
first = True
for rows in plays.itertuples():
    #print(getattr(rows, 'gameId'), gameMax)
    play = isolatePlay(week, getattr(rows, 'gameId'), getattr(rows, 'playId'))
    if (first):
        x = [play]
        y = [[getattr(rows, 'offensePlayResult')]]
        first = False
    else:
        x.append(play)
        y.append([getattr(rows, 'offensePlayResult')])
train_x, test_x, train_y, test_y = train_test_split(np.array(x), np.array(y), test_size=0.3)
test_x, val_x, test_y, val_y = train_test_split(test_x, test_y, test_size=0.5)
print("x data:[0]", train_x[0])
print("x data:[1]", train_x[1])
print("ML Dataset Preparation Complete")
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_x, train_y, validation_data=(val_x, val_y), epochs=3, batch_size=64)
print(model.summary())
# Final evaluation of the model
scores = model.evaluate(test_x, test_y, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
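For reference, one common way to make such a list of variable-length plays stackable is to zero-pad every play to a shared length, producing the rectangular [samples, time_steps, features] batch an LSTM expects. A minimal NumPy-only sketch (the `pad_plays` helper and the shapes are illustrative, not from the code above; `tf.keras.preprocessing.sequence.pad_sequences` provides the same padding, and a `Masking` layer can tell the LSTM to skip the padded steps):

```python
import numpy as np

def pad_plays(plays, n_features):
    """Zero-pad a list of (time_steps, n_features) arrays to a common length."""
    max_len = max(p.shape[0] for p in plays)
    batch = np.zeros((len(plays), max_len, n_features), dtype=np.float32)
    for i, p in enumerate(plays):
        batch[i, :p.shape[0], :] = p  # copy the real frames; the tail stays 0
    return batch

# Two toy plays with 7 features per frame but different frame counts.
plays = [np.ones((4, 7)), np.ones((6, 7))]
x = pad_plays(plays, 7)
print(x.shape)   # (2, 6, 7) -> [samples, time_steps, features]
```

The padded batch has an ordinary float dtype, so `model.fit` can convert it to a tensor.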
Full traceback:
Traceback (most recent call last):
File "c:/Users/benja/source/repos/NFL/nflModelTest2.py", line 128, in <module>
model.fit(train_x, train_y, validation_data=(val_x, val_y), epochs=3, batch_size=64)
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1049, in fit
data_handler = data_adapter.DataHandler(
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\data_adapter.py", line 1105, in __init__
self._adapter = adapter_cls(
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\data_adapter.py", line 265, in __init__
x, y, sample_weights = _process_tensorlike((x, y, sample_weights))
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\data_adapter.py", line 1021, in _process_tensorlike
inputs = nest.map_structure(_convert_numpy_and_scipy, inputs)
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\util\nest.py", line 635, in map_structure
structure[0], [func(*x) for x in entries],
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\util\nest.py", line 635, in <listcomp>
structure[0], [func(*x) for x in entries],
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\data_adapter.py", line 1016, in _convert_numpy_and_scipy
return ops.convert_to_tensor(x, dtype=dtype)
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\ops.py", line 1499, in convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\tensor_conversion_registry.py", line 52, in _default_conversion_function
return constant_op.constant(value, dtype, name=name)
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\constant_op.py", line 263, in constant
return _constant_impl(value, dtype, shape, name, verify_shape=False,
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\constant_op.py", line 275, in _constant_impl
return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\constant_op.py", line 300, in _constant_eager_impl
t = convert_to_eager_tensor(value, ctx, dtype)
File "C:\Users\benja\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\constant_op.py", line 98, in convert_to_eager_tensor
return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
[Comments]:
-
Taking a shot in the dark here. My guess is the input dimensions are incorrect. It is showing up as 1-dimensional when the model is looking for 3 dimensions. The input should be in the specific array structure [samples, time steps, features].
-
If I put my data in [samples, time steps, features] format, the tracking data for one play would be the features of a sample, with the yardage result as that sample's outcome, right? In that case, would I just use some play/decrementing counter for the time steps? @chitown88
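-
For illustration, assuming the normalized tracking rows of a play are ordered by frame, the row index itself already serves as the time step, so no separate counter is needed (the shapes below are hypothetical):

```python
import numpy as np

# A hypothetical play: 10 tracking frames, 7 features per frame.
play = np.arange(70, dtype=np.float32).reshape(10, 7)

# The row index is the time step -- no extra counter column required:
frame_0 = play[0]    # features at the first frame
frame_9 = play[-1]   # features at the last frame
print(play.shape)    # (10, 7) -> [time_steps, features]
```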
-
Ben, may I ask what exactly you want to train/predict? I know you said the predicted yardage result of a play, but isn't that simply the difference between the line of scrimmage and where the play ends (i.e., ending x-coordinate minus starting x-coordinate)? You could compute that without building a deep learning model. Unless I am misunderstanding your idea/question.
Tags: python numpy machine-learning keras tensor