【Posted】: 2020-09-07 03:52:22
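【Question】:
I am trying to build a matrix of about 60 rows and 11 columns from NumPy arrays. I have looked into several approaches but cannot get any of them to work. I tried the following code and got the errors shown below: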
stats_features_full = np.empty((0, 11))
for ls in range(60):
    current_list = ls
    print('Entering list {0} for feature extraction'.format(current_list))
    stats_features = get_selected_statistics_features(list_values=list[ls])
    stats_features_np_shape = np.array(stats_features).shape
    print('Statistical Features Extracted from list: ', stats_features)
    print('Statistical Features Shape Extracted from list: ', stats_features_np_shape)
    stats_features_full = np.concatenate([stats_features_full, np.array(stats_features)], axis=0)
    # stats_features_full = np.append(arr=stats_features_full, values=np.array(stats_features), axis=0)
stats_features_full_np_shape = np.array(stats_features_full).shape
print('Statistical Features Extracted from all lists: ', stats_features_full)
print('Statistical Features Shape Extracted from all lists: ', stats_features_full_np_shape)
Error messages:
(1)
stats_features_full = np.concatenate([stats_features_full, np.array(stats_features)], axis=0)
File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
(2)
print('Entering list {0} for feature extraction'.format(current_list))
File "<__array_function__ internals>", line 6, in append
return concatenate((arr, values), axis=axis)
File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
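Is there a way to create a 60x11 array?
Edit 1:
Thanks to @Krish, it seems to work now. I have one more question: I would like to convert the stats_features_full variable to a pandas DataFrame so that I can save the results as a text file. How can I do that? Please see my approach below: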
########################################################################################################################
########################################################################################################################
############################################### Feature Datasets #######################################################
########################################################################################################################
########################################################################################################################
Stats_DataFrame_Feature = stats_features_full
Stats_DataFrame_Feature_Data_list = list(Stats_DataFrame_Feature)
# print('Statistical DataFrame Featureset list: ', Stats_DataFrame_Feature_Data_list)
Stats_DataFrame_Feature_Data_list_shape = np.array(Stats_DataFrame_Feature_Data_list).shape
Stats_DataFrame_Feature_Data_list_shape_1 = np.array(Stats_DataFrame_Feature_Data_list).shape
print('Statistical DataFrame Featureset list shape: ', Stats_DataFrame_Feature_Data_list_shape)
print('Statistical DataFrame Featureset list shape: ', Stats_DataFrame_Feature_Data_list_shape_1[0])
for Stat_row in range(60):
    StatsData.append(Stats_DataFrame_Feature[0:Stats_DataFrame_Feature_Data_list_shape[0]])
    StatsData_np = np.array(StatsData)
    with open('filepath\dataset.txt', 'w') as out_file:
        for i in range(60):
            print('Opened file number: {0}'.format(i))
            out_string = ""
            out_string += pd.DataFrame(data=StatsData_np).to_string()
            out_file.write(out_string)
            break
        # break
    # break
Stats_DataFrame_Feature_Matrix = StatsData
print('Final Saved Statistical Feature Dataset file: ', Stats_DataFrame_Feature_Matrix)
print('Shape Final Saved Statistical Feature Dataset file: ', np.array(Stats_DataFrame_Feature_Matrix).shape)
My error message:
out_string += pd.DataFrame(data=StatsData_np).to_string()
mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
values = prep_ndarray(values, copy=copy)
raise ValueError("Must pass 2-d input")
ValueError: Must pass 2-d input
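The "Must pass 2-d input" error is consistent with the loop above appending the whole array to StatsData on every pass, so that np.array(StatsData) becomes 3-D. A minimal sketch, assuming stats_features_full is already a (60, 11) array; the array contents and the output path below are made up for illustration and are not taken from the question:

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Made-up stand-in for the (60, 11) stats_features_full array.
stats_features_full = np.arange(60 * 11, dtype=float).reshape(60, 11)

# Stacking several copies of the 2-D array gives a 3-D array,
# which pd.DataFrame refuses to accept.
stacked = np.array([stats_features_full for _ in range(3)])
try:
    pd.DataFrame(data=stacked)
    rejected = False
except ValueError:
    rejected = True

# The 2-D array can be passed to DataFrame directly; no loop is needed.
df = pd.DataFrame(stats_features_full)

# Hypothetical output location (a temporary directory).
out_path = os.path.join(tempfile.mkdtemp(), "dataset.txt")
with open(out_path, "w") as out_file:
    out_file.write(df.to_string())

print(df.shape)
```

If a delimited file is preferred over the fixed-width text from to_string, df.to_csv(out_path, sep="\t", index=False) is a possible alternative.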
Edit 2:
I managed to get it working by changing the following lines:
StatsData.append(Stats_DataFrame_Feature[0:Stats_DataFrame_Feature_Data_list_shape[1]])
StatsData_np = np.array(StatsData[Stat_row])
However, the data I save ends up with shape (60, 11, 11). Why is that?
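One likely explanation, sketched under the assumption that Stats_DataFrame_Feature has shape (60, 11): the slice 0:shape[1] selects the first 11 rows rather than a single row, and that (11, 11) block is appended 60 times. The variable names below are simplified stand-ins for the ones in the question:

```python
import numpy as np

features = np.zeros((60, 11))   # stand-in for Stats_DataFrame_Feature
n_cols = features.shape[1]      # 11

stats_data = []
for _ in range(60):
    # features[0:11] is the first 11 rows, so each item has shape (11, 11).
    stats_data.append(features[0:n_cols])

print(np.array(stats_data[0]).shape)   # (11, 11)
print(np.array(stats_data).shape)      # (60, 11, 11)
```

Under that assumption, the DataFrame built from StatsData[Stat_row] would hold only an (11, 11) block, not the full 60 rows.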
Edit 3:
Suppose I have a dictionary with 6 keys, each holding 10 lists. I want to achieve the same thing, but I keep getting an index error.
for key in range(0, 6, 1):
    list_key = np.array(dict_list[key])
    print('Key value: {0}'.format(key))
    arr_list = []
    for list_num in range(0, 10, 1):
        list_val_num = np.array(list(list_key[:, list_num][0]))
        # stats_features = get_statistics_features_final(list_values=list_val_num)
        stats_features = get_selected_statistics_features(list_values=list_val_num)
        stats_features_np_shape = np.array(stats_features).shape
        print('Statistical Features Extracted from list: ', stats_features)
        print('Statistical Features Shape Extracted from list: ', stats_features_np_shape)
        arr_list += [stats_features]
    f_arr_list += arr_list
stats_features_full = np.vstack(f_arr_list)
stats_features_full_np_shape = np.array(stats_features_full).shape
print('Statistical Features Shape Extracted from all lists: ', stats_features_full_np_shape)
Error message:
IndexError: index 1 is out of bounds for axis 1 with size 1
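The message says axis 1 of list_key has length 1, so list_key[:, list_num] can only succeed for list_num == 0; printing list_key.shape would confirm this. The real layout of dict_list is not shown, so the following is only a sketch under the assumption that each key maps to a plain Python list of 10 lists, which can be indexed directly. fake_features is a hypothetical stand-in for get_selected_statistics_features and is assumed to return 11 values:

```python
import numpy as np


def fake_features(values):
    # Hypothetical stand-in for get_selected_statistics_features:
    # returns 11 numbers for any input list.
    return [float(np.sum(values))] * 11


# Assumed layout: 6 keys, each mapping to 10 lists of numbers.
dict_list = {
    key: [list(range(key + n, key + n + 5)) for n in range(10)]
    for key in range(6)
}

f_arr_list = []  # must be created before the loops start
for key in range(6):
    for list_num in range(10):
        # Index the Python list itself instead of slicing axis 1 of an array.
        values = np.array(dict_list[key][list_num])
        f_arr_list.append(fake_features(values))

# One row per (key, list) pair: 6 * 10 = 60 rows of 11 features.
stats_features_full = np.vstack(f_arr_list)
print(stats_features_full.shape)
```

Note that f_arr_list has to be initialised before the outer loop; the code in the question does not show where it is created.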
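【Comments】:
- Do you understand the error? concatenate is picky about the dimensions of its arguments. You know the difference between 2-D and 1-D, right? That said, calling concatenate repeatedly in a loop is an inefficient way to build an array, and as you have found, it is hard to get right.
- You defined stats_features_full as 2-D, right? But what is np.array(stats_features)? The error says it is 1-D, whereas concatenate expects it to be 2-D. A shape of (1, 11) would work fine with the initial (0, 11).
- np.vstack, as recommended in the answer, is similar to concatenate, but it makes sure all arguments are at least 2-D so that they can be joined as rows. Both take a list of arrays.
- @hpaulj OK, I changed the code to use vstack and managed to get it working. I have another question; please see Edit 1.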
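The dimension mismatch described in the comments can be reproduced in a few lines. This is a minimal sketch, assuming each feature extraction returns 11 values; fake_features is a hypothetical stand-in for get_selected_statistics_features, which the question does not show, and the input data is made up:

```python
import numpy as np


def fake_features(values):
    # Hypothetical stand-in for get_selected_statistics_features:
    # returns 11 numbers per input list.
    return [float(np.mean(values))] * 11


data = [np.arange(5) + i for i in range(60)]  # 60 made-up input lists

# Reproduce the error: (0, 11) is 2-D, but a single feature row is 1-D.
full = np.empty((0, 11))
row = np.array(fake_features(data[0]))  # shape (11,)
try:
    np.concatenate([full, row], axis=0)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Fix 1: reshape the row to (1, 11) before concatenating.
full = np.concatenate([full, row.reshape(1, -1)], axis=0)

# Fix 2 (usually simpler): collect the rows in a list and stack once.
rows = [fake_features(values) for values in data]
stats_features_full = np.vstack(rows)
print(stats_features_full.shape)  # (60, 11)
```

Stacking once at the end also avoids copying the growing array on every iteration, which is the inefficiency the first comment points out.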
Tags: python arrays pandas numpy matrix