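【Question title】: How to create an NxM matrix from a NumPy array?
【Posted】: 2020-09-07 03:52:22
【Question description】:

I am trying to build a matrix of about 60 rows and 11 columns from NumPy arrays. I have looked into several approaches but cannot get any of them to work. I tried the following code and got the errors shown below: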

stats_features_full = np.empty((0, 11))
for ls in range(60):
    current_list = ls
    print('Entering list {0} for feature extraction'.format(current_list))
    stats_features = get_selected_statistics_features(list_values=list[ls])
    stats_features_np_shape = np.array(stats_features).shape
    print('Statistical Features Extracted from list: ', stats_features)
    print('Statistical Features Shape Extracted from list: ', stats_features_np_shape)
    stats_features_full = np.concatenate([stats_features_full, np.array(stats_features)], axis=0)
    # stats_features_full = np.append(arr=stats_features_full, values=np.array(stats_features), axis=0)
    stats_features_full_np_shape = np.array(stats_features_full).shape
    print('Statistical Features Extracted from all lists: ', stats_features_full)
    print('Statistical Features Shape Extracted from all lists: ', stats_features_full_np_shape)

Error messages:

(1)

stats_features_full = np.concatenate([stats_features_full, np.array(stats_features)], axis=0)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)

(2)

print('Entering list {0} for feature extraction'.format(current_list))
  File "<__array_function__ internals>", line 6, in append
return concatenate((arr, values), axis=axis)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)

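Is there a way to create a 60x11 array?

Edit 1:

Thanks to @Krish, that seems to work. I have another question: I want to convert the stats_features_full variable to a pandas DataFrame so that I can save the results to a text file. How can I do this? See my approach below: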

########################################################################################################################
########################################################################################################################
############################################### Feature Datasets #######################################################
########################################################################################################################
########################################################################################################################
Stats_DataFrame_Feature = stats_features_full
Stats_DataFrame_Feature_Data_list = list(Stats_DataFrame_Feature)
# print('Statistical DataFrame Featureset list: ', Stats_DataFrame_Feature_Data_list)
Stats_DataFrame_Feature_Data_list_shape = np.array(Stats_DataFrame_Feature_Data_list).shape
Stats_DataFrame_Feature_Data_list_shape_1 = np.array(Stats_DataFrame_Feature_Data_list).shape
print('Statistical DataFrame Featureset list shape: ', Stats_DataFrame_Feature_Data_list_shape)
print('Statistical DataFrame Featureset list shape: ', Stats_DataFrame_Feature_Data_list_shape_1[0])


for Stat_row in range(60):
    StatsData.append(Stats_DataFrame_Feature[0:Stats_DataFrame_Feature_Data_list_shape[0]])
    StatsData_np = np.array(StatsData)
    with open(r'filepath\dataset.txt', 'w') as out_file:
        for i in range(60):
            print('Opened file number: {0}'.format(i))
            out_string = ""
            out_string += pd.DataFrame(data=StatsData_np).to_string()
            out_file.write(out_string)
            break
        # break
    # break

Stats_DataFrame_Feature_Matrix = StatsData
print('Final Saved Statistical Feature Dataset file: ', Stats_DataFrame_Feature_Matrix)
print('Shape Final Saved Statistical Feature Dataset file: ', np.array(Stats_DataFrame_Feature_Matrix).shape)

My error message:

out_string += pd.DataFrame(data=StatsData_np).to_string()
mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
values = prep_ndarray(values, copy=copy)
raise ValueError("Must pass 2-d input")
ValueError: Must pass 2-d input

Edit 2:

I managed to get it working by changing the following lines:

StatsData.append(Stats_DataFrame_Feature[0:Stats_DataFrame_Feature_Data_list_shape[1]])
StatsData_np = np.array(StatsData[Stat_row])

However, the data I end up saving has dimensions (60,11,11). Why is that?
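
A small sketch with random stand-in data suggests where the extra dimension comes from. It assumes stats_features_full really has shape (60, 11); the names stats_data, stacked_shape and wrapped_shape are illustrative and not part of the original code. Slicing with [0:shape[1]] takes an 11x11 block, and appending that block on each of the 60 iterations stacks into three dimensions.

```python
import numpy as np
import pandas as pd

# Stand-in for stats_features_full: 60 rows of 11 features (random data, an assumption).
stats_features_full = np.random.rand(60, 11)

# Edit 2 appends the slice [0:11], an 11x11 block, once per loop iteration.
stats_data = []
for _ in range(60):
    stats_data.append(stats_features_full[0:stats_features_full.shape[1]])

stacked_shape = np.array(stats_data).shape
print(stacked_shape)  # (60, 11, 11)

# Edit 1 appends the slice [0:60], i.e. the whole 60x11 array. Wrapping that
# one-element list in np.array adds a leading axis, so the result is 3-D,
# which is what pandas rejects with "Must pass 2-d input".
wrapped_shape = np.array([stats_features_full[0:60]]).shape
print(wrapped_shape)  # (1, 60, 11)

# The array is already 2-D, so it can be passed to a DataFrame directly.
df = pd.DataFrame(stats_features_full)
print(df.shape)  # (60, 11)
```

Under these assumptions, no extra list or loop is needed before building the DataFrame.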

Edit 3:

Suppose I create a dictionary with 6 keys, each holding 10 lists. I want to do the same thing, but I keep getting an index error.

for key in range(0, 6, 1):
    list_key = np.array(dict_list[key])
    print('Key value: {0}'.format(key))
    arr_list = []
    for list_num in range(0, 10, 1):
        list_val_num = np.array(list(list_key[:, list_num][0]))
        # stats_features = get_statistics_features_final(list_values=list_val_num)
        stats_features = get_selected_statistics_features(list_values=list_val_num)
        stats_features_np_shape = np.array(stats_features).shape
        print('Statistical Features Extracted from list: ', stats_features)
        print('Statistical Features Shape Extracted from list: ', stats_features_np_shape)
        arr_list += [stats_features]
f_arr_list += arr_list
stats_features_full = np.vstack(f_arr_list)
stats_features_full_np_shape = np.array(stats_features_full).shape
print('Statistical Features Shape Extracted from all lists: ', stats_features_full_np_shape)

Error message:

IndexError: index 1 is out of bounds for axis 1 with size 1
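
The message says that axis 1 of the indexed array has size 1, so list_key[:, list_num] fails as soon as list_num reaches 1. The real layout of dict_list is not shown, so the following is only a sketch under assumptions: dict_list is taken to map the keys 0 to 5 to 10 plain lists each, and get_selected_statistics_features is replaced by a hypothetical stand-in that returns 11 numbers. It indexes the nested lists directly and collects every row inside the loops.

```python
import numpy as np


def get_selected_statistics_features(list_values):
    # Hypothetical stand-in for the asker's function: returns 11 numbers.
    values = np.asarray(list_values, dtype=float)
    return np.full(11, values.mean())


# Assumed layout: 6 keys, each mapping to 10 lists of raw values.
dict_list = {
    key: [list(range(key + n, key + n + 5)) for n in range(10)]
    for key in range(6)
}

f_arr_list = []  # created once, before both loops
for key in range(6):
    for list_num in range(10):
        # Index the nested list directly instead of slicing an array with [:, list_num].
        list_values = dict_list[key][list_num]
        f_arr_list.append(get_selected_statistics_features(list_values=list_values))

stats_features_full = np.vstack(f_arr_list)
print(stats_features_full.shape)  # (60, 11)
```

In the code as posted, f_arr_list += arr_list sits outside the outer loop, so only the rows of the last key would be added; appending inside the loops avoids that.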

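【Comments】:

  • Do you understand the error? concatenate is picky about the dimensions of its arguments. You know the difference between 2-D and 1-D, right? That said, calling concatenate repeatedly in a loop is an inefficient way to build an array, and as you have found, it is hard to get right.
  • You defined stats_features_full as 2-D, right? But what is np.array(stats_features)? The error says it is 1-D, while concatenate expects it to be 2-D. A shape of (1,11) would work fine with the initial (0,11).
  • np.vstack, as recommended in the answer, is similar to concatenate, but it makes sure all arguments are at least 2-D so that they can be joined as rows. Both take a list of arrays.
  • @hpaulj OK, I changed the code to use vstack and got it to work. I have another question; please see Edit 1.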
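
The dimension mismatch described in these comments can be reproduced in a few lines. This is a minimal sketch with random stand-in data; the names full, row and mismatch_raised are illustrative only.

```python
import numpy as np

full = np.empty((0, 11))   # 2-D: zero rows, 11 columns
row = np.random.rand(11)   # 1-D: shape (11,)

# Joining a 2-D array with a 1-D array raises the ValueError from the question.
try:
    np.concatenate([full, row], axis=0)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True

# Reshaping the row to (1, 11) makes both inputs 2-D, so concatenate accepts them.
via_concatenate = np.concatenate([full, row.reshape(1, -1)], axis=0)

# np.vstack promotes 1-D inputs to 2-D rows before joining.
via_vstack = np.vstack([full, row])

print(via_concatenate.shape, via_vstack.shape)  # (1, 11) (1, 11)
```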

Tags: python arrays pandas numpy matrix


【Solution 1】:

I used np.random.rand(11) as a stand-in for the data:

import numpy as np
arr_list = []
for ls in range(60):
    stats_features_np_shape = np.random.rand(11)
    arr_list += [stats_features_np_shape]

stats_features_full = np.vstack(arr_list)
print(stats_features_full)

The key point here is that stats_features_np_shape should have shape (11,) (or any other fixed length), and stats_features_full is best created outside the loop.
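
As a variation on the same idea, the sketch below compares two ways of building the matrix without growing an array inside the loop: converting the collected rows once, and filling a preallocated array. The data is random and the names n_rows, n_features, from_list and preallocated are assumptions made for illustration.

```python
import numpy as np

n_rows, n_features = 60, 11

# Option 1: collect the 1-D rows in a list and convert once after the loop.
rows = [np.random.rand(n_features) for _ in range(n_rows)]
from_list = np.array(rows)

# Option 2: preallocate the full matrix and assign one row per iteration.
preallocated = np.empty((n_rows, n_features))
for i, row in enumerate(rows):
    preallocated[i] = row

print(from_list.shape, preallocated.shape)  # (60, 11) (60, 11)
print(np.array_equal(from_list, preallocated))  # True
```

Both options need every row to have the same length; preallocation also requires knowing the number of rows in advance.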

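【Discussion】:

  • Thank you, that seems to work. I have another question: I want to convert the stats_features_full variable to a pandas DataFrame so that I can save it as a txt file. See Edit 1 for details.
  • Pandas is a bit outside my comfort zone, but I strongly recommend reading a guide to numpy and pandas. This is probably the best resource I have seen so far: @987654321@
  • I have another question; please see the updated question (Edit 3).
  • You would be better off opening a separate question for that, @WDpad159
【Solution 2】:

Thanks to @Krish and @hpaulj, I managed to solve the problem. The full code is below: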

########################################################################################################################
########################################################################################################################
############################################ Feature Extraction ########################################################
########################################################################################################################
########################################################################################################################
arr_list = []
for ls in range(60):
    current_list = ls
    print('Entering list {0} for feature extraction'.format(current_list))
    stats_features = get_selected_statistics_features(list_values=list[ls])
    arr_list += [stats_features]
    stats_features_np_shape = np.array(stats_features).shape
    print('Statistical Features Extracted from list: ', stats_features)
    print('Statistical Features Shape Extracted from list: ', stats_features_np_shape)
stats_features_full = np.vstack(arr_list)
stats_features_full_np_shape = np.array(stats_features_full).shape
print('Statistical Features Shape Extracted from all lists: ', stats_features_full_np_shape)

########################################################################################################################
########################################################################################################################
############################################### Feature Datasets #######################################################
########################################################################################################################
########################################################################################################################
Stats_DataFrame_Feature = stats_features_full
Stats_DataFrame_Feature_Data_list = list(Stats_DataFrame_Feature)
print('Statistical DataFrame Featureset list: ', Stats_DataFrame_Feature_Data_list)
Stats_DataFrame_Feature_Data_list_shape = np.array(Stats_DataFrame_Feature_Data_list).shape
print('Statistical DataFrame Featureset list shape: ', Stats_DataFrame_Feature_Data_list_shape)

for Stat_row in range(60):
    StatsData.append(Stats_DataFrame_Feature[0:Stats_DataFrame_Feature_Data_list_shape[0]])
    Stats_DataFrame_Feature_Data_list_list = Stats_DataFrame_Feature_Data_list[0:Stats_DataFrame_Feature_Data_list_shape[0]]
    StatsData_np = np.array(StatsData[Stat_row])
    with open(r'filepath\list dataset.txt', 'w') as out_file:
        for i in range(60):
            print('Opened file number: {0}'.format(i))
            out_string = ""
            out_string += pd.DataFrame(data=StatsData_np).to_string()
            out_file.write(out_string)
            break
Stats_DataFrame_Feature_Matrix = StatsData
print('Shape Final Saved Statistical Feature Dataset file: ', np.array(Stats_DataFrame_Feature_Matrix).shape)
StatsData_np = np.array(StatsData)

column_no = 0
StatsData = []
stats_features_full = np.array([])
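
Since stats_features_full is already a 2-D array after np.vstack, the saving step can probably be shortened. The following is a sketch under assumptions, not the code used above: the data is random, the column names are hypothetical, and a temporary directory stands in for the real output path.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for the 60x11 feature matrix (random data, an assumption).
stats_features_full = np.random.rand(60, 11)

# Hypothetical column names; the real feature names are not given in the post.
columns = ['feature_{0}'.format(i) for i in range(stats_features_full.shape[1])]
df = pd.DataFrame(stats_features_full, columns=columns)

# Write the whole table once; a temporary directory stands in for the real path.
out_path = os.path.join(tempfile.mkdtemp(), 'dataset.txt')
with open(out_path, 'w') as out_file:
    out_file.write(df.to_string())

# Alternative without pandas: plain delimited text that can be loaded back.
csv_path = out_path + '.csv'
np.savetxt(csv_path, stats_features_full, delimiter=',')
reloaded = np.loadtxt(csv_path, delimiter=',')
print(reloaded.shape)  # (60, 11)
```

The to_string output is meant for reading by eye, while the np.savetxt file is easier to load back into an array.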

【Discussion】:
