Pandas - sort columns in correlation table (answers)

【Question title】: Pandas - sort columns in correlation table
【Posted】: 2021-10-10 01:26:10
【Question description】:

I am running a Pearson correlation on my dataset (loaded from Excel), and this is the order in which the results come out:

I would like to know whether n_hhld_trip can be made the first column, since it is my dependent variable.

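Below is my code so far, but I don't know how to make it reflect the change I want. I tried moving the variables around in the pivot table call, but that did not change the order: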

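import numpy as np
import pandas as pd

# read_excel is the DataFrame that was loaded from the Excel file
zone_sum_mean_combo = pd.pivot_table(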
    read_excel,
    index=['Zone'],
    aggfunc={'Household ID': np.mean, 'dwtype': np.mean, 'n_hhld_trip': np.sum,
             'expf': np.mean, 'n_emp_ft': np.sum, 'n_emp_home': np.sum,
             'n_emp_pt': np.sum, 'n_lic': np.sum, 'n_pers': np.sum,
             'n_student': np.sum, 'n_veh': np.sum}
)

index_reset = zone_sum_mean_combo.reset_index()
print(index_reset)

pearson_correlation = index_reset.corr(method='pearson')
print(pearson_correlation)

【Question comments】:

Tags: python pandas numpy linear-regression pearson-correlation

【Solution 1】:

Sometimes it is easier to hard-code the column order after all other operations are done:

df = df[["my_first_column", "my_second_column"]]

In your case, I think it is easier to manipulate the list of columns instead:

columns = list(df.columns)
columns.remove("n_hhld_trip")
columns.insert(0, "n_hhld_trip")
df = df[columns]
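
The same idea can be applied to the correlation matrix itself. The following is only a minimal sketch on invented data (the toy values and the names `corr`, `first` and `order` are assumptions, not taken from the question). It moves n_hhld_trip to the front of both the columns and the rows, so the matrix stays symmetric.

```python
import pandas as pd

# Toy data standing in for the pivoted table (values are invented)
df = pd.DataFrame({
    "Zone": [1, 2, 3, 4],
    "n_pers": [10, 20, 15, 30],
    "n_hhld_trip": [25, 48, 33, 70],
    "n_veh": [5, 9, 8, 12],
})

corr = df.corr(method="pearson")

# Put the dependent variable first; keep the other columns in their existing order
first = "n_hhld_trip"
order = [first] + [c for c in corr.columns if c != first]

# Reorder both axes so rows and columns stay aligned
corr = corr.loc[order, order]
print(corr)
```

Selecting only `corr[order]` would reorder the columns and leave the rows as they were, which may be enough if only the first column matters.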

【Comments】:

【Solution 2】:

Try set_index and reset_index:

df.set_index('n_hhld_trip', append=True).reset_index(level=-1)
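
A small sketch of what this one-liner does, on invented data (the toy frame and the name `reordered` are assumptions): `set_index(..., append=True)` moves the column into the index as an extra innermost level, and `reset_index(level=-1)` moves that level back out, which places it as the first column. The call returns a new DataFrame, so the result has to be assigned.

```python
import pandas as pd

# Toy frame (invented values); n_hhld_trip starts as the last column
df = pd.DataFrame({
    "n_pers": [10, 20, 15],
    "n_veh": [5, 9, 8],
    "n_hhld_trip": [25, 48, 33],
})

# Append the column to the index as a new innermost level,
# then move that level back out; reset_index inserts it at position 0
reordered = df.set_index("n_hhld_trip", append=True).reset_index(level=-1)
print(list(reordered.columns))
```

Applied to a correlation matrix, this changes only the column order; the row order is left unchanged.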

【Comments】: