【Posted】: 2026-01-08 21:15:03
【Question】:
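Consider the IBM HR Attrition Dataset from Kaggle (https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset). How can I quickly find the variables with the highest Shapiro-Wilk p-value?
In other words, I can apply shapiro() to a single column as shapiro(df['column']). I want to compute this for all numeric columns.
I tried: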
import pandas as pd
from scipy.stats import shapiro

df = pd.read_csv('path')
# Here I was expecting sequential prints of each column name with its p-value from shapiro()
for col in df:
    print(col, " : ", shapiro(df[col])[0])
Can anyone help with this?
Thanks in advance.
【Discussion】:
Tags: python pandas scipy statistics
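One possible approach, offered as a minimal sketch rather than a definitive answer: restrict the DataFrame to its numeric columns with select_dtypes, run shapiro() on each, and collect the p-values into a Series sorted from highest to lowest. Note that shapiro() returns a (statistic, p-value) pair, so the p-value is the second element. The helper name shapiro_pvalues is hypothetical, and skipping constant or very short columns is an assumption made here because the test is not meaningful for them.

```python
import pandas as pd
from scipy.stats import shapiro


def shapiro_pvalues(df: pd.DataFrame) -> pd.Series:
    """Shapiro-Wilk p-value per numeric column, sorted highest first.

    Hypothetical helper for illustration; not part of pandas or SciPy.
    """
    # Keep only numeric columns; string/object columns would make shapiro() fail.
    numeric = df.select_dtypes(include="number")

    pvalues = {}
    for col in numeric.columns:
        values = numeric[col].dropna()
        # Assumption: skip columns with fewer than 3 observations (shapiro()
        # requires at least 3) and constant columns (zero range).
        if len(values) < 3 or values.nunique() < 2:
            continue
        # shapiro() returns (statistic, p-value); keep the p-value.
        _, p_value = shapiro(values)
        pvalues[col] = p_value

    return pd.Series(pvalues, dtype=float).sort_values(ascending=False)
```

With the dataset loaded as df = pd.read_csv('path'), shapiro_pvalues(df).head() would list the columns with the highest p-values, and shapiro_pvalues(df).idxmax() would give the name of the top one.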