【Posted】: 2020-01-22 15:58:47
【Question】:
I am using the following code to print the missing-value count for each column, along with the column name.
# Look for missing data so it can be handled accordingly
import pandas as pd

def find_missing(data):
    # number of missing values per column
    count_missing = data.isnull().sum().values
    # total records
    total = data.shape[0]
    # fraction of missing values per column
    ratio_missing = count_missing / total
    # return a dataframe to show: feature name, # of missing and % of missing
    return pd.DataFrame(data={'missing_count': count_missing, 'missing_ratio': ratio_missing},
                        index=data.columns.values)

find_missing(data_final).head(5)
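What I want is to print only the columns that have missing values, because I have a huge dataset of about 150 columns.
The dataset looks like this: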
A    B    C    D
123  ABC  X    Y
123  ABC  X    Y
NaN  ABC  NaN  NaN
123  ABC  NaN  NaN
245  ABC  NaN  NaN
345  ABC  NaN  NaN
In the output I only want to see:
   missing_count  missing_ratio
C              4           0.66
D              4           0.66
and not columns A and B, since there are no missing values there.
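One possible approach, offered as a minimal sketch rather than a definitive answer: build the same summary, then keep only the rows whose missing_count is greater than zero using a boolean mask. The `sample` frame below is a hypothetical reconstruction of the table above, and the function uses its `data` argument rather than the global `data_final`. Note that in the sample as printed, column A also contains one NaN, so it would appear in the filtered result alongside C and D.

```python
import numpy as np
import pandas as pd


def find_missing(data: pd.DataFrame) -> pd.DataFrame:
    """Summarize missing values, keeping only columns that have at least one."""
    # isnull().sum() returns a Series indexed by column name
    count_missing = data.isnull().sum()
    # fraction of rows that are missing in each column
    ratio_missing = count_missing / len(data)
    summary = pd.DataFrame(
        {"missing_count": count_missing, "missing_ratio": ratio_missing}
    )
    # boolean mask: drop every column with zero missing values
    return summary[summary["missing_count"] > 0]


# Hypothetical reconstruction of the sample data from the question
sample = pd.DataFrame({
    "A": [123, 123, np.nan, 123, 245, 345],
    "B": ["ABC"] * 6,
    "C": ["X", "X", np.nan, np.nan, np.nan, np.nan],
    "D": ["Y", "Y", np.nan, np.nan, np.nan, np.nan],
})

result = find_missing(sample)
print(result)
```

With about 150 columns, it may also help to chain `.sort_values("missing_count", ascending=False)` onto the result so the worst columns come first.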
【Comments】:
Tags: python-3.x pandas missing-data