【Posted】:2015-12-13 14:00:44
【Question】:
How can I fill the NaN values in a categorical column of a Pandas DataFrame with that column's most frequent level?
The R randomForest package has the na.roughfix option, which is documented as returning:
A completed data matrix or data frame. For numeric variables, NAs are replaced with column medians. For factor variables, NAs are replaced with the most frequent levels (breaking ties at random). If object contains no NAs, it is returned unaltered.
In Pandas, for numeric variables, I can fill the NaN values with:
df = df.fillna(df.median())
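
For comparison, the behaviour quoted from na.roughfix could be approximated in Pandas roughly as follows. This is only a sketch under stated assumptions: the helper name `rough_fix` and the sample columns `x` and `color` are hypothetical, and ties between equally frequent levels are resolved by taking the first value returned by `Series.mode()` (which is sorted) rather than at random as in R.

```python
import pandas as pd


def rough_fix(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: median for numeric columns, most frequent value otherwise."""
    out = df.copy()  # leave the caller's frame untouched
    for col in out.columns:
        if not out[col].isna().any():
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(out[col]):
            # Numeric column: replace NaN with the column median.
            out[col] = out[col].fillna(out[col].median())
        else:
            # Non-numeric column: replace NaN with the most frequent level.
            # mode() ignores NaN by default and may return several values on a tie;
            # taking the first one is an assumption, not R's random tie-breaking.
            modes = out[col].mode(dropna=True)
            if not modes.empty:
                out[col] = out[col].fillna(modes.iloc[0])
    return out


# Made-up sample data for illustration only.
df = pd.DataFrame({
    "x": [1.0, None, 3.0, 5.0],
    "color": ["red", "blue", None, "red"],
})
filled = rough_fix(df)
print(filled)
```

A column that consists entirely of NaN is left unchanged here, since it has neither a median nor a most frequent level.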
【Discussion】: