【Question title】: Remove rows/columns with any/all NaN values
【Posted】: 2017-05-11 15:58:18
【Question】:

In Python with pandas, I can do the following:

# Drop columns with ANY missing values
df2 = df.dropna(axis=1, how="any")

# Drop columns with ALL missing values
df2 = df.dropna(axis=1, how="all")

# Drop rows with ANY missing values
df2 = df.dropna(axis=0, how="any")

# Drop rows with ALL missing values
df2 = df.dropna(axis=0, how="all")
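To make the four cases concrete, here is a minimal sketch on a small made-up frame. The column names and values are hypothetical and chosen only so that one row and one column are entirely NaN; they are not from my real data.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: row 2 and column "c" are entirely NaN.
df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0],
    "b": [np.nan, 5.0, np.nan, 6.0],
    "c": [np.nan, np.nan, np.nan, np.nan],
    "d": [7.0, 8.0, np.nan, 9.0],
})

# Columns: drop those containing any NaN / those that are all NaN.
cols_any = df.dropna(axis=1, how="any")
cols_all = df.dropna(axis=1, how="all")

# Rows: drop those containing any NaN / those that are all NaN.
rows_any = df.dropna(axis=0, how="any")
rows_all = df.dropna(axis=0, how="all")

print(list(cols_all.columns))  # columns that survive how="all"
print(list(rows_all.index))    # rows that survive how="all"
print(cols_any.shape, rows_any.shape)
```

Because an all-NaN column puts a NaN in every row (and an all-NaN row puts one in every column), the two how="any" calls leave nothing behind on this particular frame, while the how="all" calls remove only column "c" and row 2.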

How can I filter rows/columns in a similar way in an R data.table?

【Question comments】:

    Tags: python r pandas filter data.table


    【Solution 1】:

    We can use Reduce together with `|` or `&`:

    library(data.table)
    #Drop rows with any missing values
    setDT(df1)[df1[, !Reduce(`|`, lapply(.SD, is.na))]]
    #Drop rows with all missing values 
    setDT(df1)[df1[, !Reduce(`&`, lapply(.SD, is.na))]]
    
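    # Drop columns with any missing values
    Filter(function(x) !any(is.na(x)), df1)
    # Drop columns with all missing values
    Filter(function(x) !all(is.na(x)), df1)
    # Or, with data.table syntax: keep only columns with no NA (drops "any")
    setDT(df1)[, unlist(df1[, lapply(.SD, function(x) all(!is.na(x)))]), with = FALSE]
    # and keep columns with at least one non-NA value (drops "all")
    setDT(df1)[, unlist(df1[, lapply(.SD, function(x) any(!is.na(x)))]), with = FALSE]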
    

    Data

    set.seed(24)
    df1 <- as.data.table(matrix(sample(c(NA, 0:5), 4*5, replace=TRUE), ncol=4))
    df1[3] <- NA
    

    【Comments】:
