【Question title】: Replacing all NAs in a list of data frames in R
【Posted】: 2021-11-05 19:47:26
【Question】:

I have a list of data frames; a sample is shown below.

list(Al2O3 = structure(list(Determination_No = c(1, 2, 3, 4, 
5, 6, 7, 8, 9, 10), `2` = c(2.04, 2.07, 2.05, 2.07, 2.1, 2.08, 
NA, NA, NA, NA), `3` = c(2.08, 2.1, 2.08, 2.13, 2.1, 2.08, NA, 
NA, NA, NA), `4` = c(2.08, 2.08, 2.09, 2.06, 2.08, 2.07, 2.07, 
2.06, 2.08, 2.08), `5` = c(2.11, 2.09, 2.1, 2.08, 2.09, 2.09, 
NA, NA, NA, NA), `6` = c(2.12, 2.1, 2.1, 2.11, 2.1, 2.11, NA, 
NA, NA, NA), `7` = c(2.06, 2.05, 2.04, 2.05, 2.04, 2.03, NA, 
NA, NA, NA), `8` = c(2.078, 2.065, 2.057, 2.063, 2.067, 2.066, 
NA, NA, NA, NA), `10` = c(2.191776681, 2.153987428, 2.153987428, 
2.097303548, 2.116198175, 2.116198175, NA, NA, NA, NA), `12` = c(2.24, 
2.08, 2.12, 2.15, 2.15, 2.15, NA, NA, NA, NA), `36` = c(2.07, 
2.082, 2.048, 2.046, 2.086, 2.069, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-10L)), As = structure(list(Determination_No = c(1, 2, 3, 4, 
5, 6, 7, 8, 9, 10), `2` = c(0.002, 0.001, 0.001, 0.001, 0.002, 
0.001, NA, NA, NA, NA), `3` = c(0.003, 0.002, 0.002, 0.002, 0.001, 
0.002, NA, NA, NA, NA), `4` = c(0.001, 0.002, 0.001, 0.002, 0.002, 
0.002, 0.001, 0.002, 0.002, 0.003), `5` = c(0.002, 0.001, 0.001, 
0.001, 0.001, 0.002, NA, NA, NA, NA), `6` = c(NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_), `7` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `8` = c(NA, 
0.001, NA, NA, NA, NA, NA, NA, NA, NA), `10` = c(NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_), `12` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `36` = c(0.0053, 
0.0053, 0.0053, 0.00454, 0.0053, 0.0053, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-10L)), Ba = structure(list(Determination_No = c(1, 2, 3, 4, 
5, 6, 7, 8, 9, 10), `2` = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), 
    `3` = c(NA, NA, NA, NA, 0.001, NA, NA, NA, NA, NA), `4` = c(0.004, 
    0.003, 0.003, 0.004, 0.003, 0.002, 0.004, 0.002, 0.005, NA
    ), `5` = c(NA, NA, NA, NA, NA, 0.003, NA, NA, NA, NA), `6` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_), `7` = c(NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), `8` = c(0.002, 0.003, NA, NA, NA, 0.002, 
    NA, NA, NA, NA), `10` = c(NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
    ), `12` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `36` = c(0.00089566, 
    0.00089566, 0.00089566, 0.00089566, 0.00089566, 0.00089566, 
    NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-10L)))


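There are a lot of NA and NaN values. My data frames are now mainly for display purposes, so I would like to replace the NA and NaN values with blanks (empty strings).

I tried the following, without success: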

lapply(df.P, is.na) <- ""  
is.na[(df.P)] <- ""
Map[is.na(df.P)] <- "" 

I tend to get the following error message: object of type 'closure' is not subsettable

Any help is appreciated.
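
A note on the error message, offered as a likely reading rather than a confirmed diagnosis: in R a "closure" is a function, and square brackets index an object instead of calling it. Writing Map[...] therefore tries to subset the function Map itself, which fails before any replacement happens. The sketch below reproduces this on a small hypothetical list (the toy df.P here is made up, not the data above) and contrasts it with calling the function with round brackets.

```r
# Hypothetical toy stand-in for df.P (not the asker's data).
df.P <- list(A = data.frame(x = c(1, NA)), B = data.frame(x = c(NA, 2)))

# `Map` is a function; indexing it with `[` raises an error.
err <- tryCatch(Map[1], error = function(e) e)
inherits(err, "error")

# Calling the function with `(` and indexing the *data* with `[` works.
fixed <- Map(function(d) { d[is.na(d)] <- ""; d }, df.P)
fixed$A$x
```

The same distinction applies to is.na: is.na(x) is a call that returns a logical mask, while is.na[x] attempts to subset the function.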

【Comments】:

  • Another option: lapply(df.P, function(x) replace(x, is.na(x), ''))

Tags: r dataframe lapply


【Solution 1】:

The syntax to use with lapply is:

df <- lapply(df, function(x) {x[is.na(x)] <- '';x})
df

#$Al2O3
#   Determination_No    2    3    4    5    6    7     8          10   12    36
#1                 1 2.04 2.08 2.08 2.11 2.12 2.06 2.078 2.191776681 2.24  2.07
#2                 2 2.07  2.1 2.08 2.09  2.1 2.05 2.065 2.153987428 2.08 2.082
#3                 3 2.05 2.08 2.09  2.1  2.1 2.04 2.057 2.153987428 2.12 2.048
#4                 4 2.07 2.13 2.06 2.08 2.11 2.05 2.063 2.097303548 2.15 2.046
#5                 5  2.1  2.1 2.08 2.09  2.1 2.04 2.067 2.116198175 2.15 2.086
#6                 6 2.08 2.08 2.07 2.09 2.11 2.03 2.066 2.116198175 2.15 2.069
#7                 7           2.07                                            
#8                 8           2.06                                            
#9                 9           2.08                                            
#10               10           2.08    
#...                                        
#...

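【Comments】:

  • This runs without errors, but nothing changes. Could it have something to do with NA versus NA_real_?
  • No, I don't think so. It works fine for me on the shared data and replaces NA with ''. I have updated the answer to show the output I get. Note that I am using df; you may have named your object df.P.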
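
One possible cause of "runs without errors, but nothing changes" (an assumption, since the thread does not confirm it) is that the result of lapply was never assigned. lapply returns a new list and leaves its input untouched. A minimal sketch on a hypothetical one-element list:

```r
# Hypothetical toy list standing in for df.P.
df.P <- list(A = data.frame(x = c(1, NA, 3)))

# Calling lapply without assigning the result leaves df.P as it was.
invisible(lapply(df.P, function(x) { x[is.na(x)] <- ""; x }))
before <- anyNA(df.P$A$x)   # TRUE: the NA is still there

# Assigning the result back is what updates the object.
df.P <- lapply(df.P, function(x) { x[is.na(x)] <- ""; x })
after <- anyNA(df.P$A$x)    # FALSE: the NA is now ""
```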
【Solution 2】:

Write a small Vectorized function that replaces NA with ''. (The \(x) shorthand for an anonymous function requires R 4.1 or later.)

is.nav <- Vectorize(\(x) replace(x, is.na(x), ''), SIMPLIFY = FALSE)
is.nav(lst)
# $Al2O3
# Determination_No    2    3    4    5    6    7     8          10   12    36
# 1                 1 2.04 2.08 2.08 2.11 2.12 2.06 2.078 2.191776681 2.24  2.07
# 2                 2 2.07  2.1 2.08 2.09  2.1 2.05 2.065 2.153987428 2.08 2.082
# 3                 3 2.05 2.08 2.09  2.1  2.1 2.04 2.057 2.153987428 2.12 2.048
# 4                 4 2.07 2.13 2.06 2.08 2.11 2.05 2.063 2.097303548 2.15 2.046
# 5                 5  2.1  2.1 2.08 2.09  2.1 2.04 2.067 2.116198175 2.15 2.086
# 6                 6 2.08 2.08 2.07 2.09 2.11 2.03 2.066 2.116198175 2.15 2.069
# 7                 7           2.07                                            
# 8                 8           2.06                                            
# 9                 9           2.08                                            
# 10               10           2.08                                            
# 
# $As
# Determination_No     2     3     4     5 6 7     8 10 12      36
# 1                 1 0.002 0.003 0.001 0.002                  0.0053
# 2                 2 0.001 0.002 0.002 0.001     0.001        0.0053
# 3                 3 0.001 0.002 0.001 0.001                  0.0053
# 4                 4 0.001 0.002 0.002 0.001                 0.00454
# 5                 5 0.002 0.001 0.002 0.001                  0.0053
# 6                 6 0.001 0.002 0.002 0.002                  0.0053
# 7                 7             0.001                              
# 8                 8             0.002                              
# 9                 9             0.002                              
# 10               10             0.003                              
# 
# $Ba
# Determination_No 2     3     4     5 6 7     8 10 12         36
# 1                 1         0.004           0.002       0.00089566
# 2                 2         0.003           0.003       0.00089566
# 3                 3         0.003                       0.00089566
# 4                 4         0.004                       0.00089566
# 5                 5   0.001 0.003                       0.00089566
# 6                 6         0.002 0.003     0.002       0.00089566
# 7                 7         0.004                                 
# 8                 8         0.002                                 
# 9                 9         0.005                                 
# 10               10                                               
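
The mixed alignment in the output above (for example 2.1 next to 2.08) hints at a side effect worth knowing: any column that receives a blank is coerced to character, while columns without missing values keep their type. The sketch below shows this on a hypothetical three-row data frame, and also that is.na treats NaN as missing, so both NA and NaN are replaced.

```r
# Hypothetical toy data: one complete column, one with NaN and NA.
d <- data.frame(id = 1:3, v = c(2.04, NaN, NA))

out <- replace(d, is.na(d), "")
class(out$id)   # "integer": nothing was missing, so the column is untouched
class(out$v)    # "character": the blanks turn the whole column into text
out$v           # "2.04" "" ""
```

Because of this coercion, the blanked list is best kept as a display copy, with the original numeric list retained for any further calculation.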


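【Solution 3】:

I wrote a function for this.

my.na_replace <- function(s, replacedValue = 0)
{
    # Walk every cell of the data frame `s` and overwrite missing entries.
    for (i in seq_len(nrow(s)))
    {
        for (j in seq_len(ncol(s)))
        {
            if (is.na(s[i, j])) s[i, j] <- replacedValue
        }
    }
    return(s)
}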
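
The function above takes a single data frame and defaults to replacing with 0, so for the list in the question it would presumably be applied to each element with lapply and given "" as the replacement. A self-contained sketch, repeating the function in compact form and using a hypothetical toy list named lst:

```r
# Same logic as the answer's function, repeated so the example runs on its own.
my.na_replace <- function(s, replacedValue = 0) {
    for (i in seq_len(nrow(s))) {
        for (j in seq_len(ncol(s))) {
            if (is.na(s[i, j])) s[i, j] <- replacedValue
        }
    }
    s
}

# Hypothetical toy list of data frames.
lst <- list(A = data.frame(x = c(1, NA), y = c(NA, 0.5)))

zeros  <- lapply(lst, my.na_replace)                       # default: NA becomes 0
blanks <- lapply(lst, my.na_replace, replacedValue = "")   # NA becomes ""
```

The cell-by-cell loop is easy to follow but slow on large data frames; the mask-based form x[is.na(x)] <- "" from the other answers does the same work in one step.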
    


【Solution 4】:

You can try:

lapply(names(df.P), function(d) {df.P[[d]][is.na(df.P[[d]])] <- ""; df.P[[d]] })
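
One thing to watch with this form: lapply takes its result names from the names of its input, and names(df.P) is itself an unnamed character vector, so the returned list loses the element names (Al2O3, As, Ba). A sketch of one way to restore them with setNames, on a hypothetical two-element list:

```r
# Hypothetical toy list standing in for df.P.
df.P <- list(Al2O3 = data.frame(x = c(2.04, NA)),
             As    = data.frame(x = c(NA, 0.002)))

res <- lapply(names(df.P), function(d) {
    df.P[[d]][is.na(df.P[[d]])] <- ""   # modifies a local copy only
    df.P[[d]]
})
names(res)                          # NULL: the element names were dropped

res <- setNames(res, names(df.P))   # put the original names back
names(res)                          # "Al2O3" "As"
```

Iterating over df.P directly, as in the first answer, keeps the names without this extra step.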
      
