【Question title】: How to fill in missing values of a data.frame in R?
【Posted】: 2021-02-24 00:04:13
【Question】:

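I have multiple columns that contain missing values. I want to fill in the missing data in each column using the mean of the same day across all years. For example, DF is my fake data, in which two columns (A and X) have missing values.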

library(lubridate)
library(tidyverse)
library(naniar)

set.seed(123)

DF <- data.frame(Date = seq(as.Date("1985-01-01"), to = as.Date("1987-12-31"), by = "day"),
                 A = sample(1:10, 1095, replace = TRUE),
                 X = sample(5:15, 1095, replace = TRUE)) %>%
  replace_with_na(replace = list(A = 2, X = 5))

To fill in column A, I use the code below:

Fill_DF_A <- DF %>%
  mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>%
  group_by(Year, Day) %>%
  mutate(A = ifelse(is.na(A), mean(A, na.rm = TRUE), A))

My data.frame has many columns. How can I generalize this to fill in the missing values in all of them?

    Tags: r dataframe tidyverse na missing-data


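    【Solution 1】:

    We can use na.aggregate from zoo, which by default replaces each NA with the mean of the non-missing values (here, within each group):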

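    library(dplyr)
    library(lubridate)  # provides year(), month(), day()
    library(zoo)

    DF %>%
      mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>%
      group_by(Year, Day) %>%
      # na.aggregate() fills each NA with the mean of its group by default
      mutate(across(A:X, na.aggregate))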
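
To see what na.aggregate does inside a grouped mutate, here is a minimal sketch on a toy data frame. The names `toy`, `g`, and `v` are made up for illustration and are not part of the question's data.

```r
library(dplyr)
library(zoo)

# Hypothetical toy data: two groups, each with one missing value
toy <- data.frame(
  g = c("a", "a", "a", "b", "b"),
  v = c(1, NA, 3, 10, NA)
)

# Within each group, NA is replaced by the mean of that group's observed values
filled <- toy %>%
  group_by(g) %>%
  mutate(v = na.aggregate(v)) %>%
  ungroup()

print(filled$v)
# 1 2 3 10 10
```

Group "a" has observed values 1 and 3, so its NA becomes 2. Group "b" has a single observed value, 10, which is used as the fill.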
    

    Or, if we prefer to use a conditional statement:

    DF %>%
      mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>%
      group_by(Year, Day) %>%
      mutate(across(A:X, ~ case_when(is.na(.) ~ mean(., na.rm = TRUE),
                                     TRUE ~ as.numeric(.))))
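
One point worth checking: the question asks for the mean of the same day across all years, whereas grouping by Year and Day averages over the same day of the month within a single year. If the first reading is what is wanted, grouping by Month and Day instead may be closer. The sketch below is one possible variant, not the answer's original code. It uses a small made-up data frame `toy` and base R's replace() in place of case_when.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data: the same two calendar days observed in three years
toy <- data.frame(
  Date = as.Date(c("1985-01-01", "1986-01-01", "1987-01-01",
                   "1985-01-02", "1986-01-02", "1987-01-02")),
  A = c(4, NA, 8, 1, 3, NA),
  X = c(NA, 10, 12, 7, NA, 9)
)

filled <- toy %>%
  mutate(Month = month(Date), Day = day(Date)) %>%
  group_by(Month, Day) %>%  # pools the same calendar day from every year
  mutate(across(c(A, X), ~ replace(.x, is.na(.x), mean(.x, na.rm = TRUE)))) %>%
  ungroup() %>%
  select(-Month, -Day)      # drop the helper columns

print(filled)
```

If a calendar day is missing in every year, mean(..., na.rm = TRUE) returns NaN for that group, so those cells would still need separate handling.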
    
