Replacing NA depending on distribution type of gender in R

【Question title】: Replacing NA depending on distribution type of gender in R
【Posted】: 2018-07-13 13:47:37
【Question description】:

I select the rows containing NA values like this:

data[data=="na"] <- NA
data[!complete.cases(data),]

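I need to replace these missing values, but the replacement depends on the distribution type. If shapiro.test indicates that a variable is not normally distributed, the missing values must be replaced with the median; if it is normally distributed, they should be replaced with the mean. The distribution has to be assessed separately for each sex (1 = girl, 2 = man).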

data=structure(list(sex = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), emotion = c(20L, 
15L, 49L, NA, 34L, 35L, 54L, 45L), IQ = c(101L, 98L, 105L, NA, 
123L, 120L, 115L, NA)), .Names = c("sex", "emotion", "IQ"), class = "data.frame", row.names = c(NA, 
-8L))

Desired output

sex emotion IQ
1   20  101
1   15  98
1   49  105
1   28  101
2   34  123
2   35  120
2   54  115
2   45  119

【Question discussion】:

Tags: r dplyr plyr na

【Solution 1】:

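The following code replaces the NA values within each sex group according to the result of the Shapiro-Wilk test (mean if p > 0.05, median otherwise):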

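library(dplyr)

data %>% 
 group_by(sex) %>%   # run the test and compute the fill value per sex
 mutate(
  # keep observed values; fill NAs with the group mean if Shapiro-Wilk
  # does not reject normality (p > 0.05), otherwise with the group median
  emotion = ifelse(!is.na(emotion), emotion,
   ifelse(shapiro.test(emotion)$p.value > 0.05,
    mean(emotion, na.rm=TRUE), quantile(emotion, na.rm=TRUE, probs=0.5) ) ),
  # same rule for IQ
  IQ = ifelse(!is.na(IQ), IQ,
   ifelse(shapiro.test(IQ)$p.value > 0.05,
    mean(IQ, na.rm=TRUE), quantile(IQ, na.rm=TRUE, probs=0.5) )
  )
 )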
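
The same rule can also be written once as a helper and applied to several columns. The following is only a minimal base-R sketch, not part of the original answer: the function name `impute_by_normality`, its `alpha` argument, and the fallback to the median when `shapiro.test` cannot be run (fewer than 3 or more than 5000 non-missing values, or all values identical) are assumptions made for illustration.

```r
# Hypothetical helper (an assumption, not from the original answer):
# fill the NAs of one numeric vector with the mean if Shapiro-Wilk does not
# reject normality at level alpha, otherwise with the median.
impute_by_normality <- function(x, alpha = 0.05) {
  x <- as.numeric(x)
  observed <- x[!is.na(x)]
  # Nothing to fill, or nothing to fill with: return the vector unchanged
  if (!anyNA(x) || length(observed) == 0) return(x)
  # shapiro.test() needs 3 to 5000 non-missing values that are not all equal;
  # using the median when it cannot be run is an assumption of this sketch
  testable <- length(observed) >= 3 && length(observed) <= 5000 &&
    length(unique(observed)) > 1
  use_mean <- testable && shapiro.test(observed)$p.value > alpha
  x[is.na(x)] <- if (use_mean) mean(observed) else median(observed)
  x
}

# Sample data from the question
data <- data.frame(
  sex     = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  emotion = c(20L, 15L, 49L, NA, 34L, 35L, 54L, 45L),
  IQ      = c(101L, 98L, 105L, NA, 123L, 120L, 115L, NA)
)

# Apply the helper to each column separately within each sex group
filled <- data
for (col in c("emotion", "IQ")) {
  filled[[col]] <- ave(as.numeric(data[[col]]), data$sex,
                       FUN = impute_by_normality)
}
print(filled)
```

The filled values are not rounded here; if whole numbers are wanted, as in the desired output, wrapping the result in `round()` is one option.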

【Discussion】: