【Question title】: Filtering for minimum values in multiple columns in R
【Posted】: 2020-10-08 15:31:37
【Question】:

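Apologies in advance if this question is not formatted correctly; I am new to R and to the SO community, and I welcome constructive criticism. I have a data frame that looks like this, and I am trying to filter it so that it contains only the minimum "Cars" and "Houses" for each person.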

my_data = data.frame("Name" = c("Dora", "Dora", "John", "John", "Marie", "Marie"), 
"Cars" = c(2, 3, NA, NA, 4, 1), 
"Houses" = c(NA, NA, 4, 3, 2, NA))
#Name   Cars   Houses
#1  Dora    2     NA
#2  Dora    3     NA
#3  John   NA      4
#4  John   NA      3
#5 Marie    4     2
#6 Marie    1     NA

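I would like to get a result like this (note in particular that the Marie row has changed, though it would also be fine if it were split into two rows):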

#Name   Cars   Houses
#Dora    2     NA
#John   NA     3
#Marie   1     2

Or like this:

#Name   Cars   Houses
#Dora    2     NA
#John   NA      3
#Marie   NA     2
#Marie    1     NA

Based on other answers, I have tried

my_data %>%
group_by(Name) %>%
filter(Cars == min(Cars))
#Name   Cars    Houses
#Dora   2       NA
#Marie  1       NA

But this drops the John rows before I can filter for the minimum Houses. Does anyone have suggestions on how to get around this? Thanks in advance.

【Comments】:

    Tags: r data-cleaning data-wrangling


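    【Solution 1】:

    We can use summarise to get the minimum of each column for each name. Note that with na.rm = TRUE, min() returns Inf (and emits a warning) for a group whose values are all NA, which is why Inf appears in the output: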

    my_data = data.frame("Name" = c("Dora", "Dora", "John", "John", "Marie", "Marie"), 
    "Cars" = c(2, 3, NA, NA, 4, 1), 
    "Houses" = c(NA, NA, 4, 3, 2, NA))
    
    library(dplyr)
    my_data %>% 
      group_by(Name) %>% 
      summarise(Cars = min(Cars, na.rm = TRUE),
                Houses = min(Houses, na.rm = TRUE))
    
    `summarise()` ungrouping output (override with `.groups` argument)
    # A tibble: 3 x 3
      Name   Cars Houses
      <chr> <dbl>  <dbl>
    1 Dora      2    Inf
    2 John    Inf      3
    3 Marie     1      2
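
If you would rather get NA than Inf for people with no recorded values (as in the desired output in the question), one possible tweak is sketched below. It assumes dplyr 1.0.0 or later for `across()`, and `min_or_na` is a hypothetical helper name introduced here for illustration, not a dplyr function.

```r
library(dplyr)

my_data <- data.frame(
  Name   = c("Dora", "Dora", "John", "John", "Marie", "Marie"),
  Cars   = c(2, 3, NA, NA, 4, 1),
  Houses = c(NA, NA, 4, 3, 2, NA)
)

# Hypothetical helper: minimum ignoring NAs, but NA when every value is missing
min_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
}

# Apply the helper to both numeric columns within each Name group
result <- my_data %>%
  group_by(Name) %>%
  summarise(across(c(Cars, Houses), min_or_na), .groups = "drop")

print(result)
```

Checking for the all-NA case before calling `min()` also avoids the "no non-missing arguments to min" warning.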
    

    【Comments】:

      【Solution 2】:

      Here is something you can do in base R:

      df <- data.frame("Name" = c("Dora", "Dora", "John", "John", "Marie", "Marie"), 
                           "Cars" = c(2, 3, NA, NA, 4, 1), 
                           "Houses" = c(NA, NA, 4, 3, 2, NA), stringsAsFactors = FALSE)
      
      aggregate(df, list(df$Name), FUN = function(x) min(x, na.rm = TRUE))[,-1]
      

      Output

         Name Cars Houses
      1  Dora    2    Inf
      2  John  Inf      3
      3 Marie    1      2
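
The Inf values above come from calling `min()` on groups that contain only NA. As a rough sketch of one way to clean that up in base R, the aggregated result can be post-processed so that infinite values become NA. The variable names `res` and `num_cols` are just illustrative, and `suppressWarnings()` is used here only to silence the expected "no non-missing arguments to min" warning.

```r
df <- data.frame(
  Name   = c("Dora", "Dora", "John", "John", "Marie", "Marie"),
  Cars   = c(2, 3, NA, NA, 4, 1),
  Houses = c(NA, NA, 4, 3, 2, NA),
  stringsAsFactors = FALSE
)

# Aggregate only the numeric columns, grouping by Name
res <- suppressWarnings(
  aggregate(df[c("Cars", "Houses")],
            by = list(Name = df$Name),
            FUN = function(x) min(x, na.rm = TRUE))
)

# Replace the Inf produced for all-NA groups with NA
num_cols <- c("Cars", "Houses")
res[num_cols] <- lapply(res[num_cols], function(col) {
  replace(col, is.infinite(col), NA)
})

print(res)
```

Grouping via `by = list(Name = df$Name)` keeps the grouping column named `Name` directly, so there is no extra `Group.1` column to drop afterwards.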
      

      【Comments】:
