【Question title】: Conditionally mutate multiple columns in R
【Posted】: 2020-04-24 00:37:58
【Question description】:

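I have a data frame with a factor column that has j levels, as well as j vectors of length k. I would like to populate k columns in the data frame with the values from those vectors, conditional on the factor.

Simplified example (three levels, three vectors, two values each):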

df1 <- data.frame("Factor" = rep(c("A", "B", "C"), times = 5))
vecA <- c(1, 2)
vecB <- c(2, 1)
vecC <- c(3, 3)

Here is a solution using nested ifelse statements:

library(tidyverse)
df1 %>%
  mutate(V1 = ifelse(Factor == "A", vecA[1], 
                     ifelse(Factor == "B", vecB[1], vecC[1])),
         V2 = ifelse(Factor == "A", vecA[2], 
                     ifelse(Factor == "B", vecB[2], vecC[2])))

I would like to avoid the nested ifelse statements. Ideally, I would also like to avoid mutating each column separately.

【Comments】:

    Tags: r if-statement dplyr


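    【Solution 1】:

    Here is one idea. Use mget() to fetch every object in the global environment whose name matches the pattern "vec" (here vecA, vecB, and vecC); this returns a list. For each element of the list, paste its numbers together with "_" in between. Then strip "vec" from the names of the resulting vector so that they match the factor levels for the join that follows. After the join, split the values column with cSplit(). I hope this approach works for your real case.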

    library(tidyverse)
    library(splitstackshape)
    
    # Create a character vector.
    mychr <- map_chr(.x = mget(ls(pattern = "vec")),
                     .f = function(x) {paste0(x, collapse = "_")})
    
    # Remove "vec" in names.
    names(mychr) <- sub(x = names(mychr), pattern = "vec", replacement = "")
    
    #   A     B     C 
    #"1_2" "2_1" "3_3"
    
    # stack() creates a data frame. Use it in left_join().
    # Then, split the column, values into two columns. You probably have more than
    # two. So I decided to use cSplit() here.
    
    left_join(df1, stack(mychr), by = c("Factor" = "ind")) %>%
    cSplit(splitCols = "values", sep = "_", direction = "wide", type.convert = FALSE)
    
    #    Factor values_1 values_2
    # 1:      A        1        2
    # 2:      B        2        1
    # 3:      C        3        3
    # 4:      A        1        2
    # 5:      B        2        1
    # 6:      C        3        3
    # 7:      A        1        2
    # 8:      B        2        1
    # 9:      C        3        3
    #10:      A        1        2
    #11:      B        2        1
    #12:      C        3        3
    #13:      A        1        2
    #14:      B        2        1
    #15:      C        3        3
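
The paste-and-split round trip can be skipped by keeping the values numeric in a small lookup table. The following is a minimal base R sketch, assuming the data and vectors from the question; the names `lookup`, `idx`, and `result`, and the `V1`, `V2` column labels, are illustrative choices rather than part of the answer above.

```r
# Data from the question.
df1 <- data.frame(Factor = rep(c("A", "B", "C"), times = 5))
vecA <- c(1, 2)
vecB <- c(2, 1)
vecC <- c(3, 3)

# Lookup table (hypothetical name): one row per factor level,
# one column per value.
lookup <- data.frame(
  Factor = c("A", "B", "C"),
  rbind(vecA, vecB, vecC),
  row.names = NULL
)
names(lookup)[-1] <- paste0("V", seq_len(ncol(lookup) - 1))

# match() finds, for each row of df1, the matching row of the lookup table,
# so the original row order of df1 is preserved.
idx <- match(as.character(df1$Factor), lookup$Factor)
result <- cbind(df1, lookup[idx, -1, drop = FALSE])
rownames(result) <- NULL
```

Because the columns are taken from the lookup table as a block, the same code should work for any k without listing the columns one by one.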
    

    【Comments】:

      【Solution 2】:

      Here is a base R option:

      df1[c('V1', 'V2')] <- do.call(Map, c(f = c, mget(ls(pattern="^vec[A-C]$"))))
      df1
      #    Factor V1 V2
      #1       A  1  2
      #2       B  2  1
      #3       C  3  3
      #4       A  1  2
      #5       B  2  1
      #6       C  3  3
      #7       A  1  2
      #8       B  2  1
      #9       C  3  3
      #10      A  1  2
      #11      B  2  1
      #12      C  3  3
      #13      A  1  2
      #14      B  2  1
      #15      C  3  3
      

      Or with transpose from purrr:

      library(dplyr)
      library(purrr)
      mget(ls(pattern="^vec[A-C]$")) %>% 
           transpose %>% 
           setNames(c('V1', 'V2')) %>% 
           cbind(df1, .)
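
Note that the base R option above fills the columns by recycling: `Map` yields the first and second values of vecA, vecB, vecC in that order, which lines up with the rows only because Factor cycles through A, B, C in the same order. A minimal sketch of a lookup that conditions on the factor value itself, assuming the vectors from the question; `df_shuffled`, `lookup`, and `vals` are hypothetical names, and the shuffled data frame is made up to show that row order no longer matters.

```r
vecA <- c(1, 2)
vecB <- c(2, 1)
vecC <- c(3, 3)

# Hypothetical data frame whose factor values are NOT in A, B, C order.
df_shuffled <- data.frame(Factor = c("C", "A", "B", "A", "C"))

# Matrix with one row per factor level, named after the level.
lookup <- rbind(A = vecA, B = vecB, C = vecC)

# Index the rows by name, once per row of the data frame, then drop the
# dimnames so they do not leak into the result.
vals <- unname(lookup[as.character(df_shuffled$Factor), , drop = FALSE])

# Assign all k columns in one step.
df_shuffled[paste0("V", seq_len(ncol(vals)))] <- vals
```

Indexing by `as.character()` keeps the lookup by name whether Factor is stored as a character vector or as a factor.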
      

      【Comments】:

        【Solution 3】:

        Here is one approach:

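        # collect the vectors in a list named by factor level
        l <- list('A' = vecA, 'B' = vecB, 'C' = vecC)
        
        # look up the vector for each row; as.character() makes the lookup go by
        # name even if Factor is stored as a factor
        df2 <- data.frame(t(sapply(as.character(df1$Factor), function(x) l[[x]],
                                   USE.NAMES = FALSE)))
        colnames(df2) <- c('V1', 'V2')
        
        new_df <- cbind(df1, df2)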
        
           Factor V1 V2
        1       A  1  2
        2       B  2  1
        3       C  3  3
        4       A  1  2
        5       B  2  1
        6       C  3  3
        7       A  1  2
        8       B  2  1
        9       C  3  3
        10      A  1  2
        11      B  2  1
        12      C  3  3
        13      A  1  2
        14      B  2  1
        15      C  3  3
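
The per-row `sapply` call and the transpose can also be avoided by indexing the named list directly and stacking the pieces. This is a minimal sketch of that variant, assuming the same named list as above; `vals` is a hypothetical name and the `V1`, `V2` labels are generated rather than typed out.

```r
df1 <- data.frame(Factor = rep(c("A", "B", "C"), times = 5))
l <- list(A = c(1, 2), B = c(2, 1), C = c(3, 3))

# Pick one list element per row of df1 (by name), then stack them into a
# matrix with one row per data frame row and one column per value.
vals <- do.call(rbind, unname(l[as.character(df1$Factor)]))
colnames(vals) <- paste0("V", seq_len(ncol(vals)))

new_df <- cbind(df1, vals)
```

Generating the column names from `ncol(vals)` means the code does not need to change when the vectors hold more than two values.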
        

        【Comments】:
