【Question title】: Summing across rows for certain columns, keep NA if all NA
【Posted】: 2020-10-30 12:58:40
【Question】:

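I have clinical data with a number of different binary outcomes, but I only want to add up a few of them to create a total outcome / composite score. My data looks like this: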

patientid <- c(100,101,102,103,104,105,106)
outcome1 <- c(0,NA,1,0,1,NA,1)
outcome2 <- c(0,1,1,0,0,NA,1)
outcome3 <- c(0,NA,NA,0,1,NA,0)
outcome4 <- c(NA,NA,NA,0,1,NA,0)
Data <- data.frame(patientid=patientid,outcome1=outcome1,outcome2=outcome2,outcome3=outcome3,outcome4=outcome4)
Data

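Now I want to create a composite score from three of these outcomes. An NA should count as zero, unless every outcome selected for the sum is NA, in which case the result should stay NA. I assume this is done with rowSums? This is what I want the resulting data to look like (summing only outcomes 1, 2, and 4):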

patientid <- c(100,101,102,103,104,105,106)
outcome1 <- c(0,NA,1,0,1,NA,1)
outcome2 <- c(0,1,1,0,1,NA,1)
outcome3 <- c(0,NA,NA,0,1,NA,0)
outcome4 <- c(NA,NA,NA,0,1,NA,0)
composite <- c(0,1,2,0,3,NA,2)
Data <- data.frame(patientid=patientid,outcome1=outcome1,outcome2=outcome2,outcome3=outcome3,outcome4=outcome4,composite=composite)
Data

【Discussion】:

    Tags: r data-cleaning


    【Solution 1】:

    In base R, you can use rowSums:

    #select the columns that we want to count
    cols <- paste0('outcome', c(1:2, 4))
    #sum them rowwise
    Data$composite <- rowSums(Data[cols], na.rm = TRUE)
    #turn all NA rows to NA.
    Data$composite[rowSums(!is.na(Data[cols])) == 0] <- NA
    Data
    
    #  patientid outcome1 outcome2 outcome3 outcome4 composite
    #1       100        0        0        0       NA         0
    #2       101       NA        1       NA       NA         1
    #3       102        1        1       NA       NA         2
    #4       103        0        0        0        0         0
    #5       104        1        0        1        1         2
    #6       105       NA       NA       NA       NA        NA
    #7       106        1        1        0        0         2
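
The two steps above can also be wrapped in a small reusable function. This is only a sketch: the name `row_sum_keep_na` is made up for illustration, and the data frame is rebuilt from the question so the block runs on its own.

```r
# Hypothetical helper (name chosen for illustration): sum the selected
# columns row by row, counting NA as 0, but return NA for rows in which
# every selected column is NA.
row_sum_keep_na <- function(df, cols) {
  vals <- df[cols]
  total <- rowSums(vals, na.rm = TRUE)
  all_missing <- rowSums(!is.na(vals)) == 0
  total[all_missing] <- NA
  unname(total)
}

# Input data as given in the question
Data <- data.frame(
  patientid = c(100, 101, 102, 103, 104, 105, 106),
  outcome1  = c(0, NA, 1, 0, 1, NA, 1),
  outcome2  = c(0, 1, 1, 0, 0, NA, 1),
  outcome3  = c(0, NA, NA, 0, 1, NA, 0),
  outcome4  = c(NA, NA, NA, 0, 1, NA, 0)
)

Data$composite <- row_sum_keep_na(Data, c("outcome1", "outcome2", "outcome4"))
Data$composite
# 0 1 2 0 2 NA 2
```

Passing the column names as an argument makes it easy to build other composites from a different subset of outcomes.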
    

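    【Discussion】:

      【Solution 2】:

      Try this approach using c_across(). I am a little confused about why some columns in the desired output differ from the original data. You can combine c_across() with rowwise() to sum the selected columns within each row, and then flag the rows in which all of them are NA. The code is below: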

      library(tidyverse)
      #Code
      NewData <- Data %>% rowwise(patientid) %>% 
        mutate(Composite=sum(c_across(c(outcome1,outcome2,outcome4)),na.rm=TRUE)) %>%
        mutate(Flag=ifelse(sum(!is.na(c_across(c(outcome1,outcome2,outcome4))))==0,1,0),
               Composite=ifelse(Flag==1,NA,Composite)) %>% select(-Flag)
      

      Output:

      # A tibble: 7 x 6
      # Rowwise:  patientid
        patientid outcome1 outcome2 outcome3 outcome4 Composite
            <dbl>    <dbl>    <dbl>    <dbl>    <dbl>     <dbl>
      1       100        0        0        0       NA         0
      2       101       NA        1       NA       NA         1
      3       102        1        1       NA       NA         2
      4       103        0        0        0        0         0
      5       104        1        0        1        1         2
      6       105       NA       NA       NA       NA        NA
      7       106        1        1        0        0         2
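
The difference between the question's input and its desired output appears to come from `outcome2`: the two listings in the question give different values for one patient. A quick base R check, using only the vectors copied from the question, locates it:

```r
# outcome2 as given in the question's input data and in its desired result
outcome2_input   <- c(0, 1, 1, 0, 0, NA, 1)
outcome2_desired <- c(0, 1, 1, 0, 1, NA, 1)
patientid <- c(100, 101, 102, 103, 104, 105, 106)

# Positions where the two vectors disagree; which() skips the NA comparison
differs <- which(outcome2_input != outcome2_desired)
patientid[differs]
# 104
```

With `outcome2 = 1` for patient 104 the composite would be 1 + 1 + 1 = 3, as in the question's desired output; with the input value of 0 it is 2, as in the outputs shown in the answers.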
      

      【Discussion】:

        【Solution 3】:
        library(tidyverse)
        
        Data %>%
          rowwise() %>%
          mutate(
            Composite = if_else(
              c(outcome1, outcome2, outcome4) %>% is.na() %>% mean() %>% `==`(1), # TRUE for rows where all three selected columns are NA
              NA_real_, # rows that are entirely NA produce NA
              c(outcome1, outcome2, outcome4) %>% sum(na.rm = TRUE) # for all other rows, NAs are treated as 0s
              )
          )
        
        #  patientid outcome1 outcome2 outcome3 outcome4 composite
        #1       100        0        0        0       NA         0
        #2       101       NA        1       NA       NA         1
        #3       102        1        1       NA       NA         2
        #4       103        0        0        0        0         0
        #5       104        1        0        1        1         2
        #6       105       NA       NA       NA       NA        NA
        #7       106        1        1        0        0         2
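
The explicit all-NA check is needed because `sum()` with `na.rm = TRUE` returns 0, not NA, when every value is missing. A minimal base R sketch of that behaviour follows; the helper name `sum_or_na` is hypothetical and only used for illustration.

```r
all_missing    <- c(NA_real_, NA_real_, NA_real_)
partly_missing <- c(NA, 1, NA)

# With na.rm = TRUE the sum of an all-NA vector is the empty sum, i.e. 0
sum(all_missing, na.rm = TRUE)     # 0
sum(partly_missing, na.rm = TRUE)  # 1

# Hypothetical helper: keep NA when everything is missing,
# otherwise sum while treating NA as 0
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

sum_or_na(all_missing)     # NA
sum_or_na(partly_missing)  # 1
```

The same reasoning applies to `rowSums(..., na.rm = TRUE)`, which is why the base R answer overwrites the all-NA rows in a second step.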
        

        【Discussion】:
