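【Question title】: How to combine specific data across multiple rows in a dataframe in R
【Posted】: 2020-01-08 16:25:35
【Question description】:

I want to change my dataframe by combining the cell values of one column across rows in which all the other columns are identical. (I am not sure whether "concatenate" or "reshape" is the right word for this.)

Basically, I have something like this: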

    >df
    >Person_id     System_id    Category    Type    Tag
    >1A            134          1            Chr     Question
    >1A            134          1            Chr     Answer
    >1A            134          1            Chr     Evaluation
    >1A            134          1            Chr     Overall
    >1A            134          1            Chr     Analysis
    >Z4            002          1            Chr     Question
    >Z4            002          1            Chr     Answer

and I want it to look like this:

    >Person_id     System_id    Category    Type    Tag
    >1A            134          1            Chr     Question, Answer, Evaluation, Overall, Analysis
    >Z4            002          1            Chr     Question, Answer

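The tags do not have to be separated by commas; a space would be fine. Any ideas on where to look for a solution would be helpful.

Thanks.

【Question comments】:

Tags: r dataframe dplyr reshape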


【Solution 1】:

We can group by the first four columns and paste the 'Tag' elements together:

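library(dplyr)
df %>%
   # group by columns 1 to 4, selected by position
   group_by_at(1:4) %>%
   # collapse each group's Tag values into one comma-separated string
   summarise(Tag = toString(Tag))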
# A tibble: 2 x 5
# Groups:   Person_id, System_id, Category [2]
#  Person_id System_id Category Type  Tag                                            
#  <chr>         <int>    <int> <chr> <chr>                                          
#1 1A              134        1 Chr   Question, Answer, Evaluation, Overall, Analysis
#2 Z4                2        1 Chr   Question, Answer    
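
As of dplyr 1.0.0, `group_by_at()` is marked as superseded in favour of `across()`. The following is a minimal sketch of the same grouping in that style, assuming dplyr 1.0.0 or later is installed. The `.groups = "drop"` argument is an addition that is not part of the original answer; it returns an ungrouped tibble instead of one that is still grouped by the first three columns.

```r
library(dplyr)

# Same sample data as in the question
df <- data.frame(
  Person_id = c("1A", "1A", "1A", "1A", "1A", "Z4", "Z4"),
  System_id = c(134L, 134L, 134L, 134L, 134L, 2L, 2L),
  Category  = 1L,
  Type      = "Chr",
  Tag       = c("Question", "Answer", "Evaluation", "Overall", "Analysis",
                "Question", "Answer"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # across(1:4) selects the first four columns by position
  group_by(across(1:4)) %>%
  # .groups = "drop" removes all grouping from the summarised result
  summarise(Tag = toString(Tag), .groups = "drop")

print(result)
```

Selecting by position is compact but breaks silently if the column order changes; naming the columns in `group_by()` is the more defensive choice.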

Or using base R:

aggregate(Tag ~ ., df, toString)
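
In the formula `Tag ~ .`, the dot stands for every column of `df` other than `Tag`, so those columns become the grouping variables. The sketch below spells the grouping columns out by name for comparison. Note that `aggregate()` does not necessarily return the groups in the order in which they first appear in the data, so the example looks rows up by `Person_id` rather than by position.

```r
# Same sample data as in the question
df <- data.frame(
  Person_id = c("1A", "1A", "1A", "1A", "1A", "Z4", "Z4"),
  System_id = c(134L, 134L, 134L, 134L, 134L, 2L, 2L),
  Category  = 1L,
  Type      = "Chr",
  Tag       = c("Question", "Answer", "Evaluation", "Overall", "Analysis",
                "Question", "Answer"),
  stringsAsFactors = FALSE
)

# `.` expands to all columns except the one on the left-hand side
by_dot <- aggregate(Tag ~ ., data = df, FUN = toString)

# The same grouping written out explicitly
by_name <- aggregate(Tag ~ Person_id + System_id + Category + Type,
                     data = df, FUN = toString)

print(by_dot)
print(by_name)

# Look up one person's combined tags without relying on row order
by_dot$Tag[by_dot$Person_id == "Z4"]
```

The explicit form is the safer one when the data frame carries extra columns that should not take part in the grouping.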

Note: toString is a convenience wrapper for paste(., collapse=", ").

Data
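
Since the question mentions that a space would also do as a separator, here is a small base R sketch showing that `toString()` and `paste(collapse = ", ")` give the same string, and how swapping the `collapse` value yields space-separated tags. The names `tags`, `small` and `spaced` are illustrative only.

```r
tags <- c("Question", "Answer", "Evaluation")

# Both calls collapse the vector into a single comma-separated string
toString(tags)                # "Question, Answer, Evaluation"
paste(tags, collapse = ", ")  # "Question, Answer, Evaluation"

# Changing collapse gives a space-separated string instead
paste(tags, collapse = " ")   # "Question Answer Evaluation"

# The same idea inside aggregate(), using a small illustrative data frame
small <- data.frame(
  Person_id = c("1A", "1A", "Z4"),
  Tag       = c("Question", "Answer", "Question"),
  stringsAsFactors = FALSE
)
spaced <- aggregate(Tag ~ Person_id, data = small,
                    FUN = function(x) paste(x, collapse = " "))
print(spaced)
```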

df <- structure(list(Person_id = c("1A", "1A", "1A", "1A", "1A", "Z4", 
"Z4"), System_id = c(134L, 134L, 134L, 134L, 134L, 2L, 2L), Category = c(1L, 
1L, 1L, 1L, 1L, 1L, 1L), Type = c("Chr", "Chr", "Chr", "Chr", 
"Chr", "Chr", "Chr"), Tag = c("Question", "Answer", "Evaluation", 
"Overall", "Analysis", "Question", "Answer")), 
 class = "data.frame", row.names = c(NA, 
-7L))

【Comments】:

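    【Solution 2】:

    You can do this with paste0 and collapse = ", ":

    library(dplyr)
    df %>%
      # group by the four columns that are identical within each person
      group_by(Person_id, System_id, Category, Type) %>%
      # join each group's Tag values with ", "
      summarise(Tag = paste0(Tag, collapse = ", "))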
    
    #Person_id System_id Category Type  Tag                                            
    #  <chr>         <int>    <int> <chr> <chr>                                          
    #1 1A              134        1 Chr   Question, Answer, Evaluation, Overall, Analysis
    #2 Z4                2        1 Chr   Question, Answer
    

    【Comments】:
