Sum columns with similar names/prefixes in R: answers

【Question title】: Sum columns with similar names/prefixes in R
【Posted】: 2021-08-29 06:24:21
【Question】:

data <- data.frame(a_1 = 1:5, a_1.1 = 3:7, a_1.2 = 5:9, b_1 = 4:8, b_1.1 = 7:11)

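I want to have a_1 = a_1 + a_1.1 + a_1.2 and b_1 = b_1 + b_1.1, while still keeping the columns as a_1, a_1.1, a_1.2, b_1, b_1.1.

I have several words_number.number columns, so I'd like a function or a concise code snippet, but any solution would be appreciated!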

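【Comments】:

Tags: r tidyverse data-wrangling

【Solution 1】:

We can use split.default to split the data into groups of columns based on a substring of the names (everything before the first dot), then take rowSums over each element of the resulting list: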

cbind(data, sapply(split.default(data, 
       sub("\\..*", "", names(data))), rowSums))
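The call above appends the sums as additional columns named a_1 and b_1 alongside the originals. If the goal is instead to overwrite the base columns, as the question describes, one possible sketch follows. It assumes each group's base column is named exactly like the group key; `key`, `sums` and `result` are illustrative names, not part of the original answer.

```r
data <- data.frame(a_1 = 1:5, a_1.1 = 3:7, a_1.2 = 5:9, b_1 = 4:8, b_1.1 = 7:11)

# Group key: each column name with everything from the first dot removed,
# e.g. "a_1.2" becomes "a_1"
key <- sub("\\..*", "", names(data))

# One column of row sums per group, as a matrix whose columns are named by key
sums <- sapply(split.default(data, key), rowSums)

# Overwrite the base columns in a copy; the suffixed columns are left untouched
result <- data
result[colnames(sums)] <- as.data.frame(sums)
result
```

Working on a copy keeps `data` intact, so the sums are not added a second time if the code is rerun.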

【Comments】:

【Solution 2】:

Here is a tidyverse solution you could use:

library(dplyr)
library(stringr)
library(purrr)

names(data) %>%
  str_extract("^[:alpha:]") %>%
  unique() %>%
  map_dfc(~ data %>% select(contains(.x)) %>% 
            reduce(~ ..1 + ..2)) %>%
  set_names(~ letters[seq_along(.)]) %>%
  bind_cols(data) %>%
  relocate(c(a, b), .after = last_col())

# A tibble: 5 x 7
    a_1 a_1.1 a_1.2   b_1 b_1.1     a     b
  <int> <int> <int> <int> <int> <int> <int>
1     1     3     5     4     7     9    11
2     2     4     6     5     8    12    13
3     3     5     7     6     9    15    15
4     4     6     8     7    10    18    17
5     5     7     9     8    11    21    19
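Note that the pipeline above keys each group on the first letter of the column name and selects with contains(), which suits single-letter prefixes such as a and b. For the words_number.number columns mentioned in the question, a variant that groups on the full name before the first dot might look like the sketch below. The `_sum` suffix and the names `key`, `groups`, `sums` and `result` are assumptions made for illustration.

```r
library(dplyr)
library(purrr)

data <- data.frame(a_1 = 1:5, a_1.1 = 3:7, a_1.2 = 5:9, b_1 = 4:8, b_1.1 = 7:11)

# Full group key per column: the name up to (not including) the first dot
key <- sub("\\..*", "", names(data))
groups <- unique(key)

sums <- groups %>%
  set_names(paste0(groups, "_sum")) %>%          # hypothetical output names
  map(~ unname(rowSums(data[key == .x]))) %>%    # sum the columns in this group
  as_tibble()

# Original columns first, then a_1_sum and b_1_sum
result <- bind_cols(data, sums)
result
```

Matching on the exact key rather than with contains() avoids picking up columns whose names merely contain the prefix somewhere.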

【Comments】:

【Solution 3】:

A "variant" of the split.default approach, using tapply:

list2DF(
  tapply(
    as.list(data),
    gsub("\\..*", "", names(data)),
    function(x) rowSums(list2DF(x))
  )
)

This returns a data frame with one column of row sums per group (a_1 and b_1).

【Comments】: