I read through the comments, and I think I understand what you are trying to achieve.
I created a dummy example that mimics your situation.
library(dplyr)
art_id <- c(11, 11, 11, 10, 10)
author <- c("Ajay","Vijay","Shyam",
"Ajay","Tarun")
uniq_art <- unique(art_id) # get unique article id
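So in this case, Ajay has collaborated with three authors ("Shyam",
"Vijay" and "Tarun").
Shyam and Vijay have each collaborated with two authors.
Tarun has collaborated with only one author.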
My solution to your problem is not very elegant.
Hopefully someone can offer a more elegant one.
# Make the data frame
publish <- data.frame(art_id, author)
# Subset for a particular article ID, then
# group by author to get the number of co-authors each author
# has on that article (rows in the subset minus the author itself)
b <- publish %>% filter(art_id == uniq_art[1])
c <- b %>% group_by(author) %>% summarise(ans = dim(b)[1]-1)
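# Repeat the process and join results to the above data frame
# for the remaining article IDs.
# seq_along(...)[-1] skips the loop when there is only one article ID,
# whereas 2:length(uniq_art) would count backwards (2, 1) in that case.
for (i in seq_along(uniq_art)[-1]) {
  b <- publish %>% filter(art_id == uniq_art[i])
  d <- b %>% group_by(author) %>% summarise(ans = dim(b)[1]-1)
  c <- full_join(c, d, by = "author")
}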
# get the number of columns
nc <- ncol(c)
# Sample output after running the loop on my dummy case
# A tibble: 4 x 3
  author ans.x ans.y
  <fctr> <dbl> <dbl>
1   Ajay     2     1
2  Shyam     2    NA
3  Vijay     2    NA
4  Tarun    NA     1
# Sum the per-article counts in each row to get each author's total.
# Note: a co-author shared across several articles is counted once per article.
total_collab <- rowSums(c[,2:nc], na.rm = TRUE)
final_ans <- c %>% mutate(total = total_collab)
final_ans
# A tibble: 4 x 4
  author ans.x ans.y total
  <fctr> <dbl> <dbl> <dbl>
1   Ajay     2     1     3
2  Shyam     2    NA     2
3  Vijay     2    NA     2
4  Tarun    NA     1     1
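For comparison, here is a more compact sketch of the same idea. It assumes that what you want is the number of distinct co-authors per author. It pairs every author with every other author on the same article using base R's merge(), drops the self-pairs, and counts the distinct partners. The names `pairs` and `collab` are my own and purely illustrative.

```r
library(dplyr)

# Same dummy data as above (strings kept as characters for easy comparison)
publish <- data.frame(
  art_id = c(11, 11, 11, 10, 10),
  author = c("Ajay", "Vijay", "Shyam", "Ajay", "Tarun"),
  stringsAsFactors = FALSE
)

# Self-join on article ID: one row per (author, co-author) pair per article.
# merge() names the two author columns author.x and author.y by default.
pairs <- merge(publish, publish, by = "art_id")

# Drop the rows where an author is paired with themselves
pairs <- pairs[pairs$author.x != pairs$author.y, ]

# Count the distinct co-authors of each author
collab <- pairs %>%
  group_by(author = author.x) %>%
  summarise(total = n_distinct(author.y))

collab
# Totals for the dummy data: Ajay 3, Shyam 2, Tarun 1, Vijay 2
```

Unlike the loop version, this counts a repeat collaborator only once, and it does not need one column per article, so it should scale better to many article IDs.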
Hope this helps.