【Title】: Using rep inside sapply to stretch a vector according to another vector
【Posted】: 2018-10-15 12:42:05
【Question】:

I want to generate a data.frame of edges. Problems arise when several edges end at the same node. The edges are defined in the vectors from and to.

# Data
vertices <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)
to <- c("a", "b", "c")
from1 <- c("c", "a", "b")
from2 <- c("c", "a", "a,b,c")

What I tried:

# Attempt 1
create_edges_1 <- function(from, to) {
  to <- sapply(to, function(x){vertices$id[vertices$label == x]})
  from <- sapply(from, function(x){vertices$id[vertices$label == x]})
  data.frame(from = from, to = to, stringsAsFactors = FALSE)
}

This works for create_edges_1(from1, to), for example, which outputs:

  from to
c    3  1
a    1  2
b    2  3

However, this attempt fails for from2, where one element lists several source nodes.

So I tried the following:

# Attempt 2
create_edges_2 <- function(from, to) {
  to <- sapply(unlist(sapply(strsplit(to, ","), function(x){vertices$id[vertices$label == x]})), function(x){rep(x, sapply(strsplit(from2, ","), length))})
  from <- unlist(sapply(strsplit(from2, ","), function(x){vertices$id[vertices$label == x]}))
  data.frame(from = from, to = to, stringsAsFactors = FALSE)
}

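The idea is to "stretch" to for every node at which more than one edge ends. But both create_edges_2(from1, to) and create_edges_2(from2, to) throw the error

Error in rep(x, sapply(strsplit(from2, ","), length)) : invalid 'times' argument

What am I doing wrong in the sapply statement?

The expected output of create_edges_2(from2, to) is: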

  from to
     3  1
     1  2
     1  3
     2  3
     3  3

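【Comments】:

    Tags: r apply sapply


    【Solution 1】:

    You can use either a join or match for this. First split from2 and stretch to so that both vectors have one entry per edge: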

    f2 <- strsplit(from2, ',')
    
    df <- data.frame(from = unlist(f2)
                     , to = rep(to, lengths(f2))
                     , stringsAsFactors = FALSE)
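
To see why this stretching step works while Attempt 2 fails, here is a minimal base R sketch using the data from the question. It relies on rep() accepting a times vector only when that vector has length 1 or the same length as x. The names n, to_stretched and err are introduced here for illustration only.

```r
# Data from the question
to    <- c("a", "b", "c")
from2 <- c("c", "a", "a,b,c")

f2 <- strsplit(from2, ",")  # one character vector of sources per target node
n  <- lengths(f2)           # number of sources per target node

# Vectorised call: element i of 'to' is repeated n[i] times
to_stretched <- rep(to, times = n)
print(to_stretched)

# Attempt 2 calls rep() on a single element inside sapply(), but still passes
# the whole vector of counts as 'times'. A length-3 'times' for a length-1 'x'
# is rejected, which is the error reported in the question.
err <- tryCatch(rep(to[1], times = n), error = function(e) e)
print(conditionMessage(err))
```

Note also that create_edges_2 refers to from2 rather than its from argument, which is why the same error appears for from1 as well.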
    

    With match:

    library(tidyverse)
    
    map_dfc(df, ~ with(vertices, id[match(.x, label)]))
    
    # # A tibble: 5 x 2
    #    from    to
    #   <int> <int>
    # 1     3     1
    # 2     1     2
    # 3     1     3
    # 4     2     3
    # 5     3     3
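
If loading the tidyverse is not desired, the same lookup can be sketched in base R. This is an alternative to map_dfc, not part of the original answer; it assumes every label in df occurs in vertices$label (otherwise match() returns NA for that entry). The name edges is chosen here for illustration.

```r
# Data from the question
vertices <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)
to    <- c("a", "b", "c")
from2 <- c("c", "a", "a,b,c")

# One row per edge, still expressed as labels
f2 <- strsplit(from2, ",")
df <- data.frame(from = unlist(f2),
                 to = rep(to, lengths(f2)),
                 stringsAsFactors = FALSE)

# For each column, look up the position of every label in vertices$label
# and take the id stored at that position
edges <- data.frame(lapply(df, function(col) vertices$id[match(col, vertices$label)]))
print(edges)
```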
    

    With a join:

    library(dplyr)
    
    df %>% 
      inner_join(vertices, by = c(from = 'label')) %>% 
      inner_join(vertices, by = c(to = 'label')) %>% 
      select(id.x, id.y)  # keep only the two id columns added by the joins
    
    #   id.x id.y
    # 1    3    1
    # 2    1    2
    # 3    1    3
    # 4    2    3
    # 5    3    3
    

    【Comments】:

    • Thanks for pointing me to the lengths function. The rep(to, lengths(from)) part was the missing link in my head. ;-)
    【Solution 2】:

    Here is one approach:

    # Attempt 3
    library(dplyr)
    to <- sapply(to, function(x){vertices$id[vertices$label == x]})
    from0 <- sapply(from2, function(x) strsplit(x, ",")) %>% unlist() %>% as.character()
    lengths0 <- lapply(sapply(from2, function(x) strsplit(x, ",")), length) %>% unlist()
    
    to0 <- c()
    for( i in 1:length(lengths0)) to0 <- c(to0, rep(to[i], lengths0[i]))
    
    from <- sapply(from0, function(x){vertices$id[vertices$label == x]})
    edges <- data.frame(from = from, to = to0, stringsAsFactors = FALSE)
    edges
    

    This gives the requested result:

      from to
    1    3  1
    2    1  2
    3    1  3
    4    2  3
    5    3  3
    

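    The idea is to split from at the comma separator and store the number of pieces in each element, so that the matching node in to can be "stretched" that many times. Here the stretching is done with a for loop.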
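
The same idea can be sketched without the loop by passing the stored sizes directly to rep(). The function below is a hypothetical helper (the name create_edges_3 and the explicit vertices argument are not from the original posts); it swaps the for loop for a vectorised rep() and uses match() for the label lookup.

```r
# Data from the question
vertices <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)
to    <- c("a", "b", "c")
from1 <- c("c", "a", "b")
from2 <- c("c", "a", "a,b,c")

# Hypothetical helper: builds one row per edge, with labels replaced by ids
create_edges_3 <- function(from, to, vertices) {
  parts <- strsplit(from, ",")             # split each element at the commas
  to_long <- rep(to, lengths(parts))       # repeat each target once per source
  data.frame(
    from = vertices$id[match(unlist(parts), vertices$label)],
    to   = vertices$id[match(to_long, vertices$label)]
  )
}

e1 <- create_edges_3(from1, to, vertices)
e2 <- create_edges_3(from2, to, vertices)
print(e2)
```

Because it uses its from argument rather than a global vector, the helper handles from1 and from2 alike.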

    【Comments】:
