[Question title]: Mixed combinations / permutations from different sets
[Posted]: 2018-10-06 18:45:22
[Question description]:

This Q&A is motivated by "How to build permutation with some conditions in R".

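There are already some good R packages, such as RcppAlgos and arrangements, that offer efficient combinations / permutations on a single set. For example, if we want to choose 3 items from letters[1:6], the following gives all combinations: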

library(RcppAlgos)
comboGeneral(letters[1:6], 3)
#      [,1] [,2] [,3]
# [1,] "a"  "b"  "c" 
# [2,] "a"  "b"  "d" 
# [3,] "a"  "b"  "e" 
# [4,] "a"  "b"  "f" 
# [5,] "a"  "c"  "d" 
# [6,] "a"  "c"  "e" 
# [7,] "a"  "c"  "f" 
# [8,] "a"  "d"  "e" 
# [9,] "a"  "d"  "f" 
#[10,] "a"  "e"  "f" 
#[11,] "b"  "c"  "d" 
#[12,] "b"  "c"  "e" 
#[13,] "b"  "c"  "f" 
#[14,] "b"  "d"  "e" 
#[15,] "b"  "d"  "f" 
#[16,] "b"  "e"  "f" 
#[17,] "c"  "d"  "e" 
#[18,] "c"  "d"  "f" 
#[19,] "c"  "e"  "f" 
#[20,] "d"  "e"  "f" 

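But what if we want something more complicated, such as

  • choose 1 item from LETTERS[1:2]
  • choose 3 items from letters[1:6]
  • choose 2 items from as.character(1:3)

How can we generate all combinations and, optionally, all permutations?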

[Question discussion]:

    Tags: r function combinations permutation


    [Solution 1]:

    Suppose we have a list of sets, set_list, where k[i] items are chosen from set_list[[i]]. Mathematically, we would then address the problem as follows:

    1. generate all combinations for each set;
    2. merge the combinations from all sets;
    3. create all permutations for each merged combination.
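    To get a feel for the output size before running anything, the three steps can be counted directly. This is only a sketch for the example in the question; the variable names simply mirror those used in the function below.

```r
## sizes of the three sets in the question and the number of items taken from each
n <- c(2, 6, 3)
k <- c(1, 3, 2)

## step 1: number of combinations within each set
n_combinations_by_set <- choose(n, k)           # 2 20 3
## step 2: merging multiplies the per-set counts
n_combinations <- prod(n_combinations_by_set)   # 120
## step 3: each merged combination holds sum(k) distinct items
n_permutations_each <- factorial(sum(k))        # 720
## total number of permutation records over all combinations
n_total <- n_combinations * n_permutations_each # 86400
```

    So asking for permutations multiplies the number of records by 720 here, which is worth keeping in mind for larger sets.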

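    The function MixedCombnPerm below is my implementation, using RcppAlgos for steps 1 and 3. Step 2 does not currently use an optimal algorithm. It is a "brute force" approach that relies on a faster implementation of expand.grid followed by rbind. I know of a faster recursive method (like the one used to form tensor product model matrices in mgcv) that could be coded in Rcpp, but for lack of time I will not do it now.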
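    The index grid behind step 2 may be easier to follow on a tiny case. The sketch below uses hypothetical sizes m = c(2, 3) (not taken from the question) and the rep(..., each = ) form of the index construction, and compares the result with expand.grid.

```r
## hypothetical per-set combination counts, chosen only for illustration
m <- c(2L, 3L)

## reference: expand.grid varies the first factor fastest
ref <- expand.grid(lapply(m, seq_len))

## the same index vectors built by hand
mm <- c(1L, cumprod(m))           # cumulative leading dimension: 1 2 6
grid_size <- mm[length(m) + 1L]   # 6 rows in the full grid
grid_ind <- lapply(seq_along(m), function (i) {
  ## repeat each index mm[i] times, then recycle to the full grid size
  rep_len(rep(seq_len(m[i]), each = mm[i]), grid_size)
})

grid_ind[[1]]   # 1 2 1 2 1 2
grid_ind[[2]]   # 1 1 2 2 3 3
```

    Each grid_ind[[i]] is then used to pick columns from the combination matrix of set i, so that stacking the picked columns gives one merged combination per column.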

    library(RcppAlgos)
    
    MixedCombnPerm <- function (set_list, k, perm = FALSE) {
    
      ###################
      ## mode checking ##
      ###################
    
      if (!all(vapply(set_list, is.vector, TRUE)))
        stop("All sets must be 'vectors'!")
    
      if (length(unique(vapply(set_list, mode, ""))) > 1L)
        stop("Please ensure that all sets have the same mode!")
    
      ################
      ## basic math ##
      ################
    
      ## size of each set
      n <- lengths(set_list, FALSE)
      ## input validation
      if (length(n) != length(k)) stop("length of 'k' different from number of sets!")
      if (any(k > n)) stop("can't choose more items than set size!")
      ## number of sets
      n_sets <- length(n)
      ## total number of items
      n_items <- sum(k)
      ## number of combinations
      n_combinations_by_set <- choose(n, k)
      n_combinations <- prod(n_combinations_by_set)
    
      #################################
      ## step 1: combinations by set ##
      #################################
    
      ## generate `n_combinations_by_set[i]` combinations on set i
      combinations_by_set <- vector("list", n_sets)
      for (i in seq_len(n_sets)) {
        ## each column of combinations_by_set[[i]] is a record
        combinations_by_set[[i]] <- t.default(comboGeneral(set_list[[i]], k[i]))
        }
    
      ################################
      ## step 2: merge combinations ##
      ################################
    
      ## merge combinations from all sets
      ## slow_expand_grid <- function (m) expand.grid(lapply(m, seq_len))
      fast_expand_grid <- function (m) {
        n_sets <- length(m)      ## number of sets
        mm <- c(1L, cumprod(m))  ## cumulative leading dimension
        grid_size <- mm[n_sets + 1L]  ## size of the grid
        grid_ind <- vector("list", n_sets)
        for (i in seq_len(n_sets)) {
          ## equivalent: rep_len(rep(seq_len(m[i]), each = mm[i]), grid_size)
          grid_ind[[i]] <- rep_len(rep.int(seq_len(m[i]), rep.int(mm[i], m[i])), grid_size)
          }
        grid_ind
        }
      grid_ind <- fast_expand_grid(n_combinations_by_set)
    
      ## each column is a record
      combinations_grid <- mapply(function (x, j) x[, j, drop = FALSE],
                           combinations_by_set, grid_ind,
                           SIMPLIFY = FALSE, USE.NAMES = FALSE)
      all_combinations <- do.call("rbind", combinations_grid)
    
      ########################################################
      ## step 3: generate permutations for each combination ##
      ########################################################
    
      if (!perm) return(all_combinations)
      else {
        ## generate `factorial(n_items)` permutations for each combination
        all_permutations <- vector("list", n_combinations)
        for (i in seq_len(n_combinations)) {
          all_permutations[[i]] <- permuteGeneral(all_combinations[, i], n_items)
          }
        return(all_permutations)
        }
    
      }
    

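    The function does strict input checking. The user should make sure that all sets are given as "vectors" and that they all have the same mode. So for the example in the question, we should provide: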

    ## note the "as.character(1:3)"
    set_list <- list(LETTERS[1:2], letters[1:6], as.character(1:3))
    k <- c(1, 3, 2)
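
    To see why the as.character() call matters, here is a small sketch of the mode check applied to a hypothetical set_list_bad in which the third set is left as integers. It uses the same vapply(..., mode, "") test as the function.

```r
## hypothetical input: the third set is integer instead of character
set_list_bad <- list(LETTERS[1:2], letters[1:6], 1:3)

## mode of each set
modes <- vapply(set_list_bad, mode, "")
modes                              # "character" "character" "numeric"

## this is the condition that makes the function stop with an error
mixed <- length(unique(modes)) > 1L
mixed                              # TRUE
```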
    

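    If the argument perm = FALSE (the default), the function returns the combinations in a matrix (each column is a record). Otherwise, it returns a list of matrices, each giving the permutations of one particular combination (each row is a record).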

    Try the example:

    combinations <- MixedCombnPerm(set_list, k)
    permutations <- MixedCombnPerm(set_list, k, TRUE)
    

    Inspect the results:

    combinations[, 1:6]
    #     [,1] [,2] [,3] [,4] [,5] [,6]
    #[1,] "A"  "B"  "A"  "B"  "A"  "B" 
    #[2,] "a"  "a"  "a"  "a"  "a"  "a" 
    #[3,] "b"  "b"  "b"  "b"  "b"  "b" 
    #[4,] "c"  "c"  "d"  "d"  "e"  "e" 
    #[5,] "1"  "1"  "1"  "1"  "1"  "1" 
    #[6,] "2"  "2"  "2"  "2"  "2"  "2" 
    
    permutations[[1]][1:6, ]
    #     [,1] [,2] [,3] [,4] [,5] [,6]
    #[1,] "A"  "a"  "b"  "c"  "1"  "2" 
    #[2,] "A"  "a"  "b"  "c"  "2"  "1" 
    #[3,] "A"  "a"  "b"  "1"  "c"  "2" 
    #[4,] "A"  "a"  "b"  "1"  "2"  "c" 
    #[5,] "A"  "a"  "b"  "2"  "c"  "1" 
    #[6,] "A"  "a"  "b"  "2"  "1"  "c" 
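
    As a sanity check that does not depend on RcppAlgos, the combinations can also be rebuilt with base R only. This is a sketch that swaps in combn() for step 1 and the plain expand.grid() for step 2; the names by_set, grid and merged are introduced here for illustration, and this version is expected to be slower than the function above.

```r
set_list <- list(LETTERS[1:2], letters[1:6], as.character(1:3))
k <- c(1, 3, 2)

## step 1 with base R: combn() returns one combination per column
by_set <- Map(function (s, ki) combn(s, ki), set_list, k)

## step 2: index grid over the columns of each per-set matrix
grid <- expand.grid(lapply(by_set, function (x) seq_len(ncol(x))))

## pick the indexed columns from each set and stack them row-wise
merged <- do.call(rbind, Map(function (x, j) x[, j, drop = FALSE], by_set, grid))

dim(merged)    # 6 120
merged[, 1]    # "A" "a" "b" "c" "1" "2"
```

    The first column agrees with combinations[, 1] shown above, and the column count matches 2 * 20 * 3 = 120.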
    

    [Discussion]:
