Conditionally sample in R

【Question title】: conditionally sample in R
【Posted】: 2018-02-07 15:51:09
【Question】:

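I found this function in another post. Each time it is called, it returns the next combination of the input vectors in sequence. It is essentially a workaround for expand.grid when there are many vectors with many elements each.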

The function is as follows:

lazyExpandGrid <- function(...) {
  dots <- list(...)
  argnames <- names(dots)
  if (is.null(argnames)) argnames <- paste0('Var', seq_along(dots))
  sizes <- lengths(dots)
  indices <- cumprod(c(1L, sizes))
  maxcount <- indices[ length(indices) ]
  i <- 0
  function(index) {
    i <<- if (missing(index)) (i + 1L) else index
    if (length(i) > 1L) return(do.call(rbind.data.frame, lapply(i, sys.function(0))))
    if (i > maxcount || i < 1L) return(FALSE)
    setNames(Map(`[[`, dots, (i - 1L) %% indices[-1L] %/% indices[-length(indices)] + 1L  ),
             argnames)
  }
}

Here are some example calls:

set.seed(42)
nxt <- lazyExpandGrid(a=1:1e2, b=1:1e2, c=1:1e2, d=1:1e2, e=1:1e2, f=1:1e2)
as.data.frame(nxt()) # prints the 1st possible combination
nxt(sample(1e2^6, size=7)) # prints 7 sampled rows from the sample space
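
To make the index-to-combination mapping easier to follow, the arithmetic inside the closure can be pulled out into a standalone helper. This is only a minimal sketch: `decode_index` is a hypothetical name that does not appear in the original post, and it assumes the same row order as expand.grid (the first vector varies fastest).

```r
# Hypothetical helper: decode a single 1-based index into one combination,
# using the same arithmetic as lazyExpandGrid above.
decode_index <- function(i, dots) {
  sizes <- lengths(dots)
  indices <- cumprod(c(1, sizes))   # 1, n1, n1*n2, ...
  pos <- (i - 1) %% indices[-1] %/% indices[-length(indices)] + 1
  Map(`[[`, dots, pos)              # pick one element from each vector
}

dots <- list(a = 0:3, b = 0:4, c = 0:5)
grid <- expand.grid(dots)           # feasible only because this grid is small

# The decoded index 7 and row 7 of the full grid should agree.
unlist(decode_index(7, dots))
unlist(grid[7, ])
```

Because each index decodes to exactly one row, sampling indices is equivalent to sampling rows of the full grid without ever building it.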

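I'm not sure how to use lazyExpandGrid2 to sample conditionally. I would like to exclude samples that contain a certain number of particular elements.

For example, say I have these vectors and want to create unique combinations from them: a=0:3, b=0:4, c=0:5. I can create a sample using: nxt(sample(50, size=50, replace = F)).

But suppose I'm not interested in samples that contain two 0s. How can I exclude those samples? I tried something like this: nxt(sample(which(!(sum(as.data.frame(nxt()) == 0)==2)), size=50, replace = F)).

I just don't understand how to refer to the sampled row inside sample() so that I can exclude it when it doesn't meet a particular condition.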

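【Comments】:

You would have to precompute the valid indices, that is, those that do not match the exclusion criterion. Alternatively, you can discard the samples that don't fit after calling nxt.
So the only way is to precompute, or something like: nxt
The second approach seems inefficient, because I don't necessarily need every possible combination that meets the criterion, and with many vectors that each have many elements the number of possible combinations becomes very large. Is there a way to combine sample with the code above? It needs to sample without replacement.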
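
For the small example in the question, the precomputing idea from the first comment can be sketched directly with expand.grid. This is only an illustration and assumes the full grid fits in memory, which is exactly what lazyExpandGrid is meant to avoid for large inputs. The variable names are illustrative.

```r
set.seed(42)

# Small example grid from the question: 4 * 5 * 6 = 120 combinations.
grid <- expand.grid(a = 0:3, b = 0:4, c = 0:5)

# Indices of rows that do NOT contain exactly two zeros.
valid <- which(rowSums(grid == 0) != 2)

# Sample 50 of the valid indices without replacement, then look the rows up.
picked <- sample(valid, size = 50, replace = FALSE)
result <- grid[picked, ]
```

Since lazyExpandGrid is designed to follow the same row order as expand.grid, the indices in `valid` should line up with the indices that nxt() accepts.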

Tags: r function combinations sample

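【Solution 1】:

If you want to drop rows that don't meet the condition, I don't think you need to worry about sampling without replacement, because passing the same value to nxt should generate the same row, which would still be dropped. You could then create a wrapper around the function you defined above that leaves out an nxt-generated row if it doesn't meet the condition you are after. Here, a row is dropped if its number of zeros equals 2: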

set.seed(0123)

nxt <- lazyExpandGrid(a = 0:3, b = 0:4, c = 0:5)

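# Wrapper around nxt(): draws n_row rows, redrawing any row with exactly two zeros.
nxtDrop <- function(samp, n_row) {
  t(sapply(seq_len(n_row), function(x) {
    y <- nxt(sample(samp, 1))
    # Count zeros by value; grep(0, y) would also match values such as 10.
    while (sum(unlist(y) == 0) == 2) {
      y <- nxt(sample(samp, 1))
    }
    y
  }))
}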

> nxtDrop(120, 10)
      a b c
 [1,] 2 3 1
 [2,] 2 3 4
 [3,] 1 2 2
 [4,] 1 1 5
 [5,] 0 3 5
 [6,] 1 1 0
 [7,] 3 0 3
 [8,] 3 1 5
 [9,] 2 1 3
[10,] 2 3 2
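
If repeated rows do matter, one possible variation is to remember which indices have already been drawn and skip them, so that the accepted rows are distinct. The sketch below is an illustration under stated assumptions, not the answer author's code: `decode_index` and `sample_valid` are hypothetical names, and the index arithmetic is copied from lazyExpandGrid.

```r
# Hypothetical helper: decode one 1-based index into a named combination
# (same arithmetic as lazyExpandGrid).
decode_index <- function(i, dots) {
  indices <- cumprod(c(1, lengths(dots)))
  pos <- (i - 1) %% indices[-1] %/% indices[-length(indices)] + 1
  unlist(Map(`[[`, dots, pos))
}

# Hypothetical helper: collect n distinct rows for which keep(row) is TRUE.
sample_valid <- function(dots, n, keep, max_tries = 100 * n) {
  total <- prod(lengths(dots))
  seen <- numeric(0)               # indices already drawn, valid or not
  rows <- list()
  tries <- 0
  while (length(rows) < n && tries < max_tries && length(seen) < total) {
    tries <- tries + 1
    i <- sample(total, 1)
    if (i %in% seen) next          # enforce sampling without replacement
    seen <- c(seen, i)
    row <- decode_index(i, dots)
    if (keep(row)) rows[[length(rows) + 1]] <- row
  }
  do.call(rbind, rows)
}

set.seed(123)
dots <- list(a = 0:3, b = 0:4, c = 0:5)
res <- sample_valid(dots, n = 10, keep = function(r) sum(r == 0) != 2)
```

Here `keep` can be any predicate on a decoded row. Memory use grows with the number of indices tried rather than with the size of the full grid, and `max_tries` guards against looping forever when few rows satisfy the condition.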

【Comments】: