Finding n tuples from a list whose aggregation satisfies a condition: answers

【Question title】: Finding n tuples from a list whose aggregation satisfies a condition
【Posted】: 2017-03-28 22:15:19
【Question】:

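I have a list of two-element vectors. From this list I want to find n vectors (x, y), not necessarily distinct, such that the sum of their y values is greater than or equal to a number k. If more than one selection satisfies this condition, I want the one whose x values have the smallest sum.

For example, with n=2 I want to find two vectors (x1,y1) and (x2,y2) such that y1+y2 >= k. If more than one pair satisfies this condition, choose the pair with the smallest x1+x2.

So far I have only managed to set up the following code: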

X <- c(3, 2, 3, 8, 7, 7, 13, 11, 12, 12)
Y <- c(2, 1, 3, 6, 5, 6, 8, 9, 10, 9)

df <- data.frame(X, Y)
# Build a list holding one (x, y) vector per row
l <- list()
for (i in seq_len(nrow(df))) {
  l[[i]] <- as.numeric(df[i, ])
}

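Using the values above, suppose n=1 and k=9. I would then choose the tuple (x,y)=(11,9): (12,9) also satisfies the condition y >= k, but its x is larger.

If n=2 and k=6, I would choose (x1,y1)=(3,3) and (x2,y2)=(3,3), because that pair has the smallest x1+x2 satisfying y1+y2 >= 6.

If n=2 and k=8, I would choose (x1,y1)=(3,3) and (x2,y2)=(7,5), because y1+y2 >= 8, and for the next candidate pair, (3,3) and (8,6), the sum 3+8=11 is greater than 3+7=10.

I think a brute-force solution is possible: form every size-n combination of the vectors (repeats allowed), compute yTotal=y1+y2+y3... for each combination, keep the combinations with yTotal >= k, and among those choose the one with the smallest xTotal=x1+x2+x3....

I am really struggling to put this into R code, and I wonder whether it is the right approach at all. Thanks for your help!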

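【Comments】:

I believe the mathematical term for such sets is "partitions". Perhaps that will speed up your search. When I run sos::findFn("partitions") I get # found 803 matches;, including a bunch from a package named "partitions".

Tags: r algorithm vector mathematical-optimization combinatorics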

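【Solution 1】:

First, from your question it appears that you allow selecting from Y with replacement. The code below is basically your brute-force approach: it generates the permutations with permutations from the gtools library, filters them on sum(Y)>=k, and then sorts the survivors by smallest sum(Y) first and by sum(X) second.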

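X <- c(3, 2, 3, 8, 7, 7, 13, 11, 12, 12)
Y <- c(2, 1, 3, 6, 5, 6, 8, 9, 10, 9)
n <- 1
# Each row of perm is one ordered selection of n indices into X and Y
perm <- gtools::permutations(n = length(Y), r = n, repeats.allowed = TRUE)
# Row 1 of result holds sum(Y), row 2 holds sum(X); one column per selection
result <- apply(perm, 1, function(x) c(sum(Y[x]), sum(X[x])))
dim(result) # 2 10

k <- 9 ## Case of n=1, k=9
keep <- which(result[1, ] >= k)
result[, keep[order(result[1, keep], result[2, keep])[1]]] # sum(Y)=9 and sum(X)=11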

##### n=2 cases ##########
n <- 2
perm <- gtools::permutations(n = length(Y), r = n, repeats.allowed = TRUE)
result <- apply(perm, 1, function(x) c(sum(Y[x]), sum(X[x])))
dim(result) # 2 100

## n=2, k=6
keep <- which(result[1, ] >= 6)
keep[order(result[1, keep], result[2, keep])[1]]           # permutation number 23
perm[23, ]                                                 # 3 3: indices of the chosen tuples, (3,3) twice
result[, keep[order(result[1, keep], result[2, keep])[1]]] # sum(Y)=6 and sum(X)=6

## n=2, k=8
keep <- which(result[1, ] >= 8)
keep[order(result[1, keep], result[2, keep])[1]]           # permutation number 6
perm[6, ]                                                  # 1 6: indices of the chosen tuples, (3,2) and (7,6)
result[, keep[order(result[1, keep], result[2, keep])[1]]] # sum(Y)=8 and sum(X)=10
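
One caveat about the sort order: sorting by sum(Y) first returns the feasible selection with the smallest sum(Y). That matches the three examples in the question, but in general it is not guaranteed to be the selection with the smallest sum(X), for instance when a tuple with a larger y has a smaller x. Below is a minimal base-R sketch that sorts by sum(X) first. The helper name best_tuples is hypothetical, and expand.grid stands in for gtools::permutations here as an assumption, so the row order differs from the code above.

```r
X <- c(3, 2, 3, 8, 7, 7, 13, 11, 12, 12)
Y <- c(2, 1, 3, 6, 5, 6, 8, 9, 10, 9)

# Hypothetical helper (not part of the original answer): brute-force search
# over all ordered index tuples of length n, repeats allowed.
best_tuples <- function(X, Y, n, k) {
  # One row per tuple of indices; length(Y)^n rows in total
  idx <- as.matrix(expand.grid(rep(list(seq_along(Y)), n)))
  sum_y <- rowSums(matrix(Y[as.vector(idx)], nrow = nrow(idx)))
  sum_x <- rowSums(matrix(X[as.vector(idx)], nrow = nrow(idx)))
  keep <- which(sum_y >= k)            # feasible selections
  if (length(keep) == 0) return(NULL)  # no selection reaches k
  # Smallest sum(X) first; sum(Y) is only used to break ties
  best <- keep[order(sum_x[keep], sum_y[keep])[1]]
  list(index = unname(idx[best, ]), sum_x = sum_x[best], sum_y = sum_y[best])
}

best_tuples(X, Y, n = 1, k = 9)$sum_x  # 11
best_tuples(X, Y, n = 2, k = 6)$sum_x  # 6
best_tuples(X, Y, n = 2, k = 8)$sum_x  # 10
```

For n=2, k=8 several selections tie at sum(X)=10, so the returned indices depend on the enumeration order; the minimal sum itself does not.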

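【Comments】:

Thank you very much! This works great and is helping me slowly get to grips with the intricacies of R. Of course, brute force does not scale well to larger source vectors, because the permutation array has length(Y)^n rows and so grows exponentially with n. In my use case the vectors contain about 600 items, so I will have to look into ways of shrinking the vectors, or possibly find a different approach.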
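
One possible "different approach" for larger inputs is dynamic programming over the y total that is still needed. This is a swapped-in technique, not something proposed in the thread. The sketch below assumes that Y holds non-negative integers and that k is a non-negative integer; min_sum_x is a hypothetical helper name. It takes roughly n * length(Y) * (k + 1) steps instead of enumerating length(Y)^n selections.

```r
X <- c(3, 2, 3, 8, 7, 7, 13, 11, 12, 12)
Y <- c(2, 1, 3, 6, 5, 6, 8, 9, 10, 9)

# Hypothetical helper: smallest sum(X) over exactly n picks (with replacement)
# whose sum(Y) is at least k. Assumes non-negative integer Y and k.
min_sum_x <- function(X, Y, n, k) {
  # best[t + 1] = smallest sum(X) of the picks made so far with sum(Y) >= t,
  # for t = 0..k; Inf means "not reachable".
  best <- c(0, rep(Inf, k))  # zero picks: only t = 0 is reachable
  for (j in seq_len(n)) {
    nxt <- rep(Inf, k + 1)
    for (i in seq_along(Y)) {
      # After picking item i, a target of t leaves max(t - Y[i], 0) still needed
      need <- pmax(0:k - Y[i], 0)
      nxt <- pmin(nxt, best[need + 1] + X[i])
    }
    best <- nxt
  }
  best[k + 1]  # Inf if no selection of n items reaches k
}

min_sum_x(X, Y, n = 1, k = 9)  # 11
min_sum_x(X, Y, n = 2, k = 6)  # 6
min_sum_x(X, Y, n = 2, k = 8)  # 10
```

The sketch returns only the minimal sum(X). Recovering which tuples were picked would need an extra table that records the chosen item for each step and target.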