在 nsga2 R 中设置多目标函数的约束答案

【问题标题】：Setting constraints of a multi-objective function in nsga2 R在 nsga2 R 中设置多目标函数的约束
【发布时间】：2016-12-16 01:31:12
【问题描述】：

我想使用库mco 中的函数nsga2 来解决多目标问题并找到Pareto Frontier，但我无法正确设置约束。

目标函数如下。问题的上下文是项目选择，即我有五个项目，分别用 x[1], x[2], ... x[5] 表示，只能选择一些。例如，如果选择了 1 号项目，则如果未选择 x[1]=0，则 x[1]=1，并且对于所有项目都是如此（x[n] 的值是离散的，1 或 0）。我的另一个限制是所选项目的总预算应小于 100。运行nsga2 函数后，Solution 中的参数似乎不正确，因为参数不是 1 或 0。做我的约束错了吗？如何找到 x[1] 到 x[5] 的最佳值？谢谢！

# objective functions to minimize
 ObjFun <- function (x){
   f1 <- -0.02*x[1] + 0.01*x[2] + 0.02*x[3] + -0.01*x[4] + 0.02*x[5]  
   f2 <- 0.17*x[1] + -0.08*x[2] + 0.10*x[3] + 0.09*x[4] + 0.07*x[5] 
   c(f1, f2)      }

 # The constraints 
 Constr <- function(x){ 
   100 >= 20*x[1] + 30*x[2] + 20*x[3] + 33*x[4] + 60*x[5] # Total budget >= total project costs
   x[1:5] == 1
   x[1:5] == 0      }

 library(mco)
 Solution <- nsga2(ObjFun, 5, 2, lower.bounds=c(0,0,0,0,0), upper.bounds=c(1,1,1,1,1), constraints = Constr)
 # plot(Solution)
 Solution$par

【问题讨论】：

标签： r optimization constraints

【解决方案1】：

由于x[i] 只能为 1 或 0，因此您正在处理一个组合优化问题，其中您必须优化的空间是离散的：

https://en.wikipedia.org/wiki/Combinatorial_optimization

一般来说，数值优化例程被构造为在连续空间（R^n 的子集）上工作。但是，在您的情况下，离散空间很小，并且问题适用于简单的蛮力方法，您可以在所有 32 个可能点上评估 ObjFunc。帕累托边界在这里也是离散的。

## objective functions to minimize
ObjFun <- function (x){
    f1 <- -0.02*x[1] + 0.01*x[2] + 0.02*x[3] + -0.01*x[4] + 0.02*x[5]  
    f2 <- 0.17*x[1] + -0.08*x[2] + 0.10*x[3] + 0.09*x[4] + 0.07*x[5] 
    c(f1=f1, f2=f2)
}

## space of all 32 feasible solutions 
space <- expand.grid(data.frame(matrix(0:1, nrow=2, ncol=5)))
## brute force evaluation of ObjFun on all the 32 feasible solutions
val <- sapply(data.frame(t(space)),  ObjFun)
tmp <- sol <- cbind(space, t(val))

## returns indices of all rows which are Pareto dominated
## by the i-th row
which.are.dominated <- function(i, tmp){
    s1 <- tmp$f1[i]
    s2 <- tmp$f2[i]
    with(tmp,
         which( (s1 <= f1) &
                (s2 <= f2) &
                ( (s1 < f1) |
                  (s2 < f2) )
               ))
}
## For each feasible solution i, remove all feasible solutions which are Pareto dominated by feasible solutions i
i <- 1
repeat{
    remove <- which.are.dominated(i, tmp)
    if(length(remove)>0) tmp <- tmp[-remove, ]
    if(i>=nrow(tmp)) break
    i <- i+1
}
with(sol, plot(f1, f2))
points(tmp$f1, tmp$f2, pch=20, col=2)
legend("topright", col=2, pch=20, "Pareto frontier")

参考资料：

https://en.wikipedia.org/wiki/Multi-objective_optimization

https://en.wikipedia.org/wiki/Pareto_efficiency

附：自从我几年前开始使用 R 以来，我可能第一次使用repeat 语句......

编辑：非暴力方法是使用 nsga2 :D 当我设置它时，在 n 维立方体 [0,1]^n 中搜索 x 的解决方案，其中 n 是项目数；该算法产生许多解决方案（在我的示例中为 200 个），然后您可以使用 round 将其“离散化”为 0 或 1。对于大量项目，要获得更精确的帕累托边界近似值，您必须使用更多代（例如 600 代）。在最终绘图中，如果考虑的项目超过 12 个，则仅绘制成本样本。

##n.projects <- 12
n.projects <- 50
if(n.projects>25) generations=600


set.seed(1)
vecf1 <- rnorm(n.projects)
vecf2 <- rnorm(n.projects)
vcost <- rnorm(n.projects)
n.solutions <- 200

library(mco)

ObjFun <- function (x){
    f1 <- sum(vecf1*x)
    f2 <- sum(vecf2*x)
    c(f1=f1, f2=f2)
}
Constr <- function(x){ 
    c(100 - sum(vcost*x)) # Total budget >= total project costs
}

Solution <- nsga2(ObjFun, n.projects, 2,
                  lower.bounds=rep(0,n.projects), upper.bounds=rep(1,n.projects),
                  popsize=n.solutions,  constraints = Constr, cdim=1,
                  generations=generations)

selected.project.combinations <- unique(round(Solution$par))
selected.project.combinations.costs <- sapply(data.frame(t(selected.project.combinations)),  ObjFun)

## final plotting of results
max.n.proj.plot <- 12
if(n.projects <= max.n.proj.plot){
    xsamp <- expand.grid(data.frame(matrix(0:1, nrow=2, ncol=n.projects)))
}else{
    xsamp <- matrix(sample(0:1, n.projects*2^max.n.proj.plot, replace=TRUE), ncol=n.projects)
}
fsamp <- sapply(data.frame(t(xsamp)), ObjFun)

par(mfrow=c(1,2))
plot(Solution)
points(fsamp[1, ], fsamp[2, ])
points(t(selected.project.combinations.costs), col=3, pch=20)
legend("bottomleft", bty="n", pch=c(20,1), col=c(3,1),
       c("Costs of optimal\nproject combinations",
         "Costs of discarded\nproject combinations"),
       y.intersp=1.8
       )

plot(t(fsamp), xlim=range(Solution$value[ ,1], fsamp[1, ]),
     ylim=range(Solution$value[ ,2], fsamp[2, ]))
points(Solution$value, col=2, pch=".")
points(t(selected.project.combinations.costs), col=3, pch=20)

【讨论】：

您使用蛮力提供的解决方案非常适合发布的示例，因为它只有 5 个项目。我在 50 个项目中尝试了你的方法，它的计算成本很高，并且会产生大量数据帧。可以使用一些遗传算法或投资组合优化技术来节省时间和空间吗？谢谢！
@renato x[n] 的值应该是二进制的。你认为将 nsga2 返回的最优 x[n] 值四舍五入到 0 或 1 是使它们二进制的好方法吗？谢谢！
“好”到底是什么意思？ 1.函数nsga2返回超立方体中的点（其坐标一般不为0或1）； 2. 应用round将使所有坐标等于0或1；您可以通过all(selected.project.combinations %in% 0:1) 进行检查； 3. 对rounded 点的目视检查确认它们的成本接近于近似点的成本（因此它们是近似最优的）。你还需要什么？你担心什么？