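I know this question has already been treated as solved, but there is actually a very elegant solution that takes only the few lines of code below. The computation is exact and involves no numerical optimization of any kind.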
## target covariance matrix
A <- matrix(c(20.43, -8.59,-8.59, 24.03), nrow = 2)
E <- eigen(A, symmetric = TRUE) ## symmetric eigen decomposition
U <- E[[2]] ## eigen vectors, i.e., rotation matrix
D <- sqrt(E[[1]]) ## root eigen values, i.e., scaling factor
r <- 1.44 ## radius of original circle
Z <- rbind(c(r, 0), c(0, r), c(-r, 0), c(0, -r)) ## original vertices on major / minor axes
Z <- tcrossprod(Z * rep(D, each = 4), U) ## transformed vertices on major / minor axes
#          [,1]      [,2]
#[1,] -5.055136  6.224212
#[2,] -4.099908 -3.329834
#[3,]  5.055136 -6.224212
#[4,]  4.099908  3.329834
C0 <- c(-0.05, 0.09) ## new centre
Z <- Z + rep(C0, each = 4) ## shift to new centre
#          [,1]      [,2]
#[1,] -5.105136  6.314212
#[2,] -4.149908 -3.239834
#[3,]  5.005136 -6.134212
#[4,]  4.049908  3.419834
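As a quick sanity check on the result above, here is a minimal sketch that verifies two things: each vertex lies on the ellipse, and its distance from the centre equals a semi-axis length. The names `Zc`, `q` and `semi_axes` are mine, and the ellipse equation used, `(z - C0) %*% solve(A) %*% t(z - C0) = r^2`, is the one I am assuming the covariance matrix defines.

```r
## Rebuild the vertices exactly as in the snippet above
A  <- matrix(c(20.43, -8.59, -8.59, 24.03), nrow = 2)
r  <- 1.44
C0 <- c(-0.05, 0.09)
E  <- eigen(A, symmetric = TRUE)
Z  <- rbind(c(r, 0), c(0, r), c(-r, 0), c(0, -r))
Z  <- tcrossprod(Z * rep(sqrt(E$values), each = 4), E$vectors)
Z  <- Z + rep(C0, each = 4)

## Illustrative checks (names below are not part of the original answer)
Zc <- Z - rep(C0, each = 4)             ## vertices relative to the centre
q  <- rowSums((Zc %*% solve(A)) * Zc)   ## quadratic form z A^{-1} z' for each row
semi_axes <- sqrt(rowSums(Zc^2))        ## distance of each vertex from the centre
```

If the construction is right, every entry of `q` equals `r^2`, and `semi_axes` alternates between `r * sqrt(E$values[1])` and `r * sqrt(E$values[2])`.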
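To explain the mathematics behind this, I will proceed in 3 steps:
- Where does this ellipse come from?
- The Cholesky decomposition method and its drawback.
- The eigendecomposition method and its natural interpretation.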
Where does this ellipse come from?
In practice, this ellipse can be obtained by applying a linear transformation to the unit circle x^2 + y^2 = 1.
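To make the statement above concrete, here is a short derivation in my own notation (the symbols x, y, M and the general radius r are assumptions, not taken from the question). Points are written as row vectors, matching the `X %*% M` form used in the R code, and M is any matrix with M^T M = A. Both `chol(A)` and the scaling-then-rotation matrix built from `eigen(A)` satisfy this.

```latex
% Circle of radius r:  x x^T = r^2   (r = 1 for the unit circle)
% Linear map:          y = x M,  where  M^T M = A
\begin{aligned}
x   &= y M^{-1} \\
r^2 &= x x^{\top}
     = y M^{-1} M^{-\top} y^{\top}
     = y \,(M^{\top} M)^{-1} y^{\top}
     = y A^{-1} y^{\top}
\end{aligned}
```

So every choice of M with M^T M = A sends the circle to the same ellipse y A^{-1} y^T = r^2. The choices differ only in which point of the circle lands on which point of the ellipse.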
The Cholesky decomposition method and its drawback
## initial circle
r <- 1.44
theta <- seq(0, 2 * pi, by = 0.01 * pi)
X <- r * cbind(cos(theta), sin(theta))
## target covariance matrix
A <- matrix(c(20.43, -8.59,-8.59, 24.03), nrow = 2)
R <- chol(A) ## Cholesky decomposition
X1 <- X %*% R ## linear transformation
Z <- rbind(c(r, 0), c(0, r), c(-r, 0), c(0, -r)) ## original vertices on major / minor axes
Z1 <- Z %*% R ## transformed coordinates
## different colour per quadrant
g <- floor(4 * (1:nrow(X) - 1) / nrow(X)) + 1
## draw ellipse
plot(X1, asp = 1, col = g)
points(Z1, cex = 1.5, pch = 21, bg = 5)
## draw circle
points(X, col = g, cex = 0.25)
points(Z, cex = 1.5, pch = 21, bg = 5)
## draw axes
abline(h = 0, lty = 3, col = "gray", lwd = 1.5)
abline(v = 0, lty = 3, col = "gray", lwd = 1.5)
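We see that the linear transformation matrix R does not seem to have a natural interpretation. The original vertices of the circle are not mapped to the vertices of the ellipse.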
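The same observation can be checked numerically rather than read off the plot. The sketch below uses names of my own (`on_ellipse`, `dist_chol`, `semi_axes`): it confirms that the Cholesky images of the four axis points do lie on the ellipse, but that their distances from the centre match neither semi-axis length.

```r
A  <- matrix(c(20.43, -8.59, -8.59, 24.03), nrow = 2)
r  <- 1.44
R  <- chol(A)                                  ## upper triangular, t(R) %*% R equals A
Z  <- rbind(c(r, 0), c(0, r), c(-r, 0), c(0, -r))
Z1 <- Z %*% R                                  ## images of the circle's axis points

on_ellipse <- rowSums((Z1 %*% solve(A)) * Z1)  ## equals r^2 for points on the ellipse
dist_chol  <- sqrt(rowSums(Z1^2))              ## distance of each image from the centre
semi_axes  <- r * sqrt(eigen(A, symmetric = TRUE)$values)  ## major, minor semi-axis lengths
```

Here every entry of `dist_chol` falls strictly between `semi_axes[2]` and `semi_axes[1]`, so none of the four images is a vertex of the ellipse.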
The eigendecomposition method and its natural interpretation
## initial circle
r <- 1.44
theta <- seq(0, 2 * pi, by = 0.01 * pi)
X <- r * cbind(cos(theta), sin(theta))
## target covariance matrix
A <- matrix(c(20.43, -8.59,-8.59, 24.03), nrow = 2)
E <- eigen(A, symmetric = TRUE) ## symmetric eigen decomposition
U <- E[[2]] ## eigen vectors, i.e., rotation matrix
D <- sqrt(E[[1]]) ## root eigen values, i.e., scaling factor
Z <- rbind(c(r, 0), c(0, r), c(-r, 0), c(0, -r)) ## original vertices on major / minor axes
## step 1: re-scaling
X1 <- X * rep(D, each = nrow(X)) ## anisotropic expansion to get an axes-aligned ellipse
Z1 <- Z * rep(D, each = 4L) ## vertices on axes
## step 2: rotation
Z2 <- tcrossprod(Z1, U) ## rotated vertices on major / minor axes
X2 <- tcrossprod(X1, U) ## rotated ellipse
## different colour per quadrant
g <- floor(4 * (1:nrow(X) - 1) / nrow(X)) + 1
## draw rotated ellipse and vertices
plot(X2, asp = 1, col = g)
points(Z2, cex = 1.5, pch = 21, bg = 5)
## draw axes-aligned ellipse and vertices
points(X1, col = g)
points(Z1, cex = 1.5, pch = 21, bg = 5)
## draw original circle
points(X, col = g, cex = 0.25)
points(Z, cex = 1.5, pch = 21, bg = 5)
## draw axes
abline(h = 0, lty = 3, col = "gray", lwd = 1.5)
abline(v = 0, lty = 3, col = "gray", lwd = 1.5)
## draw major / minor axes
segments(Z2[1,1], Z2[1,2], Z2[3,1], Z2[3,2], lty = 2, col = "gray", lwd = 1.5)
segments(Z2[2,1], Z2[2,2], Z2[4,1], Z2[4,2], lty = 2, col = "gray", lwd = 1.5)
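Here we see that in both stages of the transformation, vertices are still mapped to vertices. It is exactly this property that the neat solution given at the beginning relies on.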
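For reuse, the opening snippet can be wrapped in a small helper. This is only a sketch: the function name `ellipse_vertices` and its arguments `r` and `centre` are hypothetical, and it assumes `A` is a symmetric positive-definite 2 x 2 matrix.

```r
## Hypothetical helper (the name and arguments are illustrative only)
ellipse_vertices <- function(A, r = 1, centre = c(0, 0)) {
  E <- eigen(A, symmetric = TRUE)              ## symmetric eigendecomposition
  U <- E$vectors                               ## rotation matrix
  D <- sqrt(E$values)                          ## scaling factors along the axes
  Z <- rbind(c(r, 0), c(0, r), c(-r, 0), c(0, -r))
  Z <- tcrossprod(Z * rep(D, each = 4), U)     ## scale first, then rotate
  Z + rep(centre, each = 4)                    ## shift to the requested centre
}

V <- ellipse_vertices(matrix(c(20.43, -8.59, -8.59, 24.03), nrow = 2),
                      r = 1.44, centre = c(-0.05, 0.09))
```

Rows 1 and 3 of `V` are the ends of the major axis, and rows 2 and 4 are the ends of the minor axis. The sign of each eigenvector returned by `eigen()` is not guaranteed, so the two ends of an axis may come out in swapped order on another platform. The set of four vertices is the same either way.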