Plotting ellipses based on subset of data: answers

【Question title】: Plotting ellipses based on subset of data
【Posted】: 2020-09-30 14:14:59
【Question】:

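I'm comparing isotope data between two fish species in R. Specifically, I want to look at niche size (roughly inferred from the size of the ellipse around the points). I've attached a plot to show what I'm producing (there are three fish on the plot, but one is ignored statistically). (Image: Isotope plot)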

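The problem is that species 1 (blue marlin) has a sample size of 68 and species 2 (striped marlin) has a sample size of 15, and I need to check how the different sample sizes affect the size of the 95% confidence ellipse.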

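The usual way to handle this is to compare the 15 samples I have for species 2 against a subset of 15 samples from species 1. If the ellipse lines are slightly transparent and I draw them 100 times, the darker regions will reveal the "true niche".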

I've managed to plot my points, but I don't know how to draw 100 ellipses based on random subsets of my data.

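My data are in the following columns: 'fish.id', which identifies blue marlin (BM) and striped marlin (SM); '15N', which holds my nitrogen values; and '13C', which holds my carbon values.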

`fish.id`  `15N` `13C`
   <chr> <dbl> <dbl>
 1 BkM6F    14    15
 2 BkM7F    10    16
 3 BkM8F    11    16
 4 BkM9F    11    18
 5 BM12F    13    17
 6 BM14F    13    20
 7 BM17F    11    17
 8 BM18F    15    19
 9 BM19F    13    17
10 BM22F    13    16
# … with 79 more rows

Here is what I have so far:

#plot with all points
plot(Isotope$`13C`,Isotope$`15N`,type='n', 
     main = 'Marlin 13C vs 15N', xlab = '13C', ylab = '15N')
points(Isotope$`13C`[substr(Isotope$fish.id,1,2)=='Bk'], 
       Isotope$`15N`[substr(Isotope$fish.id,1,2)=='Bk'], pch=15, col = 'black')
points(Isotope$`13C`[substr(Isotope$fish.id,1,2)=='BM'], 
       Isotope$`15N`[substr(Isotope$fish.id,1,2)=='BM'], pch=16, col = 'dodgerblue3')
points(Isotope$`13C`[substr(Isotope$fish.id,1,2)=='SM'], 
       Isotope$`15N`[substr(Isotope$fish.id,1,2)=='SM'], pch=17, col = 'cadetblue3')

#attempt at subsetting data
plot( )

for(i in 1:1000) {

  sampl1 <- sample(1:15,size=15,replace=TRUE)
  sampl2 <- sample(1:69,size=15,replace=TRUE)

  temp.data1 <- data1[sampl1,]
  temp.data2 <- data2[sampl2,]

}

Cheers for the help!

【Comments on the question】:

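Without sample data we can't demonstrate a solution for you, but you want the ellipse() function from the car or ellipse package, and you need to add transparency to the colors using the alpha argument of the rgb() function.
Hey dcarlson, cheers for your help! I'll add some sample data to the post. I've been using SIBER's addEllipse function, but I might try the ones you suggested.
Use dput(as.data.frame(Isotope)) and paste 15 rows for each species, or the whole thing. If you create a species column in your data containing just the species label, your coding will be simpler. That alone will reduce your plot statements to a few lines. Then use C13 and N15 as your isotope variable names. That saves a lot of typing because you won't need to wrap the names in backticks. The SIBER package comes with several vignettes illustrating its use that may make things easier for you.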
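
For illustration, the species-column suggestion in the last comment might look like the sketch below. The five-row `Isotope` data frame is only a stand-in: its first four rows are copied from the printout in the question, and the `SM01F` row is hypothetical because no SM rows are shown there. The explicit `levels` order is also an assumption, chosen so the result does not depend on the locale's sort order.

```r
# Stand-in for the asker's data; the SM01F row is hypothetical.
Isotope <- data.frame(
  fish.id = c("BkM6F", "BkM7F", "BM12F", "BM14F", "SM01F"),
  `15N`   = c(14, 10, 13, 13, 12),
  `13C`   = c(15, 16, 17, 20, 18),
  check.names = FALSE   # keep the non-syntactic names "15N" and "13C"
)

# Species label = first two characters of the fish ID
Isotope$species <- factor(substr(Isotope$fish.id, 1, 2),
                          levels = c("Bk", "BM", "SM"))

# Syntactic column names remove the need for backticks
names(Isotope)[names(Isotope) == "15N"] <- "N15"
names(Isotope)[names(Isotope) == "13C"] <- "C13"

# One plot() call now draws every group with its own symbol and color
idx <- as.numeric(Isotope$species)
plot(N15 ~ C13, data = Isotope,
     main = "Marlin 13C vs 15N", xlab = "13C", ylab = "15N",
     pch = c(15, 16, 17)[idx],
     col = c("black", "dodgerblue3", "cadetblue3")[idx])
```

With the species stored as a factor, the three separate points() calls in the question collapse into the single plot() call at the end.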

Tags: r plot statistics data-science

【Solution 1】:

Here is something to get you started that takes my earlier comments into account. First, reproducible data, so we are talking about the same thing:

dat <- structure(list(species = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Bk", "BM"), class = "factor"), 
    N15 = c(13.06, 11.57, 13.28, 13.45, 11.91, 12.01, 14.04, 
    11.97, 13.46, 11.27, 13.37, 14.23, 11.32, 11.58, 12.1, 12.42, 
    12.63, 11.52, 13.72, 12.3, 14.49, 14.01, 13.24, 15.03, 12.93, 
    12.71, 13.1, 12.26, 14.74, 14.18), C13 = c(17.51, 15.37, 
    15.4, 15.73, 16.84, 15.79, 16.78, 15.85, 18.32, 16.61, 17.07, 
    18.05, 14.08, 15.9, 15.65, 18.27, 18.95, 18.46, 17.58, 18.24, 
    16.28, 17.54, 16.65, 15.51, 18, 18.09, 17.42, 18.68, 18.38, 
    16.62)), class = "data.frame", row.names = c(NA, -30L))
str(dat)
'data.frame':   30 obs. of  3 variables:
 $ species: Factor w/ 2 levels "Bk","BM": 1 1 1 1 1 1 1 1 1 1 ...
 $ N15    : num  13.1 11.6 13.3 13.4 11.9 ...
 $ C13    : num  17.5 15.4 15.4 15.7 16.8 ...

Making species a factor simplifies using it to specify colors and symbols:

idx <- as.numeric(dat$species)
sym <- 15:16
cls <- c("dodgerblue3", "cadetblue3")
col1 <- c(col2rgb(cls[1]))
col1 <- rgb(col1[1], col1[2], col1[3], 64, maxColorValue=255) # alpha = 64/255, about 25% opaque
col2 <- c(col2rgb(cls[2]))
col2 <- rgb(col2[1], col2[2], col2[3], 64, maxColorValue=255)
alpha.col <- c(col1, col2)

That creates the symbols, the colors, and the transparent colors for each group. Now plot the points and add the ellipses:

library(car)
plot(N15~C13, dat, xlim=c(12, 21), ylim=c(10, 17), xlab="13C", ylab="15N", pch=sym[idx], col=cls[idx])
dataEllipse(dat$C13, dat$N15, groups=dat$species, levels=.95, center.cex=1.25, add=TRUE, plot.points=FALSE,
     col=alpha.col)

This produces the following plot:
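
The plot above draws one ellipse per group but does not yet cover the repeated subsampling the question asks about. A minimal sketch of that step follows, under several assumptions: the data are simulated stand-ins (68 BM and 15 SM fish, not the asker's measurements), the names `dat2`, `big`, `small`, `n.sub`, `n.rep`, and `line.col` are made up for illustration, and each subsample is drawn without replacement. Use `replace = TRUE` instead if a bootstrap-style resample is what you want.

```r
library(car)  # dataEllipse(), as used in the answer above

set.seed(1)   # make the random subsamples reproducible

# Simulated stand-in data (NOT real measurements): 68 BM and 15 SM fish
dat2 <- data.frame(
  species = factor(rep(c("BM", "SM"), times = c(68, 15)),
                   levels = c("BM", "SM")),
  C13 = c(rnorm(68, 17.5, 1.0), rnorm(15, 16.5, 0.8)),
  N15 = c(rnorm(68, 13.0, 1.0), rnorm(15, 12.5, 0.8))
)
big   <- dat2[dat2$species == "BM", ]
small <- dat2[dat2$species == "SM", ]

n.sub <- nrow(small)  # subsample size matches the smaller group (15)
n.rep <- 100          # number of subsample ellipses to draw

# Faint line color so overlapping ellipses build up darker regions
line.col <- adjustcolor("dodgerblue3", alpha.f = 0.15)

idx <- as.integer(dat2$species)
plot(N15 ~ C13, dat2, xlab = "13C", ylab = "15N",
     xlim = range(dat2$C13) + c(-2, 2), ylim = range(dat2$N15) + c(-2, 2),
     pch = c(16, 17)[idx], col = c("dodgerblue3", "cadetblue3")[idx])

for (i in seq_len(n.rep)) {
  # Pick n.sub distinct rows from the larger group
  rows <- sample(nrow(big), size = n.sub, replace = FALSE)
  dataEllipse(big$C13[rows], big$N15[rows], levels = 0.95, add = TRUE,
              plot.points = FALSE, center.pch = FALSE,
              col = line.col, lwd = 1)
}

# Full-sample ellipse of the smaller group, for comparison
dataEllipse(small$C13, small$N15, levels = 0.95, add = TRUE,
            plot.points = FALSE, center.pch = FALSE,
            col = "cadetblue3", lwd = 2)
```

Sizing the subsample from `nrow(small)` rather than hard-coding 15 keeps the two groups matched if more samples are added later.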

【Comments】:

Thanks so much! In hindsight, the way I set up my plot was really messy. Well, lesson learned :)