【Question title】: Wrong area on normal curve plot
【Posted】: 2017-09-19 14:06:49
【Question】:

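I'm trying to learn R from scratch, and I just handed in a university assignment on a binomial test (a one-sample test of a proportion), which I solved and plotted in R. However, I ran into a problem.

My sample size is 130 and the number of successes is 68.

  • H0: π = 50%
  • H1: π > 50%

This is the code I used (a lot of copy-paste and trial and error):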

library(ggplot2)
library(ggthemes)
library(scales)


#data

n = 130
p = 1/2
stdev = sqrt(n*p*(1-p))
mean_binon = n*p
cases = 68
ztest = (cases-mean_binon)/stdev
pvalor = pnorm(-abs(ztest))
zcrit = qnorm(0.975)

#normal curve
xvalues <- data.frame(x = c(-4, 4))

#first plots and lines
p1 <- ggplot(xvalues, aes(x = x))
p2 <- p1 + stat_function(fun = dnorm) + xlim(c(-4, 4)) +
    geom_vline(xintercept = ztest, linetype="solid", color="blue", 
               size=1) +
    geom_vline(xintercept = zcrit, linetype="solid", color="red", 
                   size=1)


#z area function
area_z <- function(x){
    norm_z <- dnorm(x)
    norm_z[x < ztest] <- NA
    return(norm_z)
}

#critical z area function
area_zc <- function(x){
    norm_zc <- dnorm(x)
    norm_zc[x < zcrit] <- NA
    return(norm_zc)
}


#area value
valor_area_z <- round(pnorm(4) - pnorm(ztest), 3)
valor_area_zc <- round(pnorm(4) - pnorm(zcrit), 3)


#final plot

p3 <- p2 + stat_function(fun = dnorm) + 
    stat_function(fun = area_z, geom = "area", fill = "blue", alpha = 0.3) +
    geom_text(x = 1.13, y = 0.1, size = 5, fontface = "bold",
              label = paste0(valor_area_z * 100, "%")) +
    stat_function(fun = area_zc, geom = "area", fill = "red", alpha = 0.5) +
    geom_text(x = 2.27, y = 0.015, size = 3, fontface = "bold",
              label = paste0(valor_area_zc * 100, "%")) +
    scale_x_continuous(breaks = c(-3:3)) + 
    labs(x = "\n z", y = "f(z) \n", title = "Distribuição Normal \n") +
    theme_fivethirtyeight()

p3

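The plot looks like this:

There is a gap between my geom_vline lines and the shaded areas. I'm not sure whether I got a step of the statistics wrong or whether this is an R-related problem. Maybe both? Sorry if this is basic. I'm not good at either, but I'm working to improve.

【Comments】:

  • What happens if you don't round valor_area_z?
  • @MichaelChirico That value is only used for the text label showing the area. Thanks anyway!

Tags: r ggplot2 area normal-distribution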


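【Solution 1】:

One solution is to use the xlim option of stat_function, which sets the range over which the function is evaluated.
With that in place you can also replace area_z and area_zc with dnorm.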

p3 <- p2 + stat_function(fun = dnorm) + 
    stat_function(fun = dnorm, geom = "area", fill = "blue", alpha = 0.3, 
                  xlim = c(ztest,zcrit)) +
    geom_text(x = 1.13, y = 0.1, size = 5, fontface = "bold",
              label = paste0(valor_area_z * 100, "%")) +
    stat_function(fun = dnorm, geom = "area", fill = "red", alpha = 0.5, 
                  xlim = c(zcrit,xvalues$x[2])) +
    geom_text(x = 2.27, y = 0.015, size = 3, fontface = "bold",
              label = paste0(valor_area_zc * 100, "%")) +
    scale_x_continuous(breaks = c(-3:3)) + 
    labs(x = "\n z", y = "f(z) \n", title = "Distribuição Normal \n") +
    theme_fivethirtyeight()

p3
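
On the statistics side of the question, the numbers can be checked without any plotting. The sketch below uses only base R and the values given in the question. The significance level alpha = 0.05 is an assumption, since the question does not state one. It compares the one-sided critical value for H1: π > 50% with the qnorm(0.975) used in the question, which leaves 2.5% in the upper tail.

```r
# Values taken from the question
n     <- 130
p0    <- 1/2     # proportion under H0
cases <- 68
alpha <- 0.05    # assumed significance level, not stated in the question

# Normal approximation to the binomial: z statistic for the observed count
ztest <- (cases - n * p0) / sqrt(n * p0 * (1 - p0))

# Upper-tail p-value, matching the one-sided alternative H1: pi > 50%
p_value <- pnorm(ztest, lower.tail = FALSE)

# One-sided critical value at level alpha, next to the value used in the question
zcrit_one_sided <- qnorm(1 - alpha)
zcrit_question  <- qnorm(0.975)

print(round(c(ztest = ztest, p_value = p_value,
              zcrit_one_sided = zcrit_one_sided,
              zcrit_question = zcrit_question), 3))
```

The z statistic comes out at roughly 0.53 with an upper-tail p-value of about 0.30, so the blue label in the plot is essentially the p-value. Whether the red region should start at qnorm(0.95) or qnorm(0.975) depends on the level the assignment asks for.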

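【Comments】:

  • Wow! Thanks a lot @marco! I'll try both examples and read the function documentation to understand them better. So the problem was that, because of the way I restricted the range, the functions I defined returned incorrect/imprecise values? norm_z[x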
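
A rough way to see where the gap comes from, sketched in base R: stat_function evaluates the function at n = 101 equally spaced points by default, and area_z / area_zc set every point below the cutoff to NA. The shading therefore starts at the first grid point at or above the cutoff, not at the cutoff itself. The sketch assumes the evaluation range is the -4 to 4 from the question's xlim; the names grid, first_z and gap_z are illustrative only.

```r
# Values taken from the question
n     <- 130
p0    <- 1/2
cases <- 68
ztest <- (cases - n * p0) / sqrt(n * p0 * (1 - p0))
zcrit <- qnorm(0.975)

# Assumed evaluation grid: 101 equally spaced points between -4 and 4
grid <- seq(-4, 4, length.out = 101)

# First grid point that survives the `x < cutoff` masking in area_z / area_zc
first_z  <- min(grid[grid >= ztest])
first_zc <- min(grid[grid >= zcrit])

# Horizontal distance between each vertical line and the start of its shading
gap_z  <- first_z - ztest
gap_zc <- first_zc - zcrit

print(round(c(ztest = ztest, first_z = first_z, gap_z = gap_z), 3))
print(round(c(zcrit = zcrit, first_zc = first_zc, gap_zc = gap_zc), 3))
```

Under these assumptions the density values themselves are fine; only the starting point of the shaded polygon is off, by up to one grid step (0.08 here). Passing xlim = c(ztest, zcrit) to stat_function makes the grid start at ztest itself, which is why the shading then meets the line.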