【Title】: Why does my histogram not have negative values on the x axis?
【Posted】: 2021-07-03 10:51:04
【Question】:

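I am plotting a histogram that shows the distribution of a variable. The variable plotted on the x axis contains negative values, but those values do not appear in the histogram.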

Here is the code to reproduce a sample of the dataset:

df <- structure(list(`Cash Flowth EURLast avail. yr` = c(2.355, 14.677, 
-7.923, 53.66, 0, 91.336, 111.12, 11.945, -0.069, 4.42, 58.943, 
14.687, 11.17, 32.825, -1432.259, 2.852, 34.489, 198.515, 77.64, 
1.195, -53.123, -24.501, 18.244, 18.438, 16.668, 343.301, 0, 
-32.001, 41.009, -3.509, 71.679, 33.581, 638.27, 0, -1.262, -0.853, 
380.624, 26.533, 1.65, -30.007, -709.602, 1.877, -0.498, 3.77, 
-27.749, 15.599, -69.519, 6.331, 0.277, -150.365), general_status = c("Failed", 
"Active", "Failed", "Active", "Failed", "Active", "Active", "Active", 
"Failed", "Active", "Active", "Active", "Active", "Active", "Failed", 
"Active", "Active", "Active", "Active", "Failed", "Failed", "Active", 
"Active", "Active", "Failed", "Active", "Failed", "Failed", "Active", 
"Active", "Active", "Active", "Active", "Failed", "Active", "Failed", 
"Active", "Active", "Active", "Failed", "Failed", "Active", "Active", 
"Active", "Active", "Failed", "Active", "Active", "Failed", "Failed"
)), row.names = c(NA, -50L), class = c("tbl_df", "tbl", "data.frame"
))

Here is the code I use to plot the histogram:

df %>%
  filter(!is.na(`Cash Flowth EURLast avail. yr`)) %>%
  ggplot(aes(x = `Cash Flowth EURLast avail. yr`, fill = as.factor(general_status))) +
  geom_histogram(
      bins = nclass.Sturges(`Cash Flowth EURLast avail. yr`),colour = "black", position="identity")+
  scale_fill_manual(values = c("Active" = "springgreen4", "Failed" = "firebrick3"))+
  theme(legend.position="None", strip.background = element_rect(colour="black",
                                        fill="white"))+
  facet_grid(~general_status)

How can I fix this? For reference, in the full dataset min = -901535 and max = 8009206.

【Comments】:

    Tags: r ggplot2 histogram


    【Solution 1】:

    This may not qualify as an answer, but it is too hard to explain in a comment.

    The range of your variable is

    range(df$`Cash Flowth EURLast avail. yr`)
    [1] -1432.259   638.270
    

    When the x-axis range is this wide, the negative values are actually there but you cannot see them. Could you specify xlim(-1500, 650) to address this?
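    A minimal sketch of restricting the visible x range. The data frame `toy` and its column `cash_flow` are hypothetical stand-ins for the real data. It uses coord_cartesian(xlim = ...) rather than xlim(), on the assumption that you want to zoom without discarding observations or bars that touch the limits; xlim() sets the scale limits and can drop them.

```r
library(ggplot2)

# Hypothetical toy data spanning the same range as the sample above
toy <- data.frame(
  cash_flow = c(-1432.259, -150.365, -53.123, 0, 14.677, 198.515, 638.27)
)

p <- ggplot(toy, aes(x = cash_flow)) +
  geom_histogram(
    bins = nclass.Sturges(toy$cash_flow),  # column referenced explicitly
    colour = "black"
  ) +
  # Zoom the view to the range of interest without removing any data
  coord_cartesian(xlim = c(-1500, 650))

print(p)
```

    With the real data, the limits would need to match the range you care about; the values above simply mirror the range of the 50-row sample.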

    Also, your code did not run as-is on my machine. I replaced bins = nclass.Sturges(`Cash Flowth EURLast avail. yr`) with a version that references the column through df$:

    df %>%
      filter(!is.na(`Cash Flowth EURLast avail. yr`)) %>%
      ggplot(aes(x = `Cash Flowth EURLast avail. yr`, fill = as.factor(general_status))) +
      geom_histogram(
        bins = nclass.Sturges(df$`Cash Flowth EURLast avail. yr`), colour = "black", position="identity")+
      scale_fill_manual(values = c("Active" = "springgreen4", "Failed" = "firebrick3"))+
      theme(legend.position="None", strip.background = element_rect(colour="black",
                                                                    fill="white"))+
      facet_grid(~general_status)
    

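    【Comments】:

    • I guess the cause of the whole problem is the nclass.Sturges(`Cash Flowth EURLast avail. yr`) part: there may be a variable named Cash Flowth EURLast avail. yr in the global environment, and it is being used for the bins instead of the column in df.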
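    The scoping issue described in this comment can be sketched in plain R. This assumes, as the comment suggests, that an argument such as bins sits outside aes() and is therefore evaluated as an ordinary R expression in the calling environment, not inside the data frame. The names `toy` and `cash_flow` are hypothetical.

```r
# Hypothetical data frame with 50 rows
toy <- data.frame(cash_flow = seq(-100, 390, by = 10))

# With no object called `cash_flow` outside `toy`, the bare name is an error
bare_result <- tryCatch(
  nclass.Sturges(cash_flow),
  error = function(e) "error: object not found"
)

# If a same-named object exists in the calling environment, it is used
# silently, and the bin count no longer reflects the column in `toy`
cash_flow <- c(1, 2, 3, 4, 5)
bins_from_global <- nclass.Sturges(cash_flow)

# Referencing the column explicitly always uses the data frame
bins_from_column <- nclass.Sturges(toy$cash_flow)

print(bare_result)
print(c(global = bins_from_global, column = bins_from_column))
```

    Writing df$`Cash Flowth EURLast avail. yr` explicitly, as in the answer above, avoids depending on whatever happens to be defined in the workspace.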