Looping ggplot across many variables: answers

[Question title]: Looping ggplot across many variables
[Posted]: 2021-12-31 14:58:17
[Question description]:

############ EDIT ############

I use this info_mat to compute the month-to-month rates of change.

date1 <- rbind("February", "March",   "April",     "May",       "June",      "July",      "August",    "September", "October",   "November")
sum1.visit_bush. <- rbind("0", "0" ,"1"  , "-0.75" ,"2","0" ,"0.333333333333333" , "1.25"  , "0",  "-1")
sum1.counts_bush. <- rbind("0" ,"0.115290451813933", "-0.557273997206146", "0.146270002253775" ,  "0.100865119937082", "0.512412930880514",  "0.435049598488427",   "-0.0831961816984858", "0.824791311372408",  "-0.156025577963601" )
sum1.hcounts_bush. <- rbind("0",  "0.0387010676156584", "-0.625695931477516", "0.47254004576659",  "-0.233100233100233", "0.99290780141844" ,  "-0.032536858159634" , "0.349973725696269" , "0.660957571039315",  "-0.341223341926412")
evolution1 <- data.frame(date1, sum1.visit_bush., sum1.counts_bush., sum1.hcounts_bush.)

Then I proceeded as you suggested:

df_month_cand <- evolution1 %>% select(c("date", paste0(c("sum.visit_", "sum.counts_", "sum.hcounts_"), "bush.")))
df_month_cand_plot <- melt(df_month_cand, id.vars = "date", variable.name = "Type", value.name = "y")

FunctionPlot <- function(cand, evolution) {
  df_month_cand <- evolution %>% select(c("date1", paste0(c("sum1.visit_", "sum1.counts_", "sum1.hcounts_"), cand)))
  df_month_cand_plot <- melt(df_month_cand, id.vars = "date1", variable.name = "Type", value.name = "y")
  
  p <- ggplot(df_month_cand_plot, aes(x = date1, y = y, color = Type)) + geom_point() + geom_line(aes(group=Type)) +
    labs(
      title = paste0("Evolution of visits and coverage
      per month for ", cand) ,
      subtitle = "We read: from March to April, whereas the visits of -candidate- increased by -value*100 %-, 
    the coverage in newspapers decreased by -value*100 %-",
      color="Type", 
      x="Months", 
      y="Percentage change over months") + 
    theme(
      plot.title = element_text(size=15, face="bold", margin = margin(5, 0, 10, 10), vjust=2, hjust=0.5),
      axis.text.x=element_text(angle=50, size=11.5, vjust=0.5),
      axis.title.y = element_text(vjust=4),
      plot.margin = unit(c(1, 0.3, 0.5, 0.6), "cm"),
      legend.position = "bottom", 
      legend.box.background = element_rect(color="black", size=2), 
      legend.title = element_text(face = "bold", size=10), 
      legend.background = element_rect(fill="grey90",
                                       size=0.5, linetype="solid", 
                                       colour ="black"), 
      panel.background = element_rect(fill = "gray90", colour = "gray70", size = 0.5, linetype = "solid"),
      panel.grid.major = element_line(size = 0.5, linetype = 'dashed', colour = "gray75")) +
    scale_color_manual(labels = c("Visits", "Main text count", "Headline count"), values = c("tomato3", "deepskyblue3", "green2")) + 
    scale_x_discrete(limits = c("February", "March", "April", "May", "June", "July", "August", "September", "October", "November")) + 
    scale_y_discrete()
  plot(p)
}

sapply("bush.", FunctionPlot, evolution1)

However, in the output the y axis is completely scrambled: the values are not ordered from smallest to largest. Why, and how can I fix it?

Finally, to keep things simple, I would like the y axis to run from -1 to 2 in steps of 0.25. I tried

scale_y_continuous(breaks=seq(-1, 2, 0.25))

but I get the following error: Error: Discrete value supplied to continuous scale

Thank you!

[Comments on the question]:

Where does info_mat come from? Is there a way to make your question reproducible?
Yes, please see my edit.

Tags: r loops ggplot2 iteration

[Solution 1]:

You can convert the variable date from character to Date format:

date <- as.Date(date, format = "%Y-%d-%m")
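A small sketch of what this conversion does, assuming the dates are stored as strings in year-day-month order. The sample strings below are made up for illustration; the format string has to match the real data.

```r
# Hypothetical character dates in year-day-month order, matching "%Y-%d-%m"
date_chr <- c("2021-01-02", "2021-01-03", "2021-15-04")

# as.Date() parses them into Date objects; strings that do not fit
# the format come back as NA
date_parsed <- as.Date(date_chr, format = "%Y-%d-%m")

class(date_parsed)            # "Date"
format(date_parsed, "%B")     # month names (locale-dependent)
```

If the parsed vector contains NA values, the format string most likely does not match the way the dates are written.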

ggplot can print dates on the x axis, so you no longer need to create the month variable by hand.
I think you should use a data.frame:

df_info <- data.frame(
    date = date,
    a1 = c(0, 0, 0, 0, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    b1 = c(1, 1, 1, 1, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    hb1 = c(2, 2, 2, 2, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    a2 = c(0, 0, 0, 0, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    b2 = c(1, 1, 1, 1, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    hb2 = c(2, 2, 2, 2, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    a3 = c(0, 0, 0, 0, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    b3 = c(1, 1, 1, 1, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    hb3 = c(2, 2, 2, 2, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    a4 = c(0, 0, 0, 0, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    b4 = c(1, 1, 1, 1, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
    hb4 = c(2, 2, 2, 2, 6421, 41, 5667, 44, 1178, 0, 1070, 1)
)

Or, if you have a lot of data, you can convert the matrix to a data.frame and convert the variables to numeric (but without names).

df_info <- bind_cols(data.frame(date = date), info_mat[,-1] %>% as.data.frame() %>% lapply(as.numeric)) %>% as.data.frame()
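The same conversion can be sketched in base R. The matrix below is a hypothetical stand-in for info_mat (its real contents are not shown in the thread), with the dates in the first column and the numbers stored as text; this variant keeps the column names when the matrix has them.

```r
# Hypothetical stand-in for info_mat: a character matrix whose first
# column holds the dates and whose other columns hold numbers as text
info_mat <- cbind(
  date   = c("2021-01-02", "2021-01-03"),
  a_bush = c("0", "6421"),
  b_bush = c("1", "41")
)

# Drop the date column, then convert every remaining column to numeric
values <- as.data.frame(info_mat[, -1, drop = FALSE], stringsAsFactors = FALSE)
values[] <- lapply(values, as.numeric)

# Re-attach the dates as a proper Date column
df_info <- data.frame(
  date = as.Date(info_mat[, "date"], format = "%Y-%d-%m"),
  values
)
str(df_info)
```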

Now we can select the columns for the first individual:

df <- df_info %>% select(c("date", paste0(c("a", "b", "hb"), 1)))

Next, we create the data.frame used for plotting:

    df_plot <- melt(df, id.vars = "date", variable.name = "Type", value.name = "y")
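To make the reshaping step concrete, here is a minimal sketch on a made-up two-row data frame (the values are illustrative only). melt() from reshape2 stacks the measurement columns so that each row holds one date, one series name and one value.

```r
library(reshape2)

# Hypothetical two-month example with one individual's three columns
df <- data.frame(
  date = as.Date(c("2021-02-01", "2021-03-01")),
  a1 = c(0, 6421),
  b1 = c(1, 41),
  hb1 = c(2, 44)
)

# One row per (date, Type) pair; the former column names end up in Type
df_plot <- melt(df, id.vars = "date", variable.name = "Type", value.name = "y")

dim(df_plot)            # 6 rows, 3 columns
levels(df_plot$Type)    # "a1" "b1" "hb1"
```

This long layout is what lets a single color = Type aesthetic draw one line per series.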

Your plotting code is fine; you can use it with df_plot. Now let's write a function that plots the data for a given individual number:

f <- function(num, df_info) {
    df <- df_info %>% select(c("date", paste0(c("a", "b", "hb"), num)))
    
    df_plot <- melt(df, id.vars = "date", variable.name = "Type", value.name = "y")
    
    p <- ggplot(df_plot, aes(x = date, y = y, color = `Type`)) + geom_point() +
        geom_line() +
        labs(
            title = "Evolution of a and b and c per months",
            subtitle = paste0("plot ", num),
            color="Type", 
            x = "Months", 
            y = "over months"
        )
    plot(p)
}

Let's apply our function to each individual number:

sapply(1:4, f, df_info)

Or

sapply(1:4, function(x) f(x, df_info))
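As a possible variation, the function can return the plot instead of drawing it, and lapply() can collect the plots in a list, which makes it easier to print or save them later. This is only a sketch: make_plot and the small df_info below are hypothetical names and data, not part of the original answer.

```r
library(ggplot2)
library(reshape2)

# Hypothetical small data set in the same shape as df_info
df_info <- data.frame(
  date = as.Date(c("2021-02-01", "2021-03-01", "2021-04-01")),
  a1 = c(0, 2, 1), b1 = c(1, 3, 2), hb1 = c(2, 1, 0),
  a2 = c(1, 0, 2), b2 = c(2, 2, 1), hb2 = c(0, 1, 3)
)

# Same idea as f(), but the plot is returned instead of drawn
make_plot <- function(num, df_info) {
  df <- df_info[, c("date", paste0(c("a", "b", "hb"), num))]
  df_plot <- melt(df, id.vars = "date", variable.name = "Type", value.name = "y")
  ggplot(df_plot, aes(x = date, y = y, color = Type)) +
    geom_point() +
    geom_line() +
    labs(subtitle = paste0("plot ", num), x = "Months")
}

# One ggplot object per individual, kept in a list
plots <- lapply(1:2, make_plot, df_info = df_info)

# Draw them one after another
for (p in plots) print(p)
```

Unlike sapply(), lapply() does not try to simplify the returned ggplot objects, so the console output stays clean.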

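However, your data are badly scaled: with a value like 6421 on the same plot, you cannot see the difference between 0 and 1. But I don't know what you want to do with these data and plots.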

[Comments]:

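Almost there!! My individuals are not 1:4; they have names: "bush", "obama", "trump" and "lieberman". Based on what you said, I tried the following: sapply("bush.":"lieberman.", f, info), but it does not work. Everything before that works!
To be clear: in the data frame, my variables are not a1, b1, ... but a_bush., b_bush., hb_bush., etc.
The problem comes only from the syntax "bush.":"lieberman.". If I pass a single candidate, the function works perfectly: sapply("bush.", f, info) works.
I think you need to change the part of f from paste0(c("a", "b", "hb"), num) to paste0(c("a", "b", "hb"), "_", num), where num is the person's name (you can rename that argument in f). Then you can run sapply(c("bush", "lieberman"), f, df_info) (or "bush.", with the trailing dot).
Thanks. I have one more question, sorry; please see the edit.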
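On the follow-up in the edit: a likely cause of the scrambled y axis is that the values in evolution1 are written in quotes, so they are character strings; ggplot2 then treats y as discrete and orders the labels alphabetically, and scale_y_continuous() fails with "Discrete value supplied to continuous scale". The sketch below assumes that diagnosis and uses a shortened, made-up version of the data: it converts the columns to numeric, drops scale_y_discrete(), and loops over a character vector of candidate suffixes. plot_candidate and candidates are hypothetical names.

```r
library(ggplot2)
library(reshape2)

# Shortened, hypothetical version of evolution1: numbers stored as text
evolution1 <- data.frame(
  date1 = c("February", "March", "April", "May"),
  sum1.visit_bush.   = c("0", "0", "1", "-0.75"),
  sum1.counts_bush.  = c("0", "0.12", "-0.56", "0.15"),
  sum1.hcounts_bush. = c("0", "0.04", "-0.63", "0.47"),
  stringsAsFactors = FALSE
)

# Convert every column except the month to numeric
evolution1[-1] <- lapply(evolution1[-1], as.numeric)

plot_candidate <- function(cand, evolution) {
  cols <- paste0(c("sum1.visit_", "sum1.counts_", "sum1.hcounts_"), cand)
  df_long <- melt(evolution[, c("date1", cols)], id.vars = "date1",
                  variable.name = "Type", value.name = "y")
  ggplot(df_long, aes(x = date1, y = y, color = Type, group = Type)) +
    geom_point() +
    geom_line() +
    scale_x_discrete(limits = evolution$date1) +   # keep calendar order
    scale_y_continuous(breaks = seq(-1, 2, 0.25), limits = c(-1, 2)) +
    labs(title = paste0("Evolution of visits and coverage per month for ", cand))
}

# A character vector replaces "bush.":"lieberman.", since ":" only
# builds numeric sequences; extend it with the suffixes present in the data
candidates <- c("bush.")
plots <- lapply(candidates, plot_candidate, evolution = evolution1)
print(plots[[1]])
```

With numeric y values the axis is ordered from smallest to largest and the breaks at every 0.25 are accepted.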