【Question title】: Plotting every three rows from a data frame
【Posted】: 2020-02-06 09:49:16
【Question】:

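I want to make some plots from my data. Unfortunately, it is hard to predict how many plots I will generate, because that depends on the data and can vary, so I want this to be easy to adjust. However, it is usually one plot per group of 3 rows.

So, I would like to plot rows 1-3, 4-6, 7-9, and so on.

Here is the data: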

> dput(DF_final)
structure(list(AC = c(0.0031682160632777, 0.00228591145206846, 
0.00142094444568728, 0.000661218113472149, 0.0010078157353918, 
0.000400289437089513, 40.4634784175177, 40.5055070858594, 0.0183737773741582
), SD = c(0.00250647379467532, 0.0013244185401148, 0.000469332241199189, 
0.000294558308707343, 0.000385553400676202, 0.000104447914881357, 
11.0693842400794, 8.78768774254084, 0.00696532251341454), ln_AC = c(-5.75458660556339, 
-6.08099044923792, -6.556433525855, -7.32142679754668, -6.89996992823399, 
-7.8233226797995, 3.70039979980691, 3.70143794229703, -3.99683077355773
), ln_SD = c(-5.98887837626238, -6.62678175351058, -7.66419963690747, 
-8.13003358225542, -7.86083085139947, -9.16682203300101, 2.40418312097106, 
2.17335162163583, -4.96681136795312), Percent_AC = c(126.401324043689, 
172.597361244303, 302.758754023937, 224.477834753288, 261.394591157605, 
383.243109777925, 365.544076706723, 460.934756361151, 263.789326894369
), Percent_SD = c(100, 100, 100, 100, 100, 100, 100, 100, 100
), TP = c(0, 40, 80, 0, 40, 80, 0, 40, 80)), row.names = c("Tim_0", 
"Tim_40", "Tim_80", "Jack_0", "Jack_40", "Jack_80", "Tom_0", 
"Tom_40", "Tom_80"), class = "data.frame")

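The ln_AC column should be on the Y axis and the TP column on the X axis. First, I would like all of them on separate plots next to each other (bearing in mind that the number of plots may get large at some point) and, if possible, everything on the same chart as well. It should be a scatter plot with a trend line.

Is it also possible to get the slope, the SD of the slope, and R^2 from the linear regression?

I managed to do it for a single plot, but the regression line looks weird...

The following code was used to generate this plot and regression line.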

fit <- lm(DF_final$ln_AC~DF_final$TP, data=DF_final)
plot(DF_final[1:3,7], DF_final[1:3,3], type = "p", ylim = c(-10,0), xlim=c(0,100), col = "red")
lines(DF_final$TP, fitted(fit), col="blue")

【Comments】:

    Tags: r


    【Solution 1】:

    In base R (without extra packages), you can do this:

    # split by the name prefix before "_" (Tim, Jack, Tom), i.e. 3 rows per group
    DF = split(DF_final,gsub("_[^ ]*","",rownames(DF_final) ))
    # you can also split on position, every 3 rows:
    # DF = split(DF_final, (1:nrow(DF_final) - 1) %/% 3)
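
The positional alternative in the comment above relies on integer division to build a group index. A minimal sketch of that idea follows; `n_rows`, `toy`, and `chunks` are illustrative names, not part of the original answer.

```r
# Integer division turns row positions 1..9 into group ids 0,0,0,1,1,1,2,2,2
n_rows <- 9
group_index <- (seq_len(n_rows) - 1) %/% 3
print(group_index)

# split() then returns one data frame per group id
toy <- data.frame(x = seq_len(n_rows))   # hypothetical stand-in for DF_final
chunks <- split(toy, group_index)
print(names(chunks))
```

If the number of rows is not a multiple of 3, the last chunk simply holds the leftover rows.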
    

    Set up vectors to store your values:

    slopes =  vector("numeric",length(DF))
    names(slopes) = names(DF)
    rsq = vector("numeric",length(DF))
    names(rsq) = names(DF)
    

    Plot:

    par(mfrow=c(1,length(DF)))  # one panel per group, side by side
    for(i in names(DF)){
      # fit and plot each group separately
      fit <- lm(ln_AC~TP, data=DF[[i]])
      plot(DF[[i]]$TP, DF[[i]]$ln_AC, type = "p", col = "red",main=i)
      abline(fit, col="blue")
      # keep the rounded slope and R-squared for later use
      slopes[i]=round(fit$coefficients[2],digits=2)
      rsq[i]=round(summary(fit)$r.squared,digits=2)
      mtext(side=1,paste("slope=",slopes[i],"\nrsq=",rsq[i]),
            padj=-2,cex=0.7)
    }
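
The question also asks for everything on a single chart. A minimal base R sketch of that variant is shown here, assuming the same per-prefix grouping as above; `DF_small`, `groups`, `cols`, and `fits` are illustrative names, and only the ln_AC and TP values are copied from the question's data.

```r
# ln_AC and TP copied from the question's DF_final; other columns omitted
DF_small <- data.frame(
  ln_AC = c(-5.75458660556339, -6.08099044923792, -6.556433525855,
            -7.32142679754668, -6.89996992823399, -7.8233226797995,
            3.70039979980691, 3.70143794229703, -3.99683077355773),
  TP = rep(c(0, 40, 80), times = 3),
  row.names = c("Tim_0", "Tim_40", "Tim_80", "Jack_0", "Jack_40",
                "Jack_80", "Tom_0", "Tom_40", "Tom_80")
)
groups <- split(DF_small, sub("_.*", "", rownames(DF_small)))

# one colour index per group
cols <- seq_along(groups)
names(cols) <- names(groups)

# empty canvas spanning all the data, then points and a fitted line per group
plot(DF_small$TP, DF_small$ln_AC, type = "n", xlab = "TP", ylab = "ln_AC")
fits <- list()
for (g in names(groups)) {
  d <- groups[[g]]
  fits[[g]] <- lm(ln_AC ~ TP, data = d)
  points(d$TP, d$ln_AC, col = cols[g], pch = 19)
  abline(fits[[g]], col = cols[g])
}
legend("topright", legend = names(groups), col = cols, pch = 19, lty = 1)
```

Because the Tom values sit on a very different scale from the other two groups, a shared y axis compresses the Tim and Jack lines; separate panels may stay more readable for data like this.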
    

    And here are your values:

    slopes
     Jack   Tim   Tom 
    -0.01 -0.01 -0.10 
    rsq
    Jack  Tim  Tom 
    0.29 0.99 0.75 
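
The question also asks for the SD of the slope, which the loop above does not store. A minimal sketch follows, assuming "SD of the slope" means the standard error that `summary.lm` reports for the TP coefficient; `DF_small`, `groups`, and `reg_stats` are illustrative names, and only the ln_AC and TP values are copied from the question's data.

```r
# ln_AC and TP copied from the question's DF_final; other columns omitted
DF_small <- data.frame(
  ln_AC = c(-5.75458660556339, -6.08099044923792, -6.556433525855,
            -7.32142679754668, -6.89996992823399, -7.8233226797995,
            3.70039979980691, 3.70143794229703, -3.99683077355773),
  TP = rep(c(0, 40, 80), times = 3),
  row.names = c("Tim_0", "Tim_40", "Tim_80", "Jack_0", "Jack_40",
                "Jack_80", "Tom_0", "Tom_40", "Tom_80")
)
groups <- split(DF_small, sub("_.*", "", rownames(DF_small)))

# one row per group: slope, its standard error, and R-squared (unrounded)
reg_stats <- do.call(rbind, lapply(groups, function(d) {
  s <- summary(lm(ln_AC ~ TP, data = d))
  data.frame(
    slope     = s$coefficients["TP", "Estimate"],
    slope_se  = s$coefficients["TP", "Std. Error"],
    r_squared = s$r.squared
  )
}))
print(reg_stats)
```

Keeping the unrounded values in a data frame makes them easier to reuse in later calculations, for example `reg_stats["Tim", "slope"]`. With only three points per group, each fit has a single residual degree of freedom, so the standard errors are rough.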
    

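    【Comments】:

      【Solution 2】:

      If I understand correctly, the reason you want 3 observations per plot is that you have different people (Jack, Tim, Tom). Is that right? If you don't want to worry about the number of groups, you can do this: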

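      library(ggplot2)
      library(reshape2)  # assumed source of melt(); the answer does not name the package

      data <- DF_final   # the data frame from the question

      # move rownames to column  
      data$person <- rownames(data)
      data$person <- gsub("\\_.*","",data$person)  # remove TP from names
      
      # better to use library(data.table) for this step
      # note: melt() repeats each (TP, ln_AC) point once per remaining column
      data <- melt(data,id.vars=c("person","TP","ln_AC"))
      
      ggplot(data,aes(x=TP, y=ln_AC)) + geom_point() +
           geom_smooth(method = "lm") + facet_grid(~person)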
      

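      This produces a plot similar to @giocomai's, but it also works if you have 4, 5, 6, or any number of people in your data.

      ---- Edit

      If you want to add the R2 values, you can do it like this. Note that it may not be the best or most elegant solution, but it does work.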

      data <- data.frame(...)
      data$person <- rownames(data)
      data$person <- gsub("\\_.*","",data$person)  
      
      # run lm for all persons and save them in a data.frame
      nomi <- unique(data$person)
      #lmStats <- data.frame()
      lmStats <- sapply(nomi, 
         function(ita){
            model <- lm(ln_AC~TP,data= data[which(data$person == ita),])
            lmStat <- summary(model)
            # I only save r2, but you can get all the statistics you need
            lmRow <- data.frame("r2" = lmStat$r.squared )
            #lmStats <- rbind(lmStats,lmRow)
         }
      )
      lmStats <- do.call(rbind,lmStats)
      
      # format the output,and create a dataframe we will use to annotate facet_grid
      lmStats <- as.data.frame(lmStats)
      rownames(lmStats) <- gsub("\\..*","",rownames(lmStats))
      lmStats$person <- rownames(lmStats)
      colnames(lmStats)[1] <- "r2"
      lmStats$r2 <- round(lmStats$r2,2)
      lmStats$TP <- 40
      lmStats$ln_AC <- 0
      lmStats$lab <- paste0("r2= ",lmStats$r2)
      
      # melt and add r2 column to the data (not necessary, but I like to have everything I plot in the data)
      data <- melt(data,id.vars=c("person","TP","ln_AC"))
      data$r2 <- lmStats[match(data$person,rownames(lmStats)),1] 
      
      
      ggplot(data,aes(x=TP, y=ln_AC)) + geom_point() +
         geom_smooth(method = "lm") + facet_grid(~person) +
         geom_text(data=lmStats,label=lmStats$lab)
      

      A simpler way (fewer steps) is to use facet_grid(~r2), so that the R-squared value appears in the panel title.
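
A variation on that idea puts both the person and the R-squared value in the facet label, so that two groups with the same rounded R-squared do not collapse into one panel. This is only a sketch and assumes ggplot2 is installed; `dat`, `r2`, `panel`, and `p` are illustrative names, and the ln_AC and TP values are copied from the question's data.

```r
library(ggplot2)

# person, TP and ln_AC taken from the question's DF_final
dat <- data.frame(
  person = rep(c("Tim", "Jack", "Tom"), each = 3),
  TP = rep(c(0, 40, 80), times = 3),
  ln_AC = c(-5.75458660556339, -6.08099044923792, -6.556433525855,
            -7.32142679754668, -6.89996992823399, -7.8233226797995,
            3.70039979980691, 3.70143794229703, -3.99683077355773)
)

# R-squared of ln_AC ~ TP for each person, as a named vector
r2 <- sapply(split(dat, dat$person), function(d) {
  summary(lm(ln_AC ~ TP, data = d))$r.squared
})

# facet label combining the person and the rounded R-squared
dat$panel <- sprintf("%s (R2 = %.2f)", dat$person, r2[as.character(dat$person)])

p <- ggplot(dat, aes(x = TP, y = ln_AC)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  facet_wrap(~panel)
print(p)
```

The named vector `r2` also holds the unrounded values for later use.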

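      【Comments】:

      • This also works, but we still have the problem of extracting the slope values and storing them in a variable...
      【Solution 3】:

      If I understand you correctly, and assuming you will always have three observations per plot, your main problem is creating a categorical variable to separate them. Here is one way to do it. Depending on the layout you prefer, you may want to look at facet_wrap instead of facet_grid.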

      library("dplyr")
      library("ggplot2")
      DF_final <- structure(list(AC = c(0.0031682160632777, 0.00228591145206846, 
      0.00142094444568728, 0.000661218113472149, 0.0010078157353918, 
      0.000400289437089513, 40.4634784175177, 40.5055070858594, 0.0183737773741582
      ), SD = c(0.00250647379467532, 0.0013244185401148, 0.000469332241199189, 
      0.000294558308707343, 0.000385553400676202, 0.000104447914881357, 
      11.0693842400794, 8.78768774254084, 0.00696532251341454), ln_AC = c(-5.75458660556339, 
      -6.08099044923792, -6.556433525855, -7.32142679754668, -6.89996992823399, 
      -7.8233226797995, 3.70039979980691, 3.70143794229703, -3.99683077355773
      ), ln_SD = c(-5.98887837626238, -6.62678175351058, -7.66419963690747, 
      -8.13003358225542, -7.86083085139947, -9.16682203300101, 2.40418312097106, 
      2.17335162163583, -4.96681136795312), Percent_AC = c(126.401324043689, 
      172.597361244303, 302.758754023937, 224.477834753288, 261.394591157605, 
      383.243109777925, 365.544076706723, 460.934756361151, 263.789326894369
      ), Percent_SD = c(100, 100, 100, 100, 100, 100, 100, 100, 100
      ), TP = c(0, 40, 80, 0, 40, 80, 0, 40, 80)), row.names = c("Tim_0", 
      "Tim_40", "Tim_80", "Jack_0", "Jack_40", "Jack_80", "Tom_0", 
      "Tom_40", "Tom_80"), class = "data.frame")
      DF_final %>% 
        mutate(id = as.character(sapply(1:(nrow(DF_final)/3), rep, 3))) %>% 
        ggplot(aes(x=TP, y=ln_AC)) +
        geom_point() +
        geom_smooth(method = "lm") +
        facet_grid(~id)
      

      Created on 2020-02-06 by the reprex package (v0.3.0)

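      【Comments】:

      • Looks good. I'll play with the graphics and it will be fine. Two questions: is it possible to show the equation and R2 on the plot? And is there a way to store the slope values in a variable? I need them for further calculations.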