【Question title】: Multiple linear regression for a dataset in R with ggplot2
【Posted】: 2015-10-03 07:16:21
【Question description】:

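I am experimenting with sentiment analysis on a dataset. Here I want to see whether there is anything interesting in the relationship between message volume and buzz, message volume and score, and so on.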

This is what my dataset looks like:

> str(data)
'data.frame':   40 obs. of  11 variables:
 $ Date Time   : POSIXct, format: "2015-07-08 09:10:00" "2015-07-08 09:10:00" ...
 $ Subject     : chr  "MMM" "ACE" "AES" "AFL" ...
 $ Sscore      : chr  "-0.2280" "-0.4415" "1.9821" "-2.9335" ...
 $ Smean       : chr  "0.2593" "0.3521" "0.0233" "0.0035" ...
 $ Svscore     : chr  "-0.2795" "-0.0374" "1.1743" "-0.2975" ...
 $ Sdispersion : chr  "0.375" "0.500" "1.000" "1.000" ...
 $ Svolume     : num  8 4 1 1 5 3 2 1 1 2 ...
 $ Sbuzz       : chr  "0.6026" "0.7200" "1.9445" "0.8321" ...
 $ Last close  : chr  "155.430000000" "104.460000000" "13.200000000" "61.960000000" ...
 $ Company name: chr  "3M Company" "ACE Limited" "The AES Corporation" "AFLAC Inc." ...
 $ Date        : Date, format: "2015-07-08" "2015-07-08" ...

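I was thinking of a linear regression, so I wanted to use ggplot. I used the code below, but I think I went wrong somewhere, because no regression lines appear. Is it because the regression is too weak? I based my code on: code of topchef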

Mine is:

library(ggplot2)
require(ggplot2)
library("reshape2")
require(reshape2)
data.2 = melt(data[3:9], id.vars='Svolume')
ggplot(data.2) +
  geom_jitter(aes(value,Svolume, colour=variable),) + geom_smooth(aes(value,Svolume, colour=variable), method=lm, se=FALSE) +
  facet_wrap(~variable, scales="free_x") +
  labs(x = "Variables", y = "Svolumes")

But I have probably misunderstood something, because I am not getting what I want. I am very new to R, so I hope someone can help me.

I get this message:

    geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?

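Finally, do you think it is possible to use a different colour for each Subject, rather than one colour per variable? And can I add a regression line to each plot?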

Thanks for your help.

Sample data:

       Date Time Subject  Sscore  Smean Svscore Sdispersion Svolume  Sbuzz    Last close        Company name       Date
1  2015-07-08 09:10:00     MMM -0.2280 0.2593 -0.2795       0.375       8 0.6026 155.430000000          3M Company 2015-07-08
2  2015-07-08 09:10:00     ACE -0.4415 0.3521 -0.0374       0.500       4 0.7200 104.460000000         ACE Limited 2015-07-08
3  2015-07-07 09:10:00     AES  1.9821 0.0233  1.1743       1.000       1 1.9445  13.200000000 The AES Corporation 2015-07-07
4  2015-07-04 09:10:00     AFL -2.9335 0.0035 -0.2975       1.000       1 0.8321  61.960000000          AFLAC Inc. 2015-07-04
5  2015-07-07 09:10:00     MMM  0.2977 0.2713 -0.7436       0.400       5 0.4895 155.080000000          3M Company 2015-07-07
6  2015-07-07 09:10:00     ACE -0.2331 0.3519 -0.1118       1.000       3 0.7196 103.330000000         ACE Limited 2015-07-07
7  2015-06-28 09:10:00     AES  1.8721 0.0609  1.9100       0.500       2 2.4319  13.460000000 The AES Corporation 2015-06-28
8  2015-07-03 09:10:00     AFL  0.6024 0.0330 -0.2663       1.000       1 0.6822  61.960000000          AFLAC Inc. 2015-07-03
9  2015-07-06 09:10:00     MMM -1.0057 0.2579 -1.3796       1.000       1 0.4531 155.380000000          3M Company 2015-07-06
10 2015-07-06 09:10:00     ACE -0.0263 0.3435 -0.1904       1.000       2 1.3536 103.740000000         ACE Limited 2015-07-06
11 2015-06-19 09:10:00     AES -1.1981 0.1517  1.2063       1.000       2 1.9427  13.850000000 The AES Corporation 2015-06-19
12 2015-07-02 09:10:00     AFL -0.8247 0.0269  1.8635       1.000       5 2.2454  62.430000000          AFLAC Inc. 2015-07-02
13 2015-07-05 09:10:00     MMM -0.4272 0.3107 -0.7970       0.167       6 0.6003 155.380000000          3M Company 2015-07-05
14 2015-07-04 09:10:00     ACE  0.0642 0.3274 -0.0975       0.667       3 1.2932 103.740000000         ACE Limited 2015-07-04
15 2015-06-17 09:10:00     AES  0.1627 0.1839  1.3141       0.500       2 1.9578  13.580000000 The AES Corporation 2015-06-17
16 2015-07-01 09:10:00     AFL -0.7419 0.0316  1.5699       0.250       4 2.0988  62.200000000          AFLAC Inc. 2015-07-01
17 2015-07-04 09:10:00     MMM -0.5962 0.3484 -1.2481       0.667       3 0.4496 155.380000000          3M Company 2015-07-04
18 2015-07-03 09:10:00     ACE  0.8527 0.3085  0.1944       0.833       6 1.3656 103.740000000         ACE Limited 2015-07-03
19 2015-06-15 09:10:00     AES  0.8145 0.1725  0.2939       1.000       1 1.6121  13.350000000 The AES Corporation 2015-06-15
20 2015-06-30 09:10:00     AFL  0.3076 0.0538 -0.0938       1.000       1 0.7071  61.440000000          AFLAC Inc. 2015-06-30

Input

data <- structure(list(`Date Time` = structure(c(1436361000, 1436361000, 
1436274600, 1436015400, 1436274600, 1436274600, 1435497000, 1435929000, 
1436188200, 1436188200, 1434719400, 1435842600, 1436101800, 1436015400, 
1434546600, 1435756200, 1436015400, 1435929000, 1434373800, 1435669800
), class = c("POSIXct", "POSIXt"), tzone = ""), Subject = c("MMM", 
"ACE", "AES", "AFL", "MMM", "ACE", "AES", "AFL", "MMM", "ACE", 
"AES", "AFL", "MMM", "ACE", "AES", "AFL", "MMM", "ACE", "AES", 
"AFL"), Sscore = c(-0.228, -0.4415, 1.9821, -2.9335, 0.2977, 
-0.2331, 1.8721, 0.6024, -1.0057, -0.0263, -1.1981, -0.8247, 
-0.4272, 0.0642, 0.1627, -0.7419, -0.5962, 0.8527, 0.8145, 0.3076
), Smean = c(0.2593, 0.3521, 0.0233, 0.0035, 0.2713, 0.3519, 
0.0609, 0.033, 0.2579, 0.3435, 0.1517, 0.0269, 0.3107, 0.3274, 
0.1839, 0.0316, 0.3484, 0.3085, 0.1725, 0.0538), Svscore = c(-0.2795, 
-0.0374, 1.1743, -0.2975, -0.7436, -0.1118, 1.91, -0.2663, -1.3796, 
-0.1904, 1.2063, 1.8635, -0.797, -0.0975, 1.3141, 1.5699, -1.2481, 
0.1944, 0.2939, -0.0938), Sdispersion = c(0.375, 0.5, 1, 1, 0.4, 
1, 0.5, 1, 1, 1, 1, 1, 0.167, 0.667, 0.5, 0.25, 0.667, 0.833, 
1, 1), Svolume = c(8L, 4L, 1L, 1L, 5L, 3L, 2L, 1L, 1L, 2L, 2L, 
5L, 6L, 3L, 2L, 4L, 3L, 6L, 1L, 1L), Sbuzz = c(0.6026, 0.72, 
1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822, 0.4531, 1.3536, 
1.9427, 2.2454, 0.6003, 1.2932, 1.9578, 2.0988, 0.4496, 1.3656, 
1.6121, 0.7071), `Last close` = c(155.43, 104.46, 13.2, 61.96, 
155.08, 103.33, 13.46, 61.96, 155.38, 103.74, 13.85, 62.43, 155.38, 
103.74, 13.58, 62.2, 155.38, 103.74, 13.35, 61.44), `Company name` = c("3M Company", 
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company", 
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company", 
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company", 
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company", 
"ACE Limited", "The AES Corporation", "AFLAC Inc."), Date = structure(c(16624, 
16624, 16623, 16620, 16623, 16623, 16614, 16619, 16622, 16622, 
16605, 16618, 16621, 16620, 16603, 16617, 16620, 16619, 16601, 
16616), class = "Date")), .Names = c("Date Time", "Subject", 
"Sscore", "Smean", "Svscore", "Sdispersion", "Svolume", "Sbuzz", 
"Last close", "Company name", "Date"), row.names = c("1", "2", 
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", 
"15", "16", "17", "18", "19", "20"), class = "data.frame")

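【Comments】:

  • Could you add some sample data?
  • Done. I figured out how to select only the variables I am interested in (edited). Now I just need to know why there is no regression line on my plots, and how to add the regression expression on or below each plot? Using lm(otherVariables ~ data$Svolume). Thanks
  • I do not get that error; your code works fine. For the second question, you need to add Subject to the melt, data.3 = melt(data[, 2:9], id.vars = c('Subject','Svolume')), and then change colour=variable to Subject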
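The suggestion in the last comment can be sketched as follows. This is a minimal sketch, not the commenter's exact code: to keep it self-contained it uses a small data frame `d` (an illustrative name) built from the first eight rows of the sample data, with only Sscore and Sbuzz as measured variables, and it keeps `group = 1` on the smoother so that each facet gets a single regression line rather than one per Subject.

```r
library(ggplot2)
library(reshape2)

# First eight rows of the question's sample data, reduced to a few columns
d <- data.frame(
  Subject = rep(c("MMM", "ACE", "AES", "AFL"), 2),
  Sscore  = c(-0.2280, -0.4415, 1.9821, -2.9335, 0.2977, -0.2331, 1.8721, 0.6024),
  Svolume = c(8, 4, 1, 1, 5, 3, 2, 1),
  Sbuzz   = c(0.6026, 0.7200, 1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822)
)

# Keep Subject as an id variable so it survives the reshape
d.long <- melt(d, id.vars = c("Subject", "Svolume"))

p <- ggplot(d.long, aes(x = value, y = Svolume)) +
  geom_jitter(aes(colour = Subject)) +                      # one colour per Subject
  geom_smooth(aes(group = 1), method = "lm", se = FALSE) +  # one line per facet
  facet_wrap(~variable, scales = "free_x") +
  labs(x = "Variables", y = "Svolumes")

print(p)
```

With the full data set from the question, the same pattern would apply to `melt(data[, 2:9], id.vars = c('Subject','Svolume'))`.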

Tags: r ggplot2 regression


【Solution 1】:

Note the hint in the message: Maybe you want aes(group = 1). All I did was add group = 1 to the aes of geom_smooth.

ggplot(data.2) +
  geom_jitter(aes(value,Svolume, colour=variable)) + 
  geom_smooth(aes(value,Svolume, colour=variable, group = 1), method=lm, se=FALSE) +
  facet_wrap(~variable, scales="free_x") +
  labs(x = "Variables", y = "Svolumes")
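
A possible contributing factor, offered as a guess rather than a confirmed diagnosis: the str() output in the question lists Sscore, Smean, Svscore, Sdispersion, Sbuzz and Last close as chr, while the dput() data stores them as numbers. With character columns, melt() produces a character value column, ggplot2 treats x as discrete, and every distinct value ends up in its own group. If that is the situation, converting the columns to numeric before melting may be needed. The sketch below uses a toy data frame `d` (an illustrative name) that mimics the chr columns.

```r
library(reshape2)

# Toy data mimicking the str() output: measured columns stored as character
d <- data.frame(
  Sscore  = c("-0.2280", "-0.4415", "1.9821", "-2.9335"),
  Svolume = c(8, 4, 1, 1),
  Sbuzz   = c("0.6026", "0.7200", "1.9445", "0.8321"),
  stringsAsFactors = FALSE
)

# Convert every character column to numeric before reshaping
is_chr <- vapply(d, is.character, logical(1))
d[is_chr] <- lapply(d[is_chr], as.numeric)

d.long <- melt(d, id.vars = "Svolume")
str(d.long)  # value should now be num, so x is a continuous scale
```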

Some unsolicited advice

Here is how I would write the ggplot code:

library(ggplot2)
require(reshape2)

data.2 = melt(data[3:9], id.vars='Svolume')

ggplot(data.2) +
  aes(x = value, y = Svolume, colour = variable) +
  geom_jitter() +
  geom_smooth(method=lm, se=FALSE, aes(group = 1)) +
  facet_wrap(~variable, scales="free_x") +
  labs(x = "Variables", y = "Svolumes")

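【Comments】:

  • Thank you very much for the solution and the tips. Sorry, I am very new to this forum. I will read what you sent me. Also, how can I add the regression expression to each plot?
  • I think you need to find the regression coefficients manually with lm and then add a geom_text layer. That is the only way I know of. Also, welcome to SO!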
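
The approach from the last comment could look roughly like this. It is a minimal sketch under assumptions: the small data frame `d` is built from the first eight rows of the sample data, and the names `eq`, `pieces` and `label` are illustrative. One lm fit is run per facet, and the coefficients are formatted into a label that geom_text places in the top-left corner of each panel.

```r
library(ggplot2)
library(reshape2)

# First eight rows of the question's sample data, two measured variables
d <- data.frame(
  Sscore  = c(-0.2280, -0.4415, 1.9821, -2.9335, 0.2977, -0.2331, 1.8721, 0.6024),
  Svolume = c(8, 4, 1, 1, 5, 3, 2, 1),
  Sbuzz   = c(0.6026, 0.7200, 1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822)
)
d.long <- melt(d, id.vars = "Svolume")

# Fit Svolume ~ value separately for each variable and build a text label
pieces <- split(d.long, d.long$variable)
eq <- do.call(rbind, lapply(names(pieces), function(v) {
  cf <- unname(coef(lm(Svolume ~ value, data = pieces[[v]])))
  data.frame(
    variable = v,
    label = sprintf("y = %.2f %+.2f * x", cf[1], cf[2]),
    stringsAsFactors = FALSE
  )
}))
# Match the factor levels of the plotted data so facets line up
eq$variable <- factor(eq$variable, levels = levels(d.long$variable))

p <- ggplot(d.long, aes(x = value, y = Svolume)) +
  geom_point(aes(colour = variable)) +
  geom_smooth(aes(group = 1), method = "lm", se = FALSE) +
  # -Inf / Inf pin the label to the top-left corner of each panel
  geom_text(data = eq, aes(x = -Inf, y = Inf, label = label),
            hjust = -0.1, vjust = 1.5, inherit.aes = FALSE) +
  facet_wrap(~variable, scales = "free_x")

print(p)
```

Because the label data frame carries the same `variable` column as the plotted data, facet_wrap routes each equation to its own panel.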