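【Posted】: 2015-10-03 07:16:21
【Question】:
I am experimenting with sentiment analysis on a dataset. I would like to see whether there are any interesting relationships between message volume and buzz, between message volume and score, and so on.
This is what my dataset looks like: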
> str(data)
'data.frame': 40 obs. of 11 variables:
$ Date Time : POSIXct, format: "2015-07-08 09:10:00" "2015-07-08 09:10:00" ...
$ Subject : chr "MMM" "ACE" "AES" "AFL" ...
$ Sscore : chr "-0.2280" "-0.4415" "1.9821" "-2.9335" ...
$ Smean : chr "0.2593" "0.3521" "0.0233" "0.0035" ...
$ Svscore : chr "-0.2795" "-0.0374" "1.1743" "-0.2975" ...
$ Sdispersion : chr "0.375" "0.500" "1.000" "1.000" ...
$ Svolume : num 8 4 1 1 5 3 2 1 1 2 ...
$ Sbuzz : chr "0.6026" "0.7200" "1.9445" "0.8321" ...
$ Last close : chr "155.430000000" "104.460000000" "13.200000000" "61.960000000" ...
$ Company name: chr "3M Company" "ACE Limited" "The AES Corporation" "AFLAC Inc." ...
$ Date : Date, format: "2015-07-08" "2015-07-08" ...
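I thought of linear regression and wanted to plot it with ggplot, but I think I went wrong somewhere in the code below, because no regression lines appear. Is it because the regression is too weak? I based my code on: code of topchef
Mine is: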
library(ggplot2)
library(reshape2)
# Long format: one row per (Svolume, variable, value)
data.2 = melt(data[3:9], id.vars='Svolume')
ggplot(data.2) +
geom_jitter(aes(value,Svolume, colour=variable)) +
geom_smooth(aes(value,Svolume, colour=variable), method=lm, se=FALSE) +
facet_wrap(~variable, scales="free_x") +
labs(x = "Variables", y = "Svolumes")
But I have probably misunderstood something, because I am not getting what I want. I am very new to R, so I hope someone can help me.
I get this error:
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
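Finally, do you think it is possible to use a different colour for each subject, rather than one colour per variable? And can I add a regression line to each plot?
Thanks for your help.
Sample data: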
Date Time Subject Sscore Smean Svscore Sdispersion Svolume Sbuzz Last close Company name Date
1 2015-07-08 09:10:00 MMM -0.2280 0.2593 -0.2795 0.375 8 0.6026 155.430000000 3M Company 2015-07-08
2 2015-07-08 09:10:00 ACE -0.4415 0.3521 -0.0374 0.500 4 0.7200 104.460000000 ACE Limited 2015-07-08
3 2015-07-07 09:10:00 AES 1.9821 0.0233 1.1743 1.000 1 1.9445 13.200000000 The AES Corporation 2015-07-07
4 2015-07-04 09:10:00 AFL -2.9335 0.0035 -0.2975 1.000 1 0.8321 61.960000000 AFLAC Inc. 2015-07-04
5 2015-07-07 09:10:00 MMM 0.2977 0.2713 -0.7436 0.400 5 0.4895 155.080000000 3M Company 2015-07-07
6 2015-07-07 09:10:00 ACE -0.2331 0.3519 -0.1118 1.000 3 0.7196 103.330000000 ACE Limited 2015-07-07
7 2015-06-28 09:10:00 AES 1.8721 0.0609 1.9100 0.500 2 2.4319 13.460000000 The AES Corporation 2015-06-28
8 2015-07-03 09:10:00 AFL 0.6024 0.0330 -0.2663 1.000 1 0.6822 61.960000000 AFLAC Inc. 2015-07-03
9 2015-07-06 09:10:00 MMM -1.0057 0.2579 -1.3796 1.000 1 0.4531 155.380000000 3M Company 2015-07-06
10 2015-07-06 09:10:00 ACE -0.0263 0.3435 -0.1904 1.000 2 1.3536 103.740000000 ACE Limited 2015-07-06
11 2015-06-19 09:10:00 AES -1.1981 0.1517 1.2063 1.000 2 1.9427 13.850000000 The AES Corporation 2015-06-19
12 2015-07-02 09:10:00 AFL -0.8247 0.0269 1.8635 1.000 5 2.2454 62.430000000 AFLAC Inc. 2015-07-02
13 2015-07-05 09:10:00 MMM -0.4272 0.3107 -0.7970 0.167 6 0.6003 155.380000000 3M Company 2015-07-05
14 2015-07-04 09:10:00 ACE 0.0642 0.3274 -0.0975 0.667 3 1.2932 103.740000000 ACE Limited 2015-07-04
15 2015-06-17 09:10:00 AES 0.1627 0.1839 1.3141 0.500 2 1.9578 13.580000000 The AES Corporation 2015-06-17
16 2015-07-01 09:10:00 AFL -0.7419 0.0316 1.5699 0.250 4 2.0988 62.200000000 AFLAC Inc. 2015-07-01
17 2015-07-04 09:10:00 MMM -0.5962 0.3484 -1.2481 0.667 3 0.4496 155.380000000 3M Company 2015-07-04
18 2015-07-03 09:10:00 ACE 0.8527 0.3085 0.1944 0.833 6 1.3656 103.740000000 ACE Limited 2015-07-03
19 2015-06-15 09:10:00 AES 0.8145 0.1725 0.2939 1.000 1 1.6121 13.350000000 The AES Corporation 2015-06-15
20 2015-06-30 09:10:00 AFL 0.3076 0.0538 -0.0938 1.000 1 0.7071 61.440000000 AFLAC Inc. 2015-06-30
Input:
data <- structure(list(`Date Time` = structure(c(1436361000, 1436361000,
1436274600, 1436015400, 1436274600, 1436274600, 1435497000, 1435929000,
1436188200, 1436188200, 1434719400, 1435842600, 1436101800, 1436015400,
1434546600, 1435756200, 1436015400, 1435929000, 1434373800, 1435669800
), class = c("POSIXct", "POSIXt"), tzone = ""), Subject = c("MMM",
"ACE", "AES", "AFL", "MMM", "ACE", "AES", "AFL", "MMM", "ACE",
"AES", "AFL", "MMM", "ACE", "AES", "AFL", "MMM", "ACE", "AES",
"AFL"), Sscore = c(-0.228, -0.4415, 1.9821, -2.9335, 0.2977,
-0.2331, 1.8721, 0.6024, -1.0057, -0.0263, -1.1981, -0.8247,
-0.4272, 0.0642, 0.1627, -0.7419, -0.5962, 0.8527, 0.8145, 0.3076
), Smean = c(0.2593, 0.3521, 0.0233, 0.0035, 0.2713, 0.3519,
0.0609, 0.033, 0.2579, 0.3435, 0.1517, 0.0269, 0.3107, 0.3274,
0.1839, 0.0316, 0.3484, 0.3085, 0.1725, 0.0538), Svscore = c(-0.2795,
-0.0374, 1.1743, -0.2975, -0.7436, -0.1118, 1.91, -0.2663, -1.3796,
-0.1904, 1.2063, 1.8635, -0.797, -0.0975, 1.3141, 1.5699, -1.2481,
0.1944, 0.2939, -0.0938), Sdispersion = c(0.375, 0.5, 1, 1, 0.4,
1, 0.5, 1, 1, 1, 1, 1, 0.167, 0.667, 0.5, 0.25, 0.667, 0.833,
1, 1), Svolume = c(8L, 4L, 1L, 1L, 5L, 3L, 2L, 1L, 1L, 2L, 2L,
5L, 6L, 3L, 2L, 4L, 3L, 6L, 1L, 1L), Sbuzz = c(0.6026, 0.72,
1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822, 0.4531, 1.3536,
1.9427, 2.2454, 0.6003, 1.2932, 1.9578, 2.0988, 0.4496, 1.3656,
1.6121, 0.7071), `Last close` = c(155.43, 104.46, 13.2, 61.96,
155.08, 103.33, 13.46, 61.96, 155.38, 103.74, 13.85, 62.43, 155.38,
103.74, 13.58, 62.2, 155.38, 103.74, 13.35, 61.44), `Company name` = c("3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc."), Date = structure(c(16624,
16624, 16623, 16620, 16623, 16623, 16614, 16619, 16622, 16622,
16605, 16618, 16621, 16620, 16603, 16617, 16620, 16619, 16601,
16616), class = "Date")), .Names = c("Date Time", "Subject",
"Sscore", "Smean", "Svscore", "Sdispersion", "Svolume", "Sbuzz",
"Last close", "Company name", "Date"), row.names = c("1", "2",
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
"15", "16", "17", "18", "19", "20"), class = "data.frame")
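【Comments】:
-
Could you add some sample data?
-
Done. I figured out how to select only the variables I am interested in (see the edit). Now I just need to know why there are no regression lines on my plots, and how to add the regression expression on or below each plot, using lm(otherVariables ~ data$Svolume). Thanks.
-
I don't get that error; your code works fine. For the second question, you need to add Subject to the melt:
data.3 = melt(data[, 2:9], id.vars = c('Subject','Svolume')), and then change colour=variable to colour=Subject.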
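-
A minimal sketch of the suggestion in the comment above, under some assumptions. The str() output in the question shows the score columns as chr, while the dput() input stores them as numeric. That difference is a likely reason the error could not be reproduced: with character values ggplot2 treats x as discrete and puts each distinct value in its own group, which matches the "Only one unique x value each group" message. The data frame `demo` below is a hypothetical miniature built from the first 12 rows of the sample data (only Sscore and Sbuzz are kept), and the names `demo_long`, `fits` and `eqs` are illustrative, not from the original post.

```r
library(ggplot2)
library(reshape2)

# Hypothetical miniature of the question's data; the score columns are
# deliberately stored as character, as in the str() output.
demo <- data.frame(
  Subject = rep(c("MMM", "ACE", "AES", "AFL"), times = 3),
  Sscore  = c("-0.2280", "-0.4415", "1.9821", "-2.9335", "0.2977", "-0.2331",
              "1.8721", "0.6024", "-1.0057", "-0.0263", "-1.1981", "-0.8247"),
  Sbuzz   = c("0.6026", "0.7200", "1.9445", "0.8321", "0.4895", "0.7196",
              "2.4319", "0.6822", "0.4531", "1.3536", "1.9427", "2.2454"),
  Svolume = c(8, 4, 1, 1, 5, 3, 2, 1, 1, 2, 2, 5),
  stringsAsFactors = FALSE
)

# Convert the character columns to numeric before melting, so that
# ggplot2 sees a continuous x axis.
num_cols <- c("Sscore", "Sbuzz")
demo[num_cols] <- lapply(demo[num_cols], as.numeric)

# Keep Subject as an id variable so it survives the melt.
demo_long <- melt(demo, id.vars = c("Subject", "Svolume"))

# Points coloured by Subject; one regression line per panel because the
# colour mapping is not passed to geom_smooth().
p <- ggplot(demo_long, aes(value, Svolume)) +
  geom_jitter(aes(colour = Subject)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black") +
  facet_wrap(~variable, scales = "free_x") +
  labs(x = "Value", y = "Svolume")
print(p)

# Fit Svolume ~ value separately for each melted variable and format the
# coefficients as a regression expression.
fits <- lapply(split(demo_long, demo_long$variable),
               function(d) coef(lm(Svolume ~ value, data = d)))
eqs <- sapply(fits, function(b)
  sprintf("Svolume = %.2f + %.2f * value", b[1], b[2]))
print(eqs)
```

The strings in `eqs` could then be placed on the panels, for example with geom_text() and a small data frame holding one label per variable. With the full data set, the same conversion would apply to every chr column that is melted.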
Tags: r ggplot2 regression