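[Posted]: 2020-10-19 16:34:57
[Question]:
I occasionally use time series in R for data analysis, but I am not familiar with plotting using functions such as ARIMA.
The following question stems from a comment that the daily number of COVID cases in the US follows a cubic. It does look that way, and I would like to simply run a cubic regression in order to draw the polynomial curve on the scatter plot. Since this is a time series, I assume that using the lm() function will not work.
The code is as follows: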
options(repr.plot.width=14, repr.plot.height=10)
install.packages('RCurl')
require(repr) # Enables resizing of the plots.
require(RCurl)
require(foreign)
require(tidyverse) # To reshape the df from one wide row of date columns to long format (pivot_longer())
# Extracting the cumulative number of confirmed cases by country from the Johns Hopkins repository:
x = getURL("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")
# read_csv() parses the downloaded text directly; a separate read.csv() call would only be overwritten.
corona = (read_csv(x)
%>% pivot_longer(cols = -c(`Province/State`, `Country/Region`, Lat, Long),
names_to = "date",
values_to = "cases")
%>% select(`Province/State`,`Country/Region`, date, cases)
%>% mutate(date=as.Date(date,format="%m/%d/%y"))
%>% drop_na(cases)
%>% rename(country="Country/Region", provinces="Province/State")
)
cc <- (corona
%>% filter(country %in% c("US"))
)
ccw <- (cc
%>% pivot_wider(names_from="country",values_from="cases")
%>% filter(US>5)
)
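# Daily new cases: first difference of the cumulative series.
first.der<-diff(ccw$US, lag = 1, differences = 1)
# diff() drops the first observation, so each difference belongs to the later date.
# Note that 2:length(x)-1 parses as (2:length(x))-1, i.e. 1:(length(x)-1), hence date[-1] here.
plot(ccw$date[-1], first.der,
pch = 19, cex = 1.2,
ylab='',
xlab='',
main ='Daily COVID-19 cases in US',
col="firebrick",
axes=FALSE,
cex.main=1.5)
abline(h=0)
# Mark the most recent date and its daily count.
abline(v=ccw$date[length(ccw$date)], col='gray90')
abline(h=first.der[length(first.der)], col='firebrick', lty=2, lwd=.5)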
at1 <- seq(min(ccw$date), max(ccw$date), by=2);
axis.Date(1, at=at1, format="%b %d", las=2, cex.axis=0.7)
axis(side=2, seq(min(first.der),max(first.der),1000),
las=2, cex.axis=1)
[Discussion]:
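For what it is worth, lm() can still be used if the only goal is to draw a cubic curve through the points: the fit is ordinary least squares on a numeric day index. The sketch below is a minimal illustration under stated assumptions. It uses synthetic data (the names dates, day and daily are hypothetical stand-ins, not the Johns Hopkins series), so it runs without a download. Because residuals of a time series are usually autocorrelated, the standard errors and p-values from summary(fit) should not be trusted; the curve itself is fine as a descriptive overlay.

```r
# Synthetic stand-in for the daily counts (hypothetical data, for illustration only).
set.seed(1)
dates <- seq(as.Date("2020-03-01"), by = "day", length.out = 200)
day   <- as.numeric(dates - dates[1])   # numeric regressor: days since the first date
daily <- 0.02 * day^3 - 5 * day^2 + 400 * day + rnorm(length(day), sd = 2000)

# Cubic least-squares fit. poly(day, 3) builds orthogonal polynomial terms,
# which is numerically more stable than raw powers of day.
fit <- lm(daily ~ poly(day, 3))

# Scatter plot with the fitted cubic drawn on top.
plot(dates, daily, pch = 19, col = "firebrick", xlab = "", ylab = "",
     main = "Daily cases with cubic fit (synthetic data)")
lines(dates, fitted(fit), lwd = 2, col = "steelblue")
```

With the data from the question, the same idea would presumably be to set daily <- first.der and day <- seq_along(first.der), then add lines(ccw$date[-1], fitted(fit)) after the existing plot() call.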
Tags: r plot time-series regression