【发布时间】:2021-04-08 13:00:08
【问题描述】:
我在another question 上发布了如何使用“plotly”库在 R 中绘制交互式时间序列。我收到了答复并尝试运行代码 - 但是,此代码已运行了 3 个小时。数据不是很大,我在 plotly 网站(https://plotly.com/r/cumulative-animations/ 和https://plotly.com/r/custom-buttons/)上尝试了一个类似的例子,它们似乎运行良好。
这是我试图运行的代码(来自我之前的回答):
#load libraries and generate artificial time series data (this part works)
library(xts)
library(ggplot2)
library(dplyr)
library(plotly)
#create data
#time series 1
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")
property_damages_in_dollars <- rnorm(731,100,10)
final_data <- data.frame(date_decision_made, property_damages_in_dollars)
final_data %>%
mutate(date_decision_made = as.Date(date_decision_made)) %>%
add_count(week = format(date_decision_made, "%W-%y"))
final_data$class = "time_series_1"
#time series 2
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")
property_damages_in_dollars <- rnorm(731,10,10)
final_data_2 <- data.frame(date_decision_made, property_damages_in_dollars)
final_data_2 %>%
mutate(date_decision_made = as.Date(date_decision_made)) %>%
add_count(week = format(date_decision_made, "%W-%y"))
final_data_2$class = "time_series_2"
#combine
data = rbind(final_data, final_data_2)
第 1 部分:
#part 1:
data <- data %>%
mutate(tmp_date = as.numeric(as.Date(date_decision_made, format = "%Y/%m/%d")))
accumulate_by <- function(dat, var) {
var <- lazyeval::f_eval(var, dat)
lvls <- plotly:::getLevels(var)
dats <- lapply(seq_along(lvls), function(x) {
cbind(dat[var %in% lvls[seq(1, x)], ], frame = lvls[[x]])
})
dplyr::bind_rows(dats)
}
data <- data %>% accumulate_by(~tmp_date)
fig <- data %>%
plot_ly(
x = ~tmp_date,
y = ~property_damages_in_dollars,
split = ~class,
frame = ~frame,
type = 'scatter',
mode = 'lines',
line = list(simplyfy = F)
)
fig
第 2 部分:
#part 2
updatemenus <- list(
list(
active = -1,
type= 'buttons',
buttons = list(
list(
label = "time_series_1",
method = "update",
args = list(list(visible = c(FALSE, TRUE)),
list(title = "series 1",
annotations = list(c(), high_annotations)))),
list(
label = "time_series_2",
method = "update",
args = list(list(visible = c(TRUE, FALSE)),
list(title = "series 2",
annotations = list(low_annotations, c() )))),
)
)
)
fig <- data %>% plot_ly(type = 'scatter', mode = 'lines')
fig <- fig %>% add_lines(x=~date_decision_made,
y=~property_damages_in_dollars, name="High",
line=list(color="#33CFA5"))
fig <- fig %>% add_lines(x=~date_decision_made,
y=~property_damage_in_dollars, name="Low",
line=list(color="#F06A6A"))
fig <- fig %>% layout(title = "Apple", showlegend=FALSE,
xaxis=list(title="Date"),
yaxis=list(title="Price ($)"),
updatemenus=updatemenus)
fig
谁能告诉我我做错了什么?还是我的电脑和 R 控制台有问题?数据不是很大,过去我对类似大小的数据运行过类似的程序,没有太多麻烦。
谢谢
注意:会话信息
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] plotly_4.9.2.1 dplyr_1.0.2 ggplot2_3.3.2 xts_0.12.1 zoo_1.8-8
loaded via a namespace (and not attached):
[1] tinytex_0.26 tidyselect_1.1.0 xfun_0.15 purrr_0.3.4 reshape2_1.4.4 splines_4.0.2
[7] lattice_0.20-41 colorspace_1.4-1 vctrs_0.3.2 generics_0.0.2 viridisLite_0.3.0 htmltools_0.5.0
[13] stats4_4.0.2 yaml_2.2.1 survival_3.2-7 prodlim_2019.11.13 rlang_0.4.7 ModelMetrics_1.2.2.2
[19] pillar_1.4.6 glue_1.4.1 withr_2.3.0 xgboost_1.1.1.1 foreach_1.5.1 lifecycle_0.2.0
[25] plyr_1.8.6 lava_1.6.8 stringr_1.4.0 timeDate_3043.102 munsell_0.5.0 gtable_0.3.0
[31] recipes_0.1.13 htmlwidgets_1.5.2 codetools_0.2-16 crosstalk_1.1.0.1 caret_6.0-86 class_7.3-17
[37] Rcpp_1.0.5 scales_1.1.1 ipred_0.9-9 jsonlite_1.7.1 digest_0.6.25 stringi_1.4.6
[43] grid_4.0.2 tools_4.0.2 magrittr_1.5 lazyeval_0.2.2 tibble_3.0.3 tidyr_1.1.0
[49] crayon_1.3.4 pkgconfig_2.0.3 MASS_7.3-53 ellipsis_0.3.1 Matrix_1.2-18 data.table_1.12.8
[55] pROC_1.16.2 lubridate_1.7.9 gower_0.2.2 httr_1.4.2 rstudioapi_0.11 iterators_1.0.13
[61] R6_2.4.1 rpart_4.1-15 nnet_7.3-14 nlme_3.1-149 compiler_4.0.2
【问题讨论】:
-
您是否尝试过隔离代码的哪一部分花费了这么长时间?我个人会停止它,创建数据(并发现是否是减速),然后逐步尝试找出问题。
-
@Elin:谢谢你的回复。是的,这两个部分都是 plot_ly 函数。他们也需要很长时间吗?
-
这段代码的某些方面对我来说没有意义(多次计算相同的东西,一遍又一遍地对相同的数据使用 as.Date()。不确定是否有必要也转换为数字。您确定不是 cumulate 函数是问题吗?github.com/ropensci/plotly/issues/957 也尝试切换到使用原始日期而不是 tmp_date 的 x..
-
仅看这些图表,我确实认为您有很多数据。如果您只是尝试绘制前 4 个月或类似的图表会怎样?
标签: r time time-series plotly data-visualization