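I don't think this will work with rvest, because the content is dynamic rather than static: the table element has not been loaded when the page source is read into R. I was able to do it with RSelenium, based on this tutorial, but note that at a minimum you have to install phantomJS first.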
library(RSelenium)
library(tidyverse)
driver <- rsDriver(browser="firefox", phantomver="2.0.0")
remote_driver <- driver[["client"]]
remote_driver$open()
remote_driver$navigate("https://coinmarketcap.com/currencies/bitcoin/historical-data/")
tab <- remote_driver$findElement(using="class", value="cmc-table")
tab_txt <- tab$getElementText()[[1]]
mytab <- read_delim(tab_txt, delim=" ", col_names=FALSE, skip=1)
mytab$X1 <- with(mytab, paste(X1, X2, X3, sep=" "))
mytab <- mytab %>% select(-c(X2,X3))
names(mytab) <- c("Date", "Open", "High", "Low", "Close", "Volume", "Market Cap")
head(mytab)
# # A tibble: 6 x 7
# Date Open High Low Close Volume `Market Cap`
# <chr> <chr> <chr> <chr> <chr> <chr> <chr>
# 1 Aug 23, 2021 $49,291.68 $50,482.08 $49,074.… $49,546.… $34,305,053,7… $931,244,272,4…
# 2 Aug 22, 2021 $48,869.10 $49,471.61 $48,199.… $49,321.… $25,370,975,3… $926,961,622,3…
# 3 Aug 21, 2021 $49,327.07 $49,717.02 $48,312.… $48,905.… $40,585,205,3… $919,092,181,7…
# 4 Aug 20, 2021 $46,723.12 $49,342.15 $46,650.… $49,339.… $34,706,867,4… $927,189,789,0…
# 5 Aug 19, 2021 $44,741.88 $46,970.76 $43,998.… $46,717.… $37,204,312,2… $877,875,534,8…
# 6 Aug 18, 2021 $44,686.75 $45,952.06 $44,364.… $44,801.… $32,194,123,0… $841,823,296,2…
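Every column in the output above is still character. A minimal sketch of converting them, assuming the values keep the format printed above (dates like "Aug 23, 2021", money like "$49,291.68"). The small stand-in table below uses only the three columns that are printed in full above; on the real `mytab` the same two steps should apply to all seven columns.

```r
# Stand-in for mytab: two rows of the three columns printed in full above.
# The remaining columns are truncated in that output, so they are left out here.
mytab <- data.frame(
  Date = c("Aug 23, 2021", "Aug 22, 2021"),
  Open = c("$49,291.68", "$48,869.10"),
  High = c("$50,482.08", "$49,471.61"),
  stringsAsFactors = FALSE
)

# Parse the date text with an explicit format, e.g. "Aug 23, 2021"
mytab$Date <- readr::parse_date(mytab$Date, format = "%b %d, %Y")

# Every other column is money-like text: strip "$" and "," and convert to double
money_cols <- setdiff(names(mytab), "Date")
mytab[money_cols] <- lapply(mytab[money_cols], readr::parse_number)

str(mytab)
```

`readr::parse_number()` drops the currency symbol and the thousands separators, so no manual `gsub()` is needed.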
You will probably want to be able to click the "Load More" button programmatically. I was able to access the button like this:
button_element <- remote_driver$findElement(using = 'class', value = "x0o17e-0")
though I don't know whether that class name is fixed or changes from session to session. Also, when I did this:
replicate(25, button_element$clickElement())
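which should click the button 25 times, it just popped up a dialog asking me to log in. You can still click the button manually in the browser that RSelenium is driving (you should have a browser window with a red-striped address bar, which is the one controlled by R). When I clicked the button a few times and then re-ran the code that reads in the table, the new table had more rows (i.e., it had responded to the Load More presses).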
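Since the table has to be re-read after each round of manual clicks, it may help to wrap the reading steps in a function. This is only a sketch: `read_cmc_table` and its `table_class` argument are names I made up, and it assumes the same "cmc-table" class and seven-column layout as above. It uses "class name", the full name of the locator strategy.

```r
# Hypothetical helper: re-reads the table from whatever the driven browser shows now.
read_cmc_table <- function(remote_driver, table_class = "cmc-table") {
  tab <- remote_driver$findElement(using = "class name", value = table_class)
  tab_txt <- tab$getElementText()[[1]]

  # Same parsing steps as above: space-delimited text, header row skipped
  mytab <- readr::read_delim(tab_txt, delim = " ", col_names = FALSE, skip = 1)

  # The date is split over three tokens ("Aug", "23,", "2021"); glue them back
  mytab$X1 <- paste(mytab$X1, mytab$X2, mytab$X3, sep = " ")
  mytab <- mytab[, !(names(mytab) %in% c("X2", "X3"))]

  names(mytab) <- c("Date", "Open", "High", "Low", "Close", "Volume", "Market Cap")
  mytab
}

# Usage (needs the live session from above):
# before <- read_cmc_table(remote_driver)
# ... click "Load More" by hand in the driven browser ...
# after <- read_cmc_table(remote_driver)
# nrow(after) > nrow(before)
```

Calling it before and after the manual clicks gives a quick check that the extra rows were actually picked up.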