【Posted】:2021-06-02 13:24:14
【Question】:
I am scraping this website. I am particularly interested in extracting the content found in the last script node (script node snippet). So far, I have tried the following:
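library(rvest) # read_html(), html_nodes(), html_text(), html_attr(), %>%
library(xml2)  # xml_find_first()
url <- "https://insolvencyinsider.ca/filing/"
ii <- read_html(url)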
fwp <- ii %>%
  html_nodes("body") %>%
  xml_find_first(xpath = "/script[15]") %>%
  html_text() # Not text so I wouldn't expect this to work.
#> character (empty)
fwp <- ii %>%
  html_nodes("body") %>%
  xml_find_first(xpath = "/script[15]") %>%
  html_attr("window.FWP_JSON") # Don't think this makes sense since it's not an attribute?
#> chr NA
【Comments】:
Tags: r web-scraping rvest
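
One possible direction, offered as a minimal sketch rather than a confirmed solution for this site: select the script element by its content instead of by position, read the JavaScript source with html_text(), and parse the object literal as JSON. The inline HTML string below is a hypothetical stand-in for the real page, the JSON content in it is made up, and the assumption that the script has the shape `window.FWP_JSON = {...};` has not been checked against the live page. jsonlite is an extra package not used in the question.

```r
library(rvest)     # read_html(), html_text()
library(xml2)      # xml_find_first()
library(jsonlite)  # fromJSON()

# Hypothetical stand-in for the real page; the JSON content here is made up.
html <- '<html><body>
<script>var unrelated = 1;</script>
<script>window.FWP_JSON = {"preload_data": {"total_rows": 3}}; window.FWP_HTTP = {};</script>
</body></html>'

page <- read_html(html)

# Match the script by its content instead of its position. Note the leading
# "//": a path that starts with a single "/" is resolved from the document
# root, so "/script[15]" cannot match a <script> nested inside <body>.
node <- xml_find_first(page, "//script[contains(., 'window.FWP_JSON')]")

# html_text() returns the JavaScript source held inside the <script> element.
js <- html_text(node)

# Assumed layout: "window.FWP_JSON = {...};". Capture the object literal up to
# the first "};" that follows it. (?s) lets "." span newlines.
json_txt <- sub("(?s)^.*?window\\.FWP_JSON\\s*=\\s*(\\{.*?\\});.*$", "\\1",
                js, perl = TRUE)

# Parse the captured text into an R list.
fwp <- fromJSON(json_txt)
str(fwp)
```

The regular expression stops at the first `};`, so it would truncate the capture if a string value inside the JSON happened to contain that sequence; a real page may need a stricter pattern. For the live site, `page` would come from `read_html(url)` instead of the inline string.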