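[Posted]: 2020-02-15 05:55:07
[Question]:
I am trying to scrape data with R to get details about some of the listings on the website below, but I get an error that I am not sure how to resolve: Error in open.connection(x, "rb") : HTTP error 404. I tried the httr package and some functions I saw in similar posts, but could not fix it. Am I doing something wrong?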
library(XML)
library(RCurl)
library(curl)
library(rvest)
library(tidyverse)
library(dplyr)
library(httr)
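# Listing page that contains the "popular cars" block
url <- "https://www.sgcarmart.com/new_cars/index.php"
cardetails <- read_html(url)
listing <- html_nodes(cardetails, "#nc_popular_car")
popularcars <- html_nodes(listing, ".link")
count <- length(popularcars)
# One row per car; only CarName is filled in so far
info <- data.frame(CarName = NA, Distributer = NA, Hotline = NA, CountryBuilt = NA, Predecessor = NA, stringsAsFactors = FALSE)
for (i in seq_len(count)) {
  h <- popularcars[[i]]
  # Build the detail-page URL from the link's href attribute
  details_url <- paste0("https://www.sgcarmart.com/new_cars", html_attr(h, "href"))
  details <- read_html(details_url)
  # Store the node's text rather than the node object itself
  info[i, ]$CarName <- html_text(html_node(details, ".link_redbanner"), trim = TRUE)
}
info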
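One possible cause of the 404, assuming the href values on the listing page are relative paths without a leading slash, is that paste0() glues the path directly onto "new_cars" and produces a URL that does not exist. The sketch below uses xml2::url_absolute() to resolve each href against the page URL instead. The inline HTML snippet and the href "newcars_overview.php?CarCode=1" are hypothetical stand-ins for the real page, and safe_text() is an illustrative helper, not part of rvest.

```r
library(rvest)
library(xml2)

base_url <- "https://www.sgcarmart.com/new_cars/index.php"

# Hypothetical stand-in for the listing page (the real markup may differ)
page <- read_html(
  '<div id="nc_popular_car">
     <a class="link" href="newcars_overview.php?CarCode=1">Car A</a>
   </div>'
)

links <- html_nodes(page, "#nc_popular_car .link")
hrefs <- html_attr(links, "href")

# Plain concatenation: no "/" between "new_cars" and the relative path
naive_urls <- paste0("https://www.sgcarmart.com/new_cars", hrefs)

# Resolve each href against the URL of the page it was found on
details_urls <- url_absolute(hrefs, base_url)

# Illustrative helper: return NA instead of stopping when a page cannot be read
safe_text <- function(url, css) {
  tryCatch(
    html_text(html_node(read_html(url), css), trim = TRUE),
    error = function(e) NA_character_
  )
}

print(naive_urls)
print(details_urls)
```

Printing details_url inside the original loop would show whether the requested address is actually the one intended. With the helper above, the loop body could become info[i, ]$CarName <- safe_text(details_urls[i], ".link_redbanner"), so a single bad link leaves an NA instead of aborting the whole run.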
[Discussion]: