【发布时间】:2020-02-23 16:59:27
【问题描述】:
我假装能够获得用户在 Google Play 上留下的关于这些应用的所有评论。我有这个代码,他们在那里指出 Web scraping in R through Google playstore 。但问题是您只能获得前 40 条评论。有没有可能拿到app的所有cmet?
```
#Loading the rvest package
library(rvest)
library(magrittr) # for the '%>%' pipe symbols
library(RSelenium) # to get the loaded html of
#Specifying the url for desired website to be scraped
url <- 'https://play.google.com/store/apps/details?
id=com.phonegap.rxpal&hl=en_IN&showAllReviews=true'
# starting local RSelenium (this is the only way to start RSelenium that
is working for me atm)
selCommand <- wdman::selenium(jvmargs = c("-
Dwebdriver.chrome.verboseLogging=true"), retcommand = TRUE)
shell(selCommand, wait = FALSE, minimized = TRUE)
remDr <- remoteDriver(port = 4567L, browserName = "firefox")
remDr$open()
# go to website
remDr$navigate(url)
# get page source and save it as an html object with rvest
html_obj <- remDr$getPageSource(header = TRUE)[[1]] %>% read_html()
# 1) name field (assuming that with 'name' you refer to the name of the
reviewer)
names <- html_obj %>% html_nodes(".kx8XBd .X43Kjb") %>% html_text()
# 2) How much star they got
stars <- html_obj %>% html_nodes(".kx8XBd .nt2C1d [role='img']") %>%
html_attr("aria-label")
# 3) review they wrote
reviews <- html_obj %>% html_nodes(".UD7Dzf") %>% html_text()
# create the df with all the info
review_data <- data.frame(names = names, stars = stars, reviews = reviews,
stringsAsFactors = F)
```
【问题讨论】:
标签: r selenium web-scraping google-play