Python Splinter Star Ratings: Answers

【Question Title】: Python Splinter Star Ratings
【Posted】: 2018-06-21 13:51:10
【Question】:

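Given the star ratings under the "Recent Comments" section here,

I am trying to build a list of star ratings, one for each comment displayed on the page. The trouble is that a star rating object carries no value. For example, I can fetch a single star object by XPath like this: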

from splinter import Browser

# Start a browser session before visiting the page
browser = Browser()

url = 'https://www.greatschools.org/texas/harker-heights/3978-Harker-Heights-Elementary-School/'
browser.visit(url)

# Absolute XPath to the first star of one review
astar = browser.find_by_xpath('/html/body/div[5]/div[4]/div[2]/div[11]/div/div/div[2]/div/div/div[2]/div/div[2]/div[3]/div/div[2]/div[1]/div[2]/span/span[1]')

The problem is that I cannot seem to read the state of the astar object, that is, whether the star is filled or not.

Here is the HTML:

<div class="answer">
 <span class="five-stars">
  <span class="icon-star filled-star"></span>
  <span class="icon-star filled-star"></span>
  <span class="icon-star filled-star"></span>
  <span class="icon-star filled-star"></span>
  <span class="icon-star filled-star"></span>
 </span>
</div>

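Update: some comments have no star rating at all, so I need to be able to tell whether a particular comment has a star rating and, if it does, what the rating is. This seems to help with at least getting a list of all the stars. I used it like so: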

stars = browser.find_by_css('span[class="icon-star filled-star"]')

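So, if I could get a list showing whether each comment has a star rating (something like ratings = [1, 0, 1, 1, ...]) and the sequence of all the stars (i.e. ['Filled', 'Filled', 'Empty', ...]), I think I could piece the ratings together from those two lists.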
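The piecing-together step described above could look roughly like the sketch below. It is only an illustration under assumptions: the function name `ratings_per_comment` is hypothetical, every rated comment is assumed to contribute exactly five entries to the star sequence, and the sequence is assumed to contain both 'Filled' and 'Empty' entries (the selector shown earlier matches filled stars only, so a broader selector would be needed to build such a sequence).

```python
def ratings_per_comment(has_rating, star_states, stars_per_rating=5):
    """Combine per-comment flags with a flat star sequence.

    has_rating: one 1/0 flag per comment (1 = the comment has a star rating).
    star_states: flat list of 'Filled'/'Empty' strings, assumed to hold
        stars_per_rating entries for each rated comment, in page order.
    Returns one entry per comment: the filled-star count, or None if unrated.
    """
    rated = sum(1 for flag in has_rating if flag)
    if len(star_states) != rated * stars_per_rating:
        raise ValueError('star sequence length does not match the flags')

    result = []
    pos = 0
    for flag in has_rating:
        if flag:
            chunk = star_states[pos:pos + stars_per_rating]
            result.append(chunk.count('Filled'))
            pos += stars_per_rating
        else:
            result.append(None)
    return result


flags = [1, 0, 1]
stars = ['Filled'] * 5 + ['Filled', 'Filled', 'Filled', 'Empty', 'Empty']
print(ratings_per_comment(flags, stars))  # prints [5, None, 3]
```

Using None for unrated comments keeps them distinguishable from a rating of zero filled stars.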

【Comments】:

Tags: python-3.x splinter

【Solution 1】:

One solution: read the html attribute of each rating object, like this:

# Get the total number of comments
allcoms = len(browser.find_by_text('Overall experience'))

# When a pop-up box appears, the second-level container is div[4] instead of div[5]
if browser.is_element_present_by_xpath('/html/body/div[5]/div[4]/div[2]/div[11]/div/div/div[2]/div/div/div[2]/div/div[2]/div[1]/div/div[2]'):
    use = '4'
else:
    use = '5'

# Loop through all comments and gather their text into a list
comments = []
for n in range(allcoms):
    xpath = '/html/body/div[5]/div[' + use + ']/div[2]/div[11]/div/div/div[2]/div/div/div[2]/div/div[2]/div[' + str(n + 1) + ']/div/div[2]'
    comments.append(browser.find_by_xpath(xpath).value)

# Get all corresponding star ratings
# https://stackoverflow.com/questions/46468030/how-select-class-div-tag-in-splinter
ratingcode = []
ratings = browser.find_by_css('span[class="five-stars"]')
# The first 2 matches and the last 3 are other ratings on the page.
# Starting at index 2 and taking len(comments) entries skips both groups.
for a in range(2, len(comments) + 2):
    ratingcode.append(ratings[a].html)
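
The loop above leaves raw markup in ratingcode rather than numbers. A minimal sketch of turning each entry into a star count follows, assuming each entry is the inner HTML of one five-stars span as shown in the question. The helper names `count_filled_stars` and `to_numeric_ratings` are hypothetical, and the `empty-star` class in the sample is a made-up stand-in, since the question only shows the markup for filled stars.

```python
import re

# Captures the value of every class="..." attribute in a snippet.
CLASS_ATTR = re.compile(r'class="([^"]*)"')


def count_filled_stars(inner_html):
    """Count the elements whose class list contains the token 'filled-star'."""
    return sum(
        'filled-star' in class_value.split()
        for class_value in CLASS_ATTR.findall(inner_html)
    )


def to_numeric_ratings(ratingcode):
    """Map each HTML snippet in ratingcode to its number of filled stars."""
    return [count_filled_stars(snippet) for snippet in ratingcode]


# Sample snippets modelled on the HTML in the question.
sample_five = '<span class="icon-star filled-star"></span>' * 5
sample_three = ('<span class="icon-star filled-star"></span>' * 3
                + '<span class="icon-star empty-star"></span>' * 2)
print(to_numeric_ratings([sample_five, sample_three]))  # prints [5, 3]
```

Splitting the class value into tokens, rather than searching for the substring, avoids counting a class that merely contains the text filled-star as part of a longer name.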

【Discussion】: