【Posted】: 2016-10-16 11:29:00
【Question】:
Please do not downvote: this question differs from my previous one because I am using different logic here.
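I am trying to iterate over all user reviews (class "partial_entry") on this page: https://www.tripadvisor.com/Airline_Review-d8729164-Reviews-Cheap-Flights-or560-TAP-Portugal#REVIEWS
If a review is not in English, I want to print its English translation. Otherwise, if the review is already in English, I want to print the English text itself. However, the code skips some comments (it does not print them), and the output also shows comments being printed twice.
There are 10 reviews/comments on this page (translated + untranslated), and all of them should be printed.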
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.maximize_window()
url="https://www.tripadvisor.com/Airline_Review-d8729164-Reviews-Cheap-Flights-or560-TAP-Portugal#REVIEWS"
driver.get(url)
ctr=0

def expand_reviews(driver):
    # TRYING TO EXPAND REVIEWS (& CLOSE A POPUP)
    try:
        driver.find_element_by_class_name("moreLink").click()
    except:
        print "err"
    try:
        driver.find_element_by_class_name("ui_close_x").click()
    except:
        print "err2"
    try:
        driver.find_element_by_class_name("moreLink").click()
    except:
        print "err3"

# FIRST EXPAND THE REVIEWS BY CLICKING "MORE" BUTTON
expand_reviews(driver)

for j in driver.find_elements_by_xpath("//div[@class='wrap']"): # FIND ALL REVIEW ELEMENTS
    for ent in j.find_elements_by_xpath('.//p[@class="partial_entry"]'): # FIND REVIEW TEXT
        # FIRST CHECK IF TRANSLATION IS AVAILABLE (I.E. NON ENGLISH COMMENTS)
        if j.find_elements_by_css_selector('#REVIEWS .googleTranslation>.link'):
            #print 'NOW PRINTING TRANSLATED COMMENTS'
            gt= driver.find_elements(By.CSS_SELECTOR,"#REVIEWS .googleTranslation>.link")
            size=len(gt)
            while (ctr<size):
                for i in gt:
                    try:
                        if not i.is_displayed():
                            continue
                        driver.execute_script("arguments[0].click()",i)
                        wait = WebDriverWait(driver, 10)
                        wait.until(EC.element_to_be_clickable((By.XPATH, ".//span[@class = 'ui_overlay ui_modal ']//div[@class='entry']")))
                        com= driver.find_element_by_xpath(".//span[@class = 'ui_overlay ui_modal ']//div[@class='entry']")
                        print com.text
                        print "++" * 60
                        time.sleep(5)
                        driver.find_element_by_class_name("ui_close_x").click()
                        time.sleep(5)
                        #loop+=1
                    except Exception as e:
                        print "skipped"
                        pass
                    ctr+=1
        # COMMENT ALREADY IN ENGLISH, PRINT AS IT IS
        else:
            print ent
            print "="*60
driver.quit()
================================== Output ==========================
<selenium.webdriver.remote.webelement.WebElement (session="15b6c83088a289e59c544a2c7787d27d", element="0.40753995907133644-28")>
============================================================
<selenium.webdriver.remote.webelement.WebElement (session="15b6c83088a289e59c544a2c7787d27d", element="0.40753995907133644-29")>
============================================================
<selenium.webdriver.remote.webelement.WebElement (session="15b6c83088a289e59c544a2c7787d27d", element="0.40753995907133644-30")>
============================================================
<selenium.webdriver.remote.webelement.WebElement (session="15b6c83088a289e59c544a2c7787d27d", element="0.40753995907133644-31")>
============================================================
<selenium.webdriver.remote.webelement.WebElement (session="15b6c83088a289e59c544a2c7787d27d", element="0.40753995907133644-32")>
============================================================
On my change my flight without asking my opinion or offer another solution without paying extra I stay more than 10 hours in boarding of room I have the urge to have something to eat I haven not even able to rest after my flight c is inadmissible night I no longer would resume this company and would not advise a person to take
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A little apprehensive before but quickly lifted. Very welcome and good service from the PNC, hot meal and good even for this short flight (1h50). Good punctuality and boarding more efficient
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Everything normal. Aircraft clean and almost full. Embarking on time, regular. Arrive slightly earlier. friendly and courteous staff. On board it was given a snack.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
In the recent past I have traveled a few times from Venice to Lisbon and from Venice to Oporto via Lisbon. Good facilities on land and aboard; friendly service, clean air, punctuality and competitive rates. recommended
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Sympathy and competence. The company strives to make passengers as comfortable as possible.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
On my change my flight without asking my opinion or offer another solution without paying extra I stay more than 10 hours in boarding of room I have the urge to have something to eat I haven not even able to rest after my flight c is inadmissible night I no longer would resume this company and would not advise a person to take
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A little apprehensive before but quickly lifted. Very welcome and good service from the PNC, hot meal and good even for this short flight (1h50). Good punctuality and boarding more efficient
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Everything normal. Aircraft clean and almost full. Embarking on time, regular. Arrive slightly earlier. friendly and courteous staff. On board it was given a snack.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
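Two things in the posted code line up with this output. The `<selenium.webdriver.remote.webelement.WebElement ...>` lines appear because `print ent` prints the element object rather than `ent.text`. Also, `gt` is collected page-wide with `driver.find_elements` inside the per-review loop, which may be one reason translated comments repeat. The sketch below is only an illustration of handling each review container exactly once; it is written in Python 3 (the question uses Python 2), it is duck-typed so it can run without a browser, and both `collect_review_texts` and the `fetch_translation` callback are hypothetical names, not part of Selenium.

```python
# Sketch only: Python 3, duck-typed so it runs without a browser.
CSS = "css selector"  # the string value of selenium's By.CSS_SELECTOR


def collect_review_texts(reviews, fetch_translation):
    """Return at most one text per review container.

    reviews: iterable of objects exposing find_elements(by, value),
        e.g. the WebElements matched by //div[@class='wrap'].
    fetch_translation: hypothetical callback that receives one translation
        link, opens it, and returns the translated text.
    """
    texts = []
    for review in reviews:
        # Look for a translation link inside this review only.
        links = review.find_elements(CSS, ".googleTranslation>.link")
        if links:
            texts.append(fetch_translation(links[0]))
            continue
        # No link: treat the review as English and read its text.
        entries = review.find_elements(CSS, "p.partial_entry")
        if entries:
            texts.append(entries[0].text)  # .text, not the element object
    return texts
```

With a real driver, `reviews` could be `driver.find_elements(By.XPATH, "//div[@class='wrap']")`, and `fetch_translation` would wrap the click, wait, read and close steps already present in the question's code.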
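【Discussion】:
-
I see that all the comments on this page need translation. Can you share a page that has both English and non-English comments?
-
When you click "More" to expand the text, the text is contained in a "p" inside a "div" with class "entry"......
-
@thebadguy This page has 5 English comments first, and the remaining 5 are in Portuguese tripadvisor.com/…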
-
@shalini ...your code runs fine on my machine... it prints the first English comment.. rather than the translated one.
-
I upvoted your question
Tags: python selenium web-scraping