【Posted】: 2019-07-15 21:52:10
【Question】:
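I am trying to use BeautifulSoup (Python 3.7) to select a specific link inside a block. How can I select a specific link within a block I have already selected?
Here is what I have so far. I have used selenium before, but I don't think it is necessary here yet.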
from bs4 import BeautifulSoup
import requests

base_url = 'http://www.shop.pr'
# Shop name -> path of its shoppers page (the duplicated 'econo' key is dropped)
shop_urls = {'econo': '/econo/shoppers',
             'pueblo': '/pueblo/shoppers',
             'costco': '/costco/shoppers'}
selected_shop = 'econo'
append_to_url = shop_urls.get(selected_shop)
url = base_url + append_to_url

page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')

# Save a pretty-printed copy of the page for inspection.
# prettify must be called; str(soup.prettify) only writes the method's repr.
with open('page.txt', 'w', encoding='utf-8') as file:
    file.write(soup.prettify())

# Narrow the search down to the breadcrumb holder inside the wrapper
wrapper = soup.find('div', {'class': 'wrapper'})
sub_wrapper = wrapper.find('div', {'class': 'breadcrumb-holder'})
print(sub_wrapper)
After digging deeper into the markup, I got this:
<div class="breadcrumb-holder">
<div data-react-class="SliderPageLink" data-react-props='{"baseLink":"/econo/shoppers/donde-mejor-se-compra-20190711/4878/product-list-view","page":1,"linkText":"VER PRODUCTOS","sliderSelector":"#shopper-terminal .catalog-view .slider","show":true,"back":false}'></div>
<ul class="breadcrumb">
<li>
<a href="/">Shoppers</a>
</li>
<li>
<a href="/econo/shoppers?clientid=1"><strong>Econo</strong>
</a></li>
</ul>
</div>
I then tried to get:
"/econo/shoppers/donde-mejor-se-compra-20190711/4878/product-list-view", but it returns None.
【Discussion】:
Tags: python html beautifulsoup screen-scraping
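In the markup printed in the question, the wanted path is not the href of any <a> tag. It appears only as the "baseLink" field of the JSON string stored in the data-react-props attribute, which may be why a search for a link comes back as None. The following is a minimal sketch of one way to read it, assuming the live page matches the markup shown in the question. The helper name extract_base_link and the inline html sample are made up for illustration and are not part of the original code.

```python
import json
from bs4 import BeautifulSoup

# Trimmed copy of the markup printed in the question (illustrative sample)
html = '''
<div class="breadcrumb-holder">
<div data-react-class="SliderPageLink" data-react-props='{"baseLink":"/econo/shoppers/donde-mejor-se-compra-20190711/4878/product-list-view","page":1,"linkText":"VER PRODUCTOS","sliderSelector":"#shopper-terminal .catalog-view .slider","show":true,"back":false}'></div>
<ul class="breadcrumb">
<li><a href="/">Shoppers</a></li>
</ul>
</div>
'''


def extract_base_link(markup):
    """Return the baseLink stored in the SliderPageLink div, or None.

    Hypothetical helper: it assumes the attribute value is valid JSON,
    as it is in the markup from the question.
    """
    soup = BeautifulSoup(markup, 'html.parser')
    holder = soup.find('div', {'class': 'breadcrumb-holder'})
    if holder is None:
        return None
    # The target div carries no class, so match on its data attribute instead
    slider = holder.find('div', {'data-react-class': 'SliderPageLink'})
    if slider is None or not slider.has_attr('data-react-props'):
        return None
    # The attribute value is a JSON object; decode it and pick one field
    props = json.loads(slider['data-react-props'])
    return props.get('baseLink')


base_link = extract_base_link(html)
print(base_link)
```

With the code from the question, the same idea would apply to sub_wrapper directly: find the div by its data-react-class attribute, then decode its data-react-props value. If the site fills this block in with JavaScript after the page loads, requests alone may not see it and a browser-driven tool such as selenium could still be needed.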