[Posted]: 2020-12-25 15:48:58
[Question]:
I have parsed my string with BeautifulSoup.
from bs4 import BeautifulSoup
import requests
import re

def otoMoto(link):
    URL = link
    page = requests.get(URL).content
    bs = BeautifulSoup(page, 'html.parser')
    for offer in bs.find_all('div', class_="offer-item__content ds-details-container"):
        # print(offer)
        # print("znacznik")
        linkOtoMoto = offer.find('a', class_="offer-title__link").get('href')
        # title = offer.find("a")
        titleOtoMoto = offer.find('a', class_="offer-title__link").get('title')
        rokProdukcji = offer.find('li', class_="ds-param").get_text().strip()
        rokPrzebPojemPali = offer.find_all('li', class_="ds-param")
        print(linkOtoMoto + " " + titleOtoMoto + " " + rokProdukcji)
        print(rokPrzebPojemPali)
        break

URL = "https://www.otomoto.pl/osobowe/bmw/seria-3/od-2016/?search%5Bfilter_float_price%3Afrom%5D=50000&search%5Bfilter_float_price%3Ato%5D=65000&search%5Bfilter_float_year%3Ato%5D=2016&search%5Bfilter_float_mileage%3Ato%5D=100000&search%5Bfilter_enum_financial_option%5D=1&search%5Border%5D=filter_float_price%3Adesc&search%5Bbrand_program_id%5D%5B0%5D=&search%5Bcountry%5D="
otoMoto(URL)
Result:
https://www.otomoto.pl/oferta/bmw-seria-3-x-drive-nowe-opony-ID6Dr4JE.html#d51bf88c70 BMW Seria 3 2016
[<li class="ds-param" data-code="year">
<span>2016 </span>
</li>, <li class="ds-param" data-code="mileage">
<span>50 000 km</span>
</li>, <li class="ds-param" data-code="engine_capacity">
<span>1 998 cm3</span>
</li>, <li class="ds-param" data-code="fuel_type">
<span>Benzyna</span>
</li>]
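So I can extract a single string, but every parameter shares the same class,
class="ds-param"
so I can't, for example, assign the production year to its own variable. If you have any ideas, please let me know :).
Have a nice day!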
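One possible approach, sketched here under assumptions rather than taken from the original post: in the output above, each `<li class="ds-param">` also carries a distinct `data-code` attribute (`year`, `mileage`, `engine_capacity`, `fuel_type`), so the values could be keyed by that attribute instead of by class. The helper name `extract_params` and the `SAMPLE_HTML` string are hypothetical; the markup is copied from the printed result and wrapped in a `<ul>` only for illustration, and the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the printed result in the question.
# The surrounding <ul> is an assumption added for illustration.
SAMPLE_HTML = """
<ul>
<li class="ds-param" data-code="year">
<span>2016 </span>
</li>
<li class="ds-param" data-code="mileage">
<span>50 000 km</span>
</li>
<li class="ds-param" data-code="engine_capacity">
<span>1 998 cm3</span>
</li>
<li class="ds-param" data-code="fuel_type">
<span>Benzyna</span>
</li>
</ul>
"""


def extract_params(offer):
    """Hypothetical helper: map each data-code value to the text of its <li>."""
    params = {}
    for li in offer.find_all('li', class_='ds-param'):
        code = li.get('data-code')  # None if the attribute is missing
        if code is not None:
            # strip=True trims whitespace around each text fragment
            params[code] = li.get_text(strip=True)
    return params


offer = BeautifulSoup(SAMPLE_HTML, 'html.parser')
params = extract_params(offer)

# Each parameter can now be assigned to its own variable.
year = params.get('year')
mileage = params.get('mileage')
engine_capacity = params.get('engine_capacity')
fuel_type = params.get('fuel_type')
print(year, mileage, engine_capacity, fuel_type)
```

Inside the loop from the question, the same helper could be called as `extract_params(offer)`. If only one value is needed, `offer.find('li', attrs={'data-code': 'year'})` should also select that single element; the `attrs` dictionary is used because `data-code` contains a hyphen and cannot be passed as a keyword argument.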
[Comments]:
Tags: html python-3.x web-scraping