【Question title】: NoneType object has no attribute "get_text" (Python)
【Posted】: 2020-10-22 09:01:10
【Question】:

I am doing some web scraping on Amazon and ran into the error mentioned in the title.

Here is my code:

import requests
from bs4 import BeautifulSoup
import smtplib


URL = 'https://www.amazon.co.uk/UGREEN-Adapter-Samsung-Oneplus-Blackview/dp/B072V9CNTK/ref=sr_1_2_sspa?keywords=otg+cable&qid=1578610622&sr=8-2-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEzRzRRUUdaR05RVlRJJmVuY3J5cHRlZElkPUEwNjExNjM4MVI4NVZaTFlYTlhGSCZlbmNyeXB0ZWRBZElkPUEwMjg1MTU0OEhROERWQTBSRFAzJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ=='

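# The standard header name is "User-Agent" (hyphenated); "User Agent" does not set the browser identity.
headers = {
    "User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36'}

page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
# soup.find() returns None when no element has the given id,
# and calling .get_text() on None raises the AttributeError from the title.
title = soup.find(id="productTitle").get_text()
price = soup.find(id="priceblock_ourprice").get_text()
# Drop surrounding whitespace and the leading currency symbol, e.g. "£6.99" -> 6.99.
converted_price = float(price.strip()[1:])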



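def check_price():
    print(soup.find(id="priceblock_ourprice").get_text())
    # Drop surrounding whitespace and the leading currency symbol, e.g. "£6.99" -> 6.99.
    converted_price = float(price.strip()[1:])
    if converted_price < 7.00:
        send_mail()  # send_mail() is not shown in the question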

【Comments】:

  • Can you try print(soup.find(id="priceblock_ourprice").get_text())?

Tags: python beautifulsoup nonetype


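【Solution 1】:

This happens because the page is loaded dynamically with JavaScript, so the element may be missing from the HTML that requests downloads; soup.find() then returns None, and calling .get_text() on None raises the error. You can use Selenium to get the rendered HTML of the page, like this: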

import time

from selenium import webdriver

URL = 'https://www.amazon.co.uk/UGREEN-Adapter-Samsung-Oneplus-Blackview/dp/B072V9CNTK/ref=sr_1_2_sspa?keywords=otg+cable&qid=1578610622&sr=8-2-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEzRzRRUUdaR05RVlRJJmVuY3J5cHRlZElkPUEwNjExNjM4MVI4NVZaTFlYTlhGSCZlbmNyeXB0ZWRBZElkPUEwMjg1MTU0OEhROERWQTBSRFAzJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ=='

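driver = webdriver.Chrome()
driver.get(URL)
time.sleep(5)  # give the page's JavaScript time to run
page = driver.page_source  # HTML after rendering
driver.quit()  # close the browser and end the WebDriver session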

Here is the complete code:

from bs4 import BeautifulSoup
from selenium import webdriver
import time

URL = 'https://www.amazon.co.uk/UGREEN-Adapter-Samsung-Oneplus-Blackview/dp/B072V9CNTK/ref=sr_1_2_sspa?keywords=otg+cable&qid=1578610622&sr=8-2-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEzRzRRUUdaR05RVlRJJmVuY3J5cHRlZElkPUEwNjExNjM4MVI4NVZaTFlYTlhGSCZlbmNyeXB0ZWRBZElkPUEwMjg1MTU0OEhROERWQTBSRFAzJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ=='

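driver = webdriver.Chrome()
driver.get(URL)
time.sleep(5)  # give the page's JavaScript time to run
page = driver.page_source  # HTML after rendering
driver.quit()  # close the browser and end the WebDriver session

# The 'html5lib' parser is a separate package (pip install html5lib).
soup = BeautifulSoup(page, 'html5lib')
title = soup.find(id="productTitle")
price = soup.find(id="priceblock_ourprice")

print(price.get_text())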

Output:

£6.99
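
The printed value is a string with a currency symbol, while the question compares the price against a number, and the lookup can still return None if the page is served with a different layout. A minimal sketch of a more defensive version of that step is below. safe_text and parse_price are hypothetical helper names, and the parsing assumes UK-style prices in which commas are thousands separators.

```python
import re
from typing import Optional


def safe_text(element) -> Optional[str]:
    """Return the stripped text of a soup.find() result, or None if nothing was found."""
    if element is None:
        return None
    return element.get_text(strip=True)


def parse_price(text: Optional[str]) -> Optional[float]:
    """Extract the first number from a price string such as '£6.99'.

    Assumption: commas are thousands separators, so they are removed first.
    Returns None when there is no text or no number in it.
    """
    if text is None:
        return None
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None
```

With these, price = parse_price(safe_text(soup.find(id="priceblock_ourprice"))) yields a float or None, so the caller can check for None before comparing against 7.00 rather than hitting the AttributeError.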
