Extracting web-page content with Beautiful Soup: answers

【Question title】: Extracting web-page content with Beautiful Soup
【Posted】: 2019-06-05 02:17:10
【Question】:

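I am trying to scrape the price of an item from an e-commerce site using Beautiful Soup with Python 3. What else do I need to do to extract the price from my first pull?

I have tried other code combinations, but I don't understand this approach very well.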

import requests
from bs4 import BeautifulSoup

page = requests.get('https://www.walmart.com/ip/GoGreen-Power-6-Outlet-Surge-Protector-16103MS-2-5-cord-White/46097919')
soup = BeautifulSoup(page.text, 'html.parser')

price_hide = soup.find(class_='price-characteristic')
print(price_hide)

wprice = price_hide.find_all(content)
print(wprice)

The first print call works:

<span class="price-characteristic" content="3.98" itemprop="price">3</span>

The second one, not so much.

I expected it to print the value of the content attribute, 3.98.

【Comments】:

Use price_hide['content'] or price_hide.get('content').
Possible duplicate of "Extracting an attribute value with beautifulsoup"

Tags: html python-3.x web-scraping beautifulsoup

【Solution 1】:

It's simpler than you think. Try the code below:

from bs4 import BeautifulSoup
import requests

page = requests.get('https://www.walmart.com/ip/GoGreen-Power-6-Outlet-Surge-Protector-16103MS-2-5-cord-White/46097919')
soup = BeautifulSoup(page.text, 'html.parser')

price_hide = soup.find(class_='price-characteristic')
print(price_hide)

# 'content' is an attribute of the tag, so read it by key rather than with find_all()
wprice = price_hide["content"]
print(wprice)

I hope this helps!
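
The live page may no longer return the same markup, so here is a minimal offline sketch of the same idea. It parses the tag printed in the question as a static string and reads the attribute both ways mentioned in the comments. The names price_tag, price_from_key, price_from_get, and the 'data-missing' attribute are made up for illustration.

```python
from bs4 import BeautifulSoup

# The tag printed in the question, used as a static stand-in for the live page.
html = '<span class="price-characteristic" content="3.98" itemprop="price">3</span>'
soup = BeautifulSoup(html, 'html.parser')

price_tag = soup.find(class_='price-characteristic')

# Subscript access raises KeyError if the attribute is absent.
price_from_key = price_tag['content']

# .get() returns None (or the supplied default) instead of raising.
price_from_get = price_tag.get('content')
missing = price_tag.get('data-missing', 'n/a')  # hypothetical attribute name

print(price_from_key, price_from_get, missing)
```

Subscript access is fine when the attribute is known to exist; .get() is the safer choice when some tags on the page might lack it.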

【Comments】:

This answer is great, and the technique worked on everything I tried until I ran into the HTML below: <div class="col-xs-12 col-sm-7"> <div class="col-xs-12 col-sm-12"> <div class="row gray-bkg marg-top10"> <div class="col-xs-7 col-sm-3 price-lg">$6.92 <span class="unit"> / ea </span></div> How do I get 6.92 from it?
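
One possible approach to the follow-up, as a sketch only: here the price is text inside the div rather than an attribute value, so attribute lookup does not apply. The snippet below assumes the fragment from the comment with the outer divs trimmed and the remaining one closed by hand, and that the price is always the first piece of text in the price-lg div. The variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Fragment from the comment, trimmed to the relevant divs and closed by hand
# (the closing tag for the outer div is an assumption).
html = '''
<div class="row gray-bkg marg-top10">
  <div class="col-xs-7 col-sm-3 price-lg">$6.92 <span class="unit"> / ea </span></div>
</div>
'''
soup = BeautifulSoup(html, 'html.parser')

# class_ matches a tag if any one of its classes equals 'price-lg'.
price_div = soup.find('div', class_='price-lg')

# stripped_strings yields the div's text pieces in document order with
# surrounding whitespace removed; the first piece is the price itself.
first_text = next(price_div.stripped_strings)

# Drop the leading currency symbol to leave just the number as a string.
price = first_text.lstrip('$')
print(price)
```

If the unit text could ever come before the price, a regular expression over price_div.get_text() would be a more robust choice than taking the first string.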