Parse html table with BeautifulSoup4 and Python 3

【Question title】: Parse html table with BeautifulSoup4 and Python 3
【Posted】: 2016-02-22 20:51:57
【Question description】:

I am trying to scrape certain financial data from Yahoo Finance. Specifically, in this case, a single revenue figure (type: double).

Here is my code:

from urllib.request import urlopen
from bs4 import BeautifulSoup
  
searchurl = "http://finance.yahoo.com/q/ks?s=AAPL"
f = urlopen(searchurl)
html = f.read()
soup = BeautifulSoup(html, "html.parser")

revenue = soup.find("div", {"class": "yfnc_tabledata1", "id":"yui_3_9_1_8_1456172462911_38"})
print (revenue)

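Inspecting the page source in Chrome shows the following:

I am trying to scrape the "234.99B" figure, strip the "B", and convert it to a decimal. Something is wrong with my "soup.find" line. Where am I going wrong?

【Question comments】:

Tags: python html parsing beautifulsoup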

【Solution 1】:

Find the td element by its text, "Revenue (ttm):", and get the next td sibling:

revenue = soup.find("td", text="Revenue (ttm):").find_next_sibling("td").text
print(revenue)

This prints 234.99B.
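The question also asks to strip the "B" and convert the value to a decimal, which the answer above does not cover. The following is a minimal sketch of that step, not part of the original answer. The helper name parse_abbreviated_number and the suffix table (K, M, B, T) are assumptions made for illustration; only the "B" suffix appears in the question.

```python
from decimal import Decimal

# Multipliers for the magnitude suffixes assumed here (only "B" appears in the question).
SUFFIX_MULTIPLIERS = {
    "K": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "B": Decimal(10) ** 9,
    "T": Decimal(10) ** 12,
}


def parse_abbreviated_number(text):
    """Convert a string such as "234.99B" into a Decimal (hypothetical helper)."""
    # Drop surrounding whitespace and thousands separators.
    cleaned = text.strip().replace(",", "")
    suffix = cleaned[-1:].upper()
    if suffix in SUFFIX_MULTIPLIERS:
        # Scale the numeric part by the suffix's multiplier.
        return Decimal(cleaned[:-1]) * SUFFIX_MULTIPLIERS[suffix]
    # No recognized suffix: treat the whole string as a plain number.
    return Decimal(cleaned)


# "234.99B" is the value scraped in the answer above.
revenue_value = parse_abbreviated_number("234.99B")
print(revenue_value)
```

Decimal is used rather than float so that the scraped digits are preserved exactly. If a float is enough, float(revenue_value) can be applied to the result.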

【Comments】: