How to get the text inside a span tag that is nested inside another tag using BeautifulSoup? (Answers)

【Question title】: How to get the text inside a span tag which is inside another tag using beautifulsoup?
【Posted】: 2018-08-15 12:41:22
【Question】:

How can I get the values of all tags that have class="no-wrap text-right circulating-supply"? This is what I am using:

text=[ ]

text=(soup.find_all(class_="no-wrap text-right circulating-supply"))

Output of text[0]:

'\n\n17,210,662\nBTC\n'

I only want to extract the numeric value.

Example of one such element:

<td class="no-wrap text-right circulating-supply" data-sort="17210662.0">
            <span data-supply="17210662.0">
             <span data-supply-container="">
              17,210,662
             </span>
             <span class="hidden-xs">
              BTC
             </span>
            </span>
           </td>

Thanks.

【Comments】:

Tags: html python-3.x web-scraping beautifulsoup

【Solution 1】:

If all the elements share a similar HTML structure, try the following to get the required output:

texts = [node.text.strip().split('\n')[0] for node in soup.find_all(class_="no-wrap text-right circulating-supply")]
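To see why the one-liner above works, here is a minimal sketch that runs it against the sample markup from the question. The `values` list and the conversion to `int` are additions for illustration and are not part of the original answer; the sketch also assumes the number is always the first non-empty line of the cell text.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (indentation removed).
html = """<td class="no-wrap text-right circulating-supply" data-sort="17210662.0">
<span data-supply="17210662.0">
<span data-supply-container="">
17,210,662
</span>
<span class="hidden-xs">
BTC
</span>
</span>
</td>"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all(class_="no-wrap text-right circulating-supply")

# .text joins every text node under the cell; strip() drops the
# surrounding whitespace, and the first line left over is the number.
texts = [node.text.strip().split('\n')[0] for node in cells]

# Illustrative extra step (an assumption): convert "17,210,662" to an integer.
values = [int(t.replace(',', '')) for t in texts]

print(texts)
print(values)
```

Note that passing the full string to `class_` matches the `class` attribute exactly, so the classes must appear in the same order as in the page source.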

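【Comments】:

I get this error: TypeError: 'NoneType' object is not callable
Are you sure text[0] returns text? I guess soup.find_all() should return something like WebElement objects. Check the updated answer.
You are right, I forgot to add .text. It works now. Thanks a lot!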
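The exact call that raised the error is not shown in the thread, so the following is only a plausible reconstruction. One common way to hit this message with BeautifulSoup is to call a string method directly on a `Tag`: an unknown attribute name on a `Tag` is treated as a child-tag lookup, which yields `None`, and calling `None` raises the `TypeError`. The markup below is a simplified, hypothetical stand-in for the real page.

```python
from bs4 import BeautifulSoup

# Simplified, hypothetical markup for illustration only.
html = '<td class="circulating-supply"><span>17,210,662</span></td>'
soup = BeautifulSoup(html, "html.parser")
cell = soup.find_all(class_="circulating-supply")[0]

# cell.strip is interpreted as "find a <strip> child tag", which is None,
# so calling it fails with: 'NoneType' object is not callable.
try:
    cell.strip()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Going through .text first gives a real string, which does have strip().
value = cell.text.strip()

print(error_message)
print(value)
```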

【Solution 2】:

This may look like overkill, but you can use a regular expression to extract the digits:

import re
from bs4 import BeautifulSoup

html = """<td class="no-wrap text-right circulating-supply" data-sort="17210662.0">
            <span data-supply="17210662.0">
            <span data-supply-container="">
            17,210,662
            </span>
            <span class="hidden-xs">
            BTC
            </span>
            </span>
        </td>"""
soup = BeautifulSoup(html, 'html.parser')
# Remove the thousands separators, then collect every run of digits in the cell text.
coin_value = [re.findall(r'(\d+)', node.text.replace(',', '')) for node in soup.find_all(class_="no-wrap text-right circulating-supply")]
print(coin_value)

Prints (Python 3):

[['17210662']]
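Neither answer targets the inner span directly, which is what the question title asks about. As a hedged alternative, the sketch below selects the nested `span` by its `data-supply-container` attribute and, separately, reads the numeric `data-sort` attribute of the cell. It assumes every matching cell carries both attributes, as the single sample in the question does; the variable names `supplies` and `sort_values` are illustrative.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (indentation removed).
html = """<td class="no-wrap text-right circulating-supply" data-sort="17210662.0">
<span data-supply="17210662.0">
<span data-supply-container="">
17,210,662
</span>
<span class="hidden-xs">
BTC
</span>
</span>
</td>"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all("td", class_="circulating-supply")

# Option 1: pick the inner span that has a data-supply-container attribute
# (True matches any value, including the empty string).
supplies = []
for cell in cells:
    span = cell.find("span", attrs={"data-supply-container": True})
    if span is not None:
        supplies.append(span.get_text(strip=True))

# Option 2: read the number from the cell's data-sort attribute
# (assumed to be present on every cell).
sort_values = [float(cell["data-sort"]) for cell in cells]

print(supplies)
print(sort_values)
```

Reading the attribute avoids parsing the formatted text at all, but it only helps if the site keeps emitting `data-sort`.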

【Comments】: