【Question Title】: Why is this find method returning None and throwing an error?
【Posted】: 2023-03-11 22:00:01
【Question】:
import requests
from bs4 import BeautifulSoup
import re
import pandas as pd

def tableData(data, attrs):
    row = []
    
    data = data.find(attrs=attrs)
    tr = data.find_all('tr')
    header = [ th.get_text(strip=True) for th in data.find_all('th') ]
    if header:
        row.append(header)
    for tr in tr[1:]:
        row.append([ td.get_text(strip=True) for td in tr.find_all('td')])
    return row

url1 = 'https://www.nfl.com/standings/league/2019/REG'
page1 = requests.get(url1)
soup1 = BeautifulSoup(page1.text, 'lxml')
table = soup1.find('table', attrs={'summary': 'Standings - Detailed View'})
# print(table)
print(tableData(table, {'summary': 'Standings - Detailed View'}))

Even in the debugger, I can see that data contains the 'Standings - Detailed View' value, but when data.find(attrs=attrs) runs, it seems to return None.

【Comments】:

    Tags: python-3.x beautifulsoup python-requests request


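    【Solution 1】:

    You already have the table: table. It contains the rows and everything else, but none of its descendants carries the 'Standings - Detailed View' attribute. find() searches only the descendants of a tag, never the tag itself, so data.find(attrs=attrs) returns None and the following data.find_all('tr') call raises an AttributeError. Go straight to the rows: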

    def tableData(data):#, attrs):
        row = []    
        #data = data.find(attrs=attrs)
        tr = data.find_all('tr')
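
To make the behaviour above concrete, here is a minimal sketch that runs without network access. The HTML string, the team names, and the helper name table_data are made up for illustration; only the summary attribute mirrors the question. It uses the built-in html.parser instead of lxml so that no extra parser is needed, and it skips header-only rows by checking for td cells rather than slicing with [1:], which is a small variation on the original function.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; the rows are invented sample data.
HTML = """
<table summary="Standings - Detailed View">
  <tr><th>Team</th><th>W</th><th>L</th></tr>
  <tr><td>Team A</td><td>14</td><td>2</td></tr>
  <tr><td>Team B</td><td>13</td><td>3</td></tr>
</table>
"""

ATTRS = {"summary": "Standings - Detailed View"}

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find("table", attrs=ATTRS)

# find() inspects only descendants, so the table never matches itself.
nested = table.find(attrs=ATTRS)


def table_data(table):
    """Return the header row (if any) followed by the data rows of a <table> tag."""
    rows = []
    header = [th.get_text(strip=True) for th in table.find_all("th")]
    if header:
        rows.append(header)
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # skip rows that contain only <th> cells
            rows.append(cells)
    return rows


rows = table_data(table)
print(nested)
print(rows)
```

In the question's script, the equivalent change at the call site would be tableData(table), with no attrs argument.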
    

    Better yet, use pandas (since you import it anyway) to extract the table as a DataFrame:

    df = pd.read_html('https://www.nfl.com/standings/league/2019/REG')[0]
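
As a rough illustration of what that call returns, the sketch below feeds pd.read_html an inline HTML string instead of the live URL. The HTML and its rows are invented sample data, and the example assumes an HTML parser backend for pandas is installed (lxml, or bs4 together with html5lib). The attrs argument is optional; it is shown here only because it lets pandas pick the table by the same summary attribute used in the question.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real page; the rows are invented sample data.
HTML = """
<table summary="Standings - Detailed View">
  <tr><th>Team</th><th>W</th><th>L</th></tr>
  <tr><td>Team A</td><td>14</td><td>2</td></tr>
  <tr><td>Team B</td><td>13</td><td>3</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per matching <table>.
tables = pd.read_html(
    StringIO(HTML),
    attrs={"summary": "Standings - Detailed View"},
)
df = tables[0]
print(df)
```

Because the result is a list, indexing with [0] picks the first matching table; a page with several tables may need a different index or a narrower attrs filter.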
    

    【Comments】:
