[Question title]: BeautifulSoup - AttributeError: 'NavigableString' object has no attribute 'find_all'
[Posted]: 2018-08-25 19:53:02
[Question description]:

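I'm trying to get this script to go through an html file and print the desired results, but it keeps giving me the error below. It works when the table contains only one "game", but it breaks as soon as there is more than one. I'm trying to fix it so that it can iterate over multiple games/parking passes, but this error is keeping me from going any further.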

Traceback (most recent call last):
  File "C:/Users/desktop/Desktop/tabletest.py", line 11, in <module>
    for rows in table.find_all('tr'):
  File "C:\Program Files\Python36\lib\site-packages\bs4\element.py", line 737, in __getattr__
    self.__class__.__name__, attr))
AttributeError: 'NavigableString' object has no attribute 'find_all'

Here is my code:

import pandas as pd
from bs4 import BeautifulSoup
import requests
import lxml.html as lh


with open("htmltabletest.html", encoding="utf-8") as f:
    data = f.read()
    soup = BeautifulSoup(data, 'lxml')
    for table in soup.find('table', attrs={'id': 'eventSearchTable'}):
        for rows in table.find_all('tr'):
            cols = table.find_all('td')

            empty = cols[0].get_text()
            eventdate = cols[1].get_text()
            eventname = cols[2].get_text()
            tickslisted = cols[3].get_text()
            pricerange = cols[4].get_text()

            entry = (empty, eventdate, eventname, tickslisted, pricerange)

            print(entry)

Here is what is in the html file:

<table class="dataTable st-alternateRows" id="eventSearchTable">
<thead>
<tr>
<th id="th-es-rb"><div class="dt-th"> </div></th>
<th id="th-es-ed"><div class="dt-th"><span class="th-divider"> </span>Event date<br/>Time (local)</div></th>
<th id="th-es-en"><div class="dt-th"><span class="th-divider"> </span>Event name<br/>Venue</div></th>
<th id="th-es-ti"><div class="dt-th"><span class="th-divider"> </span>Tickets<br/>listed</div></th>
<th id="th-es-pr"><div class="dt-th es-lastCell"><span class="th-divider"> </span>Price<br/>range</div></th>
</tr>
</thead>
<tbody class="" id="eventSearchTbody"><tr class="even" id="r-se-103577924">
<td class="nowrap"><input class="es-selectedEvent" id="se-103577924-check" name="selectEvent" type="radio"/></td>
<td class="nowrap" id="se-103577924-eventDateTime">Thu, 10/11/2018<br/>8:20 p.m.</td>
<td><div><a class="ellip" href="services/priceanalysis?eventId=103577924&amp;sectionId=0" id="se-103577924-eventName" target="_blank">Philadelphia Eagles at New York Giants</a></div><div id="se-103577924-venue">MetLife Stadium, East Rutherford, NJ</div></td>
<td id="se-103577924-nrTickets">6655</td>
<td class="es-lastCell nowrap" id="se-103577924-priceRange"><span id="se-103577924-minPrice">$134.50</span>  to<br/><span id="se-103577924-maxPrice">$2,222.50</span></td>
</tr><tr class="odd" id="r-se-103577925">
<td class="nowrap"><input class="es-selectedEvent" id="se-103577925-check" name="selectEvent" type="radio"/></td>
<td class="nowrap" id="se-103577925-eventDateTime">Thu, 10/11/2018<br/>8:21 p.m.</td>
<td><div><a class="ellip" href="services/priceanalysis?eventId=103577925&amp;sectionId=0" id="se-103577925-eventName" target="_blank">PARKING PASSES ONLY Philadelphia Eagles at New York Giants</a></div><div id="se-103577925-venue">MetLife Stadium Parking Lots, East Rutherford, NJ</div></td>
<td id="se-103577925-nrTickets">929</td>
<td class="es-lastCell nowrap" id="se-103577925-priceRange"><span id="se-103577925-minPrice">$20.39</span>  to<br/><span id="se-103577925-maxPrice">$3,602.50</span></td>
</tr></tbody>
</table>

[Question discussion]:

    Tags: python python-3.x beautifulsoup


    [Solution 1]:

    The error lies in the way you iterate over the table, more specifically in this line:

    for table in soup.find('table', attrs={'id': 'eventSearchTable'}):
    

    If you want to iterate, you should use find_all. In fact, look at the types of the values the two methods return:

    print(type(soup.find('table', attrs={'id': 'eventSearchTable'})))
    # <class 'bs4.element.Tag'>
    print(type(soup.find_all('table', attrs={'id': 'eventSearchTable'})))
    # <class 'bs4.element.ResultSet'>
    

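    In the first case you have a single table; in the second you have a set of tables (consisting of just one in your case), each of type bs4.element.Tag. Looping over a single Tag, as your code does, walks through its children, and those include plain text nodes of type NavigableString, which have no find_all method.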
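To see where the NavigableString in the traceback comes from, here is a minimal sketch that loops over the Tag returned by find and records the type of each child. The HTML string is a shortened, hypothetical stand-in for the table in the question, and the built-in "html.parser" is used instead of lxml only to keep the example free of extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened stand-in for the table in the question.
html = """<table id="eventSearchTable">
<thead><tr><th>Event name</th></tr></thead>
<tbody><tr><td>Philadelphia Eagles at New York Giants</td></tr></tbody>
</table>"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", attrs={"id": "eventSearchTable"})

# Iterating over a Tag yields its direct children. The line breaks
# between <table>, <thead> and <tbody> are children too, and they are
# NavigableString objects rather than Tag objects.
child_types = [type(child).__name__ for child in table]
print(child_types)
```

The first child is the newline right after the opening table tag, so the very first pass of the question's outer loop most likely hands a text node to table.find_all('tr').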

    So you have two options: either take the single table directly with find

    table = soup.find('table', attrs={'id': 'eventSearchTable'})
    

    or iterate over the result of find_all:

    for table in soup.find_all("table", {"id":"eventSearchTable"}):
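Putting the first option into the question's script could look like the sketch below. Two changes go beyond what the answer states and are my own assumptions about the intent: the cells are looked up per row (row.find_all rather than table.find_all), and rows with fewer than five td cells, such as the header row that only has th cells, are skipped. The HTML string is a shortened, hypothetical stand-in for htmltabletest.html, and "html.parser" replaces lxml only to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened stand-in for htmltabletest.html.
html = """<table id="eventSearchTable">
<thead><tr><th> </th><th>Event date</th><th>Event name</th>
<th>Tickets listed</th><th>Price range</th></tr></thead>
<tbody>
<tr><td><input type="radio"/></td><td>Thu, 10/11/2018<br/>8:20 p.m.</td>
<td>Philadelphia Eagles at New York Giants</td><td>6655</td>
<td>$134.50 to $2,222.50</td></tr>
<tr><td><input type="radio"/></td><td>Thu, 10/11/2018<br/>8:21 p.m.</td>
<td>PARKING PASSES ONLY Philadelphia Eagles at New York Giants</td>
<td>929</td><td>$20.39 to $3,602.50</td></tr>
</tbody>
</table>"""

soup = BeautifulSoup(html, "html.parser")

# find returns one Tag (or None), so there is nothing to loop over here.
table = soup.find("table", attrs={"id": "eventSearchTable"})

entries = []
for row in table.find_all("tr"):
    # Assumption: search the current row, not the whole table.
    cols = row.find_all("td")
    # Assumption: skip rows without five data cells (the header row).
    if len(cols) < 5:
        continue
    # Join text fragments with a space so "<br/>" does not glue words together.
    entries.append(tuple(col.get_text(" ", strip=True) for col in cols[:5]))

for entry in entries:
    print(entry)
```

With the sample above this collects one tuple per event row, so adding more rows to the table should not require any further changes to the loop.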
    
