【Posted】: 2021-12-08 18:57:08
【Question】:
I am trying to scrape a table, following the question: Python BeautifulSoup scrape tables
Based on the top answer there, I tried the following.
HTML code:
<div class="table-frame small">
<table id="rfq-display-line-items-list" class="table">
<thead id="rfq-display-line-items-header">
<tr>
<th>Mfr. Part/Item #</th>
<th>Manufacturer</th>
<th>Product/Service Name</th>
<th>Qty.</th>
<th>Unit</th>
<th>Ship Address</th>
</tr>
</thead>
<tbody id="rfq-display-line-item-0">
<tr>
<td><span class="small">43933</span></td>
<td><span class="small">Anvil International</span></td>
<td><span class="small">Cap Steel Black 1-1/2"</span></td>
<td><span class="small">800</span></td>
<td><span class="small">EA</span></td>
<td><span class="small">1</span></td>
</tr>
<!----><!---->
</tbody><tbody id="rfq-display-line-item-1">
<tr>
<td><span class="small">330035205</span></td>
<td><span class="small">Anvil International</span></td>
<td><span class="small">1-1/2" x 8" Black Steel Nipple</span></td>
<td><span class="small">400</span></td>
<td><span class="small">EA</span></td>
<td><span class="small">1</span></td>
</tr>
<!----><!---->
</tbody><!---->
</table><!---->
</div>
Following that answer,
this is what I tried:
for tr in soup.find_all('table', {'id': 'rfq-display-line-items-list'}):
    tds = tr.find_all('td')
    print(tds[0].text, tds[1].text, tds[2].text, tds[3].text, tds[4].text, tds[5].text)
But this only prints the first row:
43933 Anvil International Cap Steel Black 1-1/2" 800 EA 1
I later found that all the <td> elements end up in a single list. I want to print every row.
Expected output:
43933 Anvil International Cap Steel Black 1-1/2" 800 EA 1
330035205 Anvil International 1-1/2" x 8" Black Steel Nipple 400 EA 1
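One likely explanation, offered as a sketch rather than a confirmed answer: find_all('table', ...) matches only the single table, so the loop body runs once, and tds then holds the cells of every row (12 here), of which only the first six are printed. Looping over the <tr> elements inside the table's <tbody> sections instead gives one iteration per row. The example below uses a trimmed copy of the markup from the question and the built-in html.parser; the names html, table, and rows are illustrative and not from the original post.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (empty comments omitted).
html = """
<table id="rfq-display-line-items-list" class="table">
<thead id="rfq-display-line-items-header">
<tr><th>Mfr. Part/Item #</th><th>Manufacturer</th><th>Product/Service Name</th>
<th>Qty.</th><th>Unit</th><th>Ship Address</th></tr>
</thead>
<tbody id="rfq-display-line-item-0">
<tr>
<td><span class="small">43933</span></td>
<td><span class="small">Anvil International</span></td>
<td><span class="small">Cap Steel Black 1-1/2"</span></td>
<td><span class="small">800</span></td>
<td><span class="small">EA</span></td>
<td><span class="small">1</span></td>
</tr>
</tbody><tbody id="rfq-display-line-item-1">
<tr>
<td><span class="small">330035205</span></td>
<td><span class="small">Anvil International</span></td>
<td><span class="small">1-1/2" x 8" Black Steel Nipple</span></td>
<td><span class="small">400</span></td>
<td><span class="small">EA</span></td>
<td><span class="small">1</span></td>
</tr>
</tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Locate the one table, then walk its body rows (the header row is skipped
# because the selector only matches <tr> elements under <tbody>).
table = soup.find("table", id="rfq-display-line-items-list")

rows = []
for tr in table.select("tbody tr"):
    # Collect the stripped text of each cell in this row.
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    rows.append(cells)
    print(" ".join(cells))
```

Keeping each row as a list in rows also makes it straightforward to write the data to CSV later instead of printing it.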
【Discussion】:
Tags: python python-3.x web-scraping beautifulsoup web-crawler