【Posted】: 2015-10-08 10:47:20
【Question】:
I am learning BeautifulSoup and have a web page whose body looks like this:
HTML:
<div>
<table>
<tr>
<td>
<div>
<a name='abc'>....</a>
</div>
</td>
</tr>
</table>
</div>
<a name='pqr'>...</a>
<div>text1</div>
<div>text2</div>
<div>text3</div>
<a name='mno'>...</a>
<div>
<table>
<tr>
<td>
<div>
<a name='xyz'>....</a>
</div>
</td>
</tr>
</table>
</div>
Expected result:
<a name='pqr'>...</a>
<div>text1</div>
<div>text2</div>
<div>text3</div>
<a name='mno'>...</a>
In other words, I want to collect everything from the <a name='pqr'> tag onward, stopping before the <a name='xyz'> tag is reached.
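One possible way to get that result is sketched below. It assumes the `pqr` anchor and the elements wanted after it are siblings at the same level, as in the sample HTML, and that the built-in `html.parser` backend is acceptable. The helper name `collect_between` and its parameters are made up for illustration and are not part of BeautifulSoup. Note that the attribute filter is passed through `attrs=`, because `name` is already the first positional parameter of `find()`.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

html = """<div>
<table>
<tr>
<td>
<div>
<a name='abc'>....</a>
</div>
</td>
</tr>
</table>
</div>
<a name='pqr'>...</a>
<div>text1</div>
<div>text2</div>
<div>text3</div>
<a name='mno'>...</a>
<div>
<table>
<tr>
<td>
<div>
<a name='xyz'>....</a>
</div>
</td>
</tr>
</table>
</div>"""


def collect_between(soup, start_name, stop_name):
    """Hypothetical helper: return the start anchor plus its following
    sibling tags, stopping before the first sibling that is, or contains,
    the stop anchor."""
    start = soup.find("a", attrs={"name": start_name})
    if start is None:
        return []
    collected = [start]
    for sibling in start.next_siblings:
        if not isinstance(sibling, Tag):
            continue  # skip whitespace text nodes between the tags
        is_stop = sibling.name == "a" and sibling.get("name") == stop_name
        if is_stop or sibling.find("a", attrs={"name": stop_name}) is not None:
            break
        collected.append(sibling)
    return collected


soup = BeautifulSoup(html, "html.parser")
collected = collect_between(soup, "pqr", "xyz")
for tag in collected:
    print(tag)
```

The loop stops at the last `<div>` because it contains the `xyz` anchor, so that `<div>` and everything after it are left out. If the real page nests these elements differently, the sibling walk would need to start from a different ancestor.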
【Comments】:
Tags: python web-scraping beautifulsoup html-parsing