【Posted】: 2014-08-30 19:18:25
【Question】:
I've just started using BeautifulSoup and have run into a problem. I set up the HTML snippet below and built a BeautifulSoup object from it:
html_snippet = '<p class="course"><span class="text84">Ae 100. Research in Aerospace. </span><span class="text85">Units to be arranged in accordance with work accomplished. </span><span class="text83">Open to suitably qualified undergraduates and first-year graduate students under the direction of the staff. Credit is based on the satisfactory completion of a substantive research report, which must be approved by the Ae 100 adviser and by the option representative. </span> </p>'
subject = BeautifulSoup(html_snippet)
I have tried several find and find_all operations, shown below, but all I get back is an empty list:
subject.find(text = 'A')
subject.find(text = 'Research')
subject.next_element.find('A')
subject.find_all(text = 'A')
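When I previously created a BeautifulSoup object from an HTML file on my computer, the find and find_all operations all worked fine. However, when I pull the HTML snippet from a web page read online via urllib2, I run into this problem.
Can anyone point out where the problem is?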
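【Comments】:
- You don't have any node whose text is exactly equal to "A" or "Research"; your nodes have text whose first word is A (or a word starting with A, e.g. Ae), and another that contains Research...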
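To illustrate the comment's point, here is a minimal sketch of exact versus partial text matching. It assumes the bs4 package (BeautifulSoup 4.4 or later, where the `string` argument is available; older releases spell it `text`), uses a shortened copy of the snippet from the question, and names the parser explicitly. The variable names `exact`, `partial`, and `starts_with_a` are only for illustration.

```python
import re
from bs4 import BeautifulSoup

# Shortened version of the snippet from the question (assumption: the
# remaining spans are not needed to show the behaviour).
html_snippet = (
    '<p class="course">'
    '<span class="text84">Ae 100. Research in Aerospace. </span>'
    '<span class="text85">Units to be arranged. </span>'
    '</p>'
)
subject = BeautifulSoup(html_snippet, 'html.parser')

# A plain string must equal the node's entire text, so nothing matches.
exact = subject.find_all(string='Research')

# A compiled regular expression is searched for inside each text node.
partial = subject.find_all(string=re.compile('Research'))

# Anchoring the pattern finds text nodes that begin with "A".
starts_with_a = subject.find_all(string=re.compile('^A'))

print(exact)
print(partial)
print(starts_with_a)
```

Each match is a text node, so `partial[0].parent` gives the enclosing `<span>` tag if the element itself is what you are after.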
Tags: python beautifulsoup