[Posted]: 2016-03-09 09:41:22
[Question]:
I need to extract "Ended 7 seconds ago" from a file:
<div class="featured__columns">
<div class="featured__column"><i style="color:rgb(149,213,230);" class="fa fa-clock-o"></i> <span title="Today, 11:49am">Ended 7 seconds ago</span></div>
<div class="featured__column featured__column--width-fill text-right"><span title="March 7, 2016, 10:50am">2 days ago</span> by <a style="color:rgb(149,213,230);" href="/user/Eclipsy">Eclipsy</a></div><a href="/user/Eclipsy" class="global__image-outer-wrap global__image-outer-wrap--avatar-small">
<div class="global__image-inner-wrap" style="background-image:url(https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/dc/dc5b8424bd5d17e13dcfe613689921dfc29f4574_medium.jpg);"></div>
</a>
</div>
Here is what I tried:
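#!/usr/bin/python3
from bs4 import BeautifulSoup

# Parse the saved page; naming the parser avoids bs4's "no parser specified" warning.
with open("./source.html") as source_html:
    soup = BeautifulSoup(source_html, "html.parser")

# Collect every <span> and print the text of the first one.
spans = soup.find_all("span")
print(spans[0].string)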
This works, but I think my approach is crude. Is there a different way to extract the data?
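For comparison, here is a minimal sketch of one narrower approach: a CSS selector that targets the span sitting next to the clock icon, rather than taking the first span on the page. It assumes the "Ended ..." text always follows an `<i>` with the `fa-clock-o` class inside a `featured__column` div, as in the snippet above. The markup is inlined (and trimmed) only so the example runs on its own; the names `html` and `ended` are illustrative.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question, inlined so the sketch is self-contained.
html = """
<div class="featured__columns">
<div class="featured__column"><i class="fa fa-clock-o"></i> <span title="Today, 11:49am">Ended 7 seconds ago</span></div>
<div class="featured__column featured__column--width-fill text-right"><span title="March 7, 2016, 10:50am">2 days ago</span> by <a href="/user/Eclipsy">Eclipsy</a></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Assumption: the column that contains the clock icon is the one holding the "Ended ..." text.
# "i.fa-clock-o ~ span" matches a <span> that is a later sibling of that icon.
span = soup.select_one("div.featured__column > i.fa-clock-o ~ span")

# select_one returns None when nothing matches, so guard before reading the text.
ended = span.get_text(strip=True) if span else None
print(ended)
```

Anchoring on the icon class is a judgment call: it will not break if another span is added earlier in the page, but it will break if the site changes its icon markup.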
[Discussion]:
Tags: python python-3.x beautifulsoup