[Posted]: 2020-10-27 16:05:30
[Question]:
I recently scraped the Hacker News site (title, link, votes). Here is the code:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')
titles = soup.select('.storylink')
subtext = soup.select('.subtext')

def custom_hn(titles, subtext):
    with open('result.txt', 'w', encoding='utf-8') as f:
        for i, item in enumerate(titles):
            title = titles[i].getText()
            link = titles[i].get('href', None)
            vote = subtext[i].select('.score')
            if len(vote):
                points = vote[0].getText().replace(' points', '')
            else:
                points = '0'
            my_dict = {'Title': title, 'Link': link, 'Votes': points}
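My question is: how can I sort by votes so that the entries are written to result.txt in descending order, together with their related information such as the title and link?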
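One possible approach, sketched under assumptions: collect every story into a list of dictionaries first, then sort that list and write it out. The helper names `sort_by_votes` and `write_stories`, the tab-separated output format, and the sample data below are hypothetical and not part of the original code. The sketch assumes each `'Votes'` value is a digit-only string, as the scraping loop above would produce.

```python
def sort_by_votes(stories):
    """Return the stories ordered from most to fewest votes."""
    # Votes are scraped as strings; convert to int so '9' does not sort above '120'.
    return sorted(stories, key=lambda s: int(s['Votes']), reverse=True)


def write_stories(stories, path='result.txt'):
    """Write one story per line, highest vote count first (hypothetical format)."""
    with open(path, 'w', encoding='utf-8') as f:
        for story in sort_by_votes(stories):
            f.write(f"{story['Votes']}\t{story['Title']}\t{story['Link']}\n")


# Hypothetical sample data standing in for the scraped results.
sample = [
    {'Title': 'A', 'Link': 'https://example.com/a', 'Votes': '9'},
    {'Title': 'B', 'Link': 'https://example.com/b', 'Votes': '120'},
    {'Title': 'C', 'Link': 'https://example.com/c', 'Votes': '0'},
]
ordered = sort_by_votes(sample)
```

In the original loop, this would mean appending each `my_dict` to a list inside the `for` loop and calling something like `write_stories` once after the loop finishes, rather than writing while iterating.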
[Discussion]:
- You can use sorted(my_dict.items(), key=lambda x: x["Votes"])
- Thanks @bigbounty, I tried that, but it raises a type error: TypeError: tuple indices must be integers or slices, not str
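The error reported above can be reproduced in isolation. A likely explanation is that `dict.items()` yields `(key, value)` tuples, so `x["Votes"]` tries to index a tuple with a string. The small demonstration below uses hypothetical sample data; it suggests that the `key` function works once it is applied to a list of dictionaries rather than to the items of a single dictionary.

```python
story = {'Title': 'A', 'Link': 'https://example.com/a', 'Votes': '9'}

# .items() yields (key, value) tuples, not dictionaries.
first_item = next(iter(story.items()))

try:
    # x is a tuple such as ('Title', 'A'), so x["Votes"] fails.
    sorted(story.items(), key=lambda x: x["Votes"])
except TypeError as exc:
    error_message = str(exc)

# Sorting a list of dictionaries instead: each x is now a dict.
stories = [
    story,
    {'Title': 'B', 'Link': 'https://example.com/b', 'Votes': '120'},
]
ranked = sorted(stories, key=lambda x: int(x["Votes"]), reverse=True)
```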
Tags: python sorting dictionary web-scraping beautifulsoup