【Question title】: Sorting Dictionaries Specific Keys
【Posted】: 2020-10-27 16:05:30
【Question description】:

I recently scraped the Hacker News site (titles, links, votes). Here is the code:

import requests
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')

titles = soup.select('.storylink')
subtext = soup.select('.subtext')


def custom_hn(titles, subtext):
    with open('result.txt', 'w', encoding='utf-8') as f:
        for i, item in enumerate(titles):
            title = titles[i].getText()
            link = titles[i].get('href', None)
            vote = subtext[i].select('.score')
            if len(vote):
                points = vote[0].getText().replace(' points', '')
            else:
                points = '0'
            my_dict = {'Title': title, 'Link': link, 'Votes': points}

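My question is: how do I sort by votes so that the entries are written to result.txt in descending order, each together with its related information (title and link)?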

【Question comments】:

  • You can use sorted(my_dict.items(), key=lambda x: x["Votes"])
  • Thanks @bigbounty, I tried that, but it raises a type error: TypeError: tuple indices must be integers or slices, not str
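
The error reported in the comment above comes from the fact that my_dict.items() yields (key, value) tuples, so x["Votes"] tries to index a tuple with a string. A minimal sketch that reproduces it without any scraping, assuming hand-written sample records (the titles, links, and vote counts are made up for illustration):

```python
# Hypothetical sample record, shaped like my_dict in the question.
my_dict = {'Title': 'Example story', 'Link': 'https://example.com', 'Votes': '42'}

# .items() yields (key, value) tuples, so indexing one with a string fails.
try:
    sorted(my_dict.items(), key=lambda x: x['Votes'])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Sorting is meaningful on a list of such dicts, keyed on the vote count.
records = [
    {'Title': 'B', 'Link': 'https://example.com/b', 'Votes': '120'},
    {'Title': 'A', 'Link': 'https://example.com/a', 'Votes': '5'},
]
by_votes = sorted(records, key=lambda r: int(r['Votes']))
```

In other words, the sort has to run over a collection of records rather than over the keys of a single record.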

Tags: python sorting dictionary web-scraping beautifulsoup


【Solution 1】:
import requests, json
from bs4 import BeautifulSoup

res = requests.get('https://news.ycombinator.com/')
soup = BeautifulSoup(res.text, 'html.parser')

titles = soup.select('.storylink')
subtext = soup.select('.subtext')


def custom_hn(titles, subtext):
    data = []
    for i, item in enumerate(titles):
        title = titles[i].getText()
        link = titles[i].get('href', None)
        vote = subtext[i].select('.score')
        if len(vote):
            points = vote[0].getText().replace(' points', '')
        else:
            points = '0'
        data.append({'Title': title, 'Link': link, 'Votes': points})
    
    newlist = sorted(data, key=lambda k: int(k['Votes']))
    
    with open('result.json', 'w', encoding='utf-8') as f:
        json.dump(newlist, f)

To sort in reverse (descending) order, change the sorting line to newlist = sorted(data, key=lambda k: int(k['Votes']), reverse=True)
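
The int(...) conversion in the key matters because the scraped vote counts are strings, and strings compare character by character. A small sketch of the difference, assuming made-up vote values rather than real scraped data:

```python
# Hypothetical records; 'Votes' is kept as a string, as in the scraped data.
data = [
    {'Title': 'A', 'Votes': '9'},
    {'Title': 'B', 'Votes': '100'},
    {'Title': 'C', 'Votes': '10'},
]

# Comparing the raw strings orders them lexicographically ('9' > '100').
as_text = sorted(data, key=lambda k: k['Votes'], reverse=True)

# Converting to int gives the numeric, highest-first order.
as_numbers = sorted(data, key=lambda k: int(k['Votes']), reverse=True)

text_order = [k['Votes'] for k in as_text]        # ['9', '100', '10']
numeric_order = [k['Votes'] for k in as_numbers]  # ['100', '10', '9']
```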

【Comments】:

  • Hey @bigbounty, how can I format the result.txt file so that the new list is written to it, with a blank line after each title's set of information?
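
One possible way to get the layout asked about in this comment, offered as a sketch rather than the answerer's method: build one text block per entry and join the blocks with a blank line. The helper names format_entries and write_results are hypothetical, and the field labels are an assumption about the desired layout.

```python
def format_entries(entries):
    """Return one text block per entry, separated by a blank line."""
    blocks = []
    for entry in entries:
        blocks.append(
            f"Title: {entry['Title']}\n"
            f"Link: {entry['Link']}\n"
            f"Votes: {entry['Votes']}\n"
        )
    # Each block already ends with a newline, so joining on '\n'
    # leaves exactly one empty line between consecutive entries.
    return '\n'.join(blocks)


def write_results(entries, path='result.txt'):
    """Write the formatted entries to a UTF-8 text file."""
    with open(path, 'w', encoding='utf-8') as f:
        f.write(format_entries(entries))
```

Inside custom_hn, calling write_results(newlist) in place of the json.dump step would write the sorted list to result.txt in this layout.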