Unable to get the output as a properly formatted dictionary

【Question title】: Unable to get the output as a properly formatted dictionary
【Posted】: 2018-10-05 03:44:48
【Question】:

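I've written a scraper in Python to parse some data from a web page. My intention is to store the data in a dictionary. Instead of showing the full table, I've only tried a single tr containing the information of a single player. The data comes through, but the output is not shaped the way I want the dictionary to look. Any help to get it right would be highly appreciated.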

Here is my attempt:

import requests
from bs4 import BeautifulSoup

URL = "https://fantasy.premierleague.com/player-list/"

def get_data(link):
    res = requests.get(link,headers={"User-Agent":"Mozilla/5.0"})
    soup = BeautifulSoup(res.text,"lxml")
    data = []
    for content in soup.select("div.ism-container"):
        itmval = {}
        itmval['name'] = content.select_one("h2").text
        itmval['player_info'] = [[item.get_text(strip=True) for item in items.select("td")] for items in content.select(" table:nth-of-type(1) tr:nth-of-type(2)")]
        data.append(itmval)

    print(data)

if __name__ == '__main__':
    get_data(URL)

Output I get:

[{'name': 'Goalkeepers', 'player_info': [['De Gea', 'Man Utd', '161', '£5.9']]}]

Output I expect:

[{'name': 'Goalkeepers', 'player_info': ['De Gea', 'Man Utd', '161', '£5.9']}]

By the way, I intend to parse the full table, but I've shown a minimal portion here for your observation.

【Comments】:

itmval['player_info'] = [[item.get_text(strip=True) for item in items.select("td")] for items in content.select(" table:nth-of-type(1) tr:nth-of-type(2)")] creates a list of lists, so this output is expected. Do you want to flatten the list?
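
To make the comment above concrete, here is a minimal sketch that reproduces the shape of the result without any network access. The `rows` list is a stand-in (an assumption) for the cell texts that `items.select("td")` would yield for each matched `tr`; it is not fetched from the site.

```python
# Stand-in for the text of each <td> in every matched <tr> (assumed sample data).
rows = [["De Gea", "Man Utd", "161", "£5.9"]]

# Same shape as the comprehension in the question: the inner brackets build
# one list per row, and the outer brackets collect those per-row lists.
player_info = [[cell.strip() for cell in row] for row in rows]

print(player_info)  # [['De Gea', 'Man Utd', '161', '£5.9']]
```

Even with a single matching row, the outer comprehension still wraps that row in a list, which is where the extra pair of brackets comes from.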

Tags: python python-3.x dictionary web-scraping

【Solution 1】:

If you want to keep using a nested list comprehension, try replacing

[[item.get_text(strip=True) for item in items.select("td")] for items in content.select(" table:nth-of-type(1) tr:nth-of-type(2)")]

with

[item.get_text(strip=True) for items in content.select(" table:nth-of-type(1) tr:nth-of-type(2)") for item in items.select("td")]
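
The clause order in the flattened comprehension reads left to right like nested `for` loops. The sketch below shows that equivalence on plain lists, assuming `rows` stands in for the per-row cell texts; the second row is a made-up placeholder, not real data. `itertools.chain.from_iterable` is shown as one alternative way to flatten.

```python
from itertools import chain

rows = [
    ["De Gea", "Man Utd", "161", "£5.9"],
    ["Player B", "Team B", "0", "£0.0"],  # made-up placeholder row
]

# Flattened comprehension: outer loop first, inner loop second.
flat = [cell for row in rows for cell in row]

# The same logic written as explicit nested loops.
flat_loop = []
for row in rows:
    for cell in row:
        flat_loop.append(cell)

# Alternative from the standard library.
flat_chain = list(chain.from_iterable(rows))

print(flat == flat_loop == flat_chain)  # True
```

Note that with more than one row, every cell ends up in one list and the row boundaries are lost, which may or may not be what is wanted once the whole table is parsed.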

【Discussion】:

【Solution 2】:

player_info is equivalent to the following expression (simplified a bit):

player_info = [[item for item in items] for items in content]

content seems to contain only one item. What you want is probably something like this:

player_info = [item for item in content]

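If content has multiple items, remove the second (inner) pair of [ ... ] in the first code block instead, keeping the `for items in content` clause first so that `items` is defined before it is used.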
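
For the single-row case, another option is to take the first row directly instead of building a list around it. This is a minimal sketch on plain lists; `first_row_cells` is a hypothetical helper name, and `rows` again stands in for the per-row cell texts rather than real scraped output.

```python
def first_row_cells(rows):
    """Return the cells of the first row as a flat list, or [] if there are no rows."""
    return list(rows[0]) if rows else []


single = [["De Gea", "Man Utd", "161", "£5.9"]]

print(first_row_cells(single))  # ['De Gea', 'Man Utd', '161', '£5.9']
print(first_row_cells([]))      # []
```

The empty-list fallback avoids an `IndexError` when the selector matches nothing.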

【Discussion】: