【Question title】: Filtering JSON data in Python
【Posted】: 2023-03-25 13:21:02
【Question】:

I have a CSV JSON file of movie casts containing 5000 items. The first item looks like this:

[{
    "cast_id": 5,
    "character": "John Carter",
    "credit_id": "52fe479ac3a36847f813ea75",
    "gender": 2,
    "id": 60900,
    "name": "Taylor Kitsch",
    "order": 0
}, {
    "cast_id": 20,
    "character": "Dejah Thoris",
    "credit_id": "52fe479ac3a36847f813eab3",
    "gender": 1,
    "id": 21044,
    "name": "Lynn Collins",
    "order": 1
}, {
    "cast_id": 7,
    "character": "Sola",
    "credit_id": "52fe479ac3a36847f813ea79",
    "gender": 1,
    "id": 2206,
    "name": "Samantha Morton",
    "order": 2
}, {
    "cast_id": 3,
    "character": "Tars Tarkas",
    "credit_id": "52fe479ac3a36847f813ea6d",
    "gender": 2,
    "id": 5293,
    "name": "Willem Dafoe",
    "order": 3
}, {
    "cast_id": 8,
    "character": "Tal Hajus",
    "credit_id": "52fe479ac3a36847f813ea7d",
    "gender": 2,
    "id": 19159,
    "name": "Thomas Haden Church",
    "order": 4
}, {
    "cast_id": 2,
    "character": "Matai Shang",
    "credit_id": "52fe479ac3a36847f813ea69",
    "gender": 2,
    "id": 2983,
    "name": "Mark Strong",
    "order": 5
}, {
    "cast_id": 4,
    "character": "Tardos Mors",
    "credit_id": "52fe479ac3a36847f813ea71",
    "gender": 2,
    "id": 8785,
    "name": "Ciar\u00e1n Hinds",
    "order": 6
}, {
    "cast_id": 9,
    "character": "Sab Than",
    "credit_id": "52fe479ac3a36847f813ea81",
    "gender": 2,
    "id": 17287,
    "name": "Dominic West",
    "order": 7
}, {
    "cast_id": 10,
    "character": "Kantos Kan",
    "credit_id": "52fe479ac3a36847f813ea85",
    "gender": 2,
    "id": 17648,
    "name": "James Purefoy",
    "order": 8
}, {
    "cast_id": 11,
    "character": "Powell",
    "credit_id": "52fe479ac3a36847f813ea89",
    "gender": 2,
    "id": 17419,
    "name": "Bryan Cranston",
    "order": 9
}, {
    "cast_id": 12,
    "character": "Sarkoja",
    "credit_id": "52fe479ac3a36847f813ea8d",
    "gender": 1,
    "id": 6416,
    "name": "Polly Walker",
    "order": 10
}, {
    "cast_id": 13,
    "character": "Edgar Rice Burroughs",
    "credit_id": "52fe479ac3a36847f813ea91",
    "gender": 2,
    "id": 57675,
    "name": "Daryl Sabara",
    "order": 11
}, {
    "cast_id": 14,
    "character": "Stayman #1 / Helm",
    "credit_id": "52fe479ac3a36847f813ea95",
    "gender": 2,
    "id": 89830,
    "name": "Arkie Reece",
    "order": 12
}, {
    "cast_id": 15,
    "character": "Stayman #3",
    "credit_id": "52fe479ac3a36847f813ea99",
    "gender": 2,
    "id": 205278,
    "name": "Davood Ghadami",
    "order": 13
}, {
    "cast_id": 16,
    "character": "Lightmaster",
    "credit_id": "52fe479ac3a36847f813ea9d",
    "gender": 1,
    "id": 218345,
    "name": "Pippa Nixon",
    "order": 14
}, {
    "cast_id": 46,
    "character": "Thern #2",
    "credit_id": "584ef986c3a3682a940010d0",
    "gender": 2,
    "id": 1390394,
    "name": "James Embree",
    "order": 15
}, {
    "cast_id": 77,
    "character": "Thern #1",
    "credit_id": "58c68f82c3a3684114014f58",
    "gender": 0,
    "id": 1518112,
    "name": "Philip Philmar",
    "order": 16
}, {
    "cast_id": 47,
    "character": "Pretty Woman in NYC Doorway",
    "credit_id": "584f133992514107110024b8",
    "gender": 1,
    "id": 1721985,
    "name": "Emily Tierney",
    "order": 17
}, {
    "cast_id": 48,
    "character": "Telegraph Clerk",
    "credit_id": "584f16d192514107000026a2",
    "gender": 2,
    "id": 1721992,
    "name": "Edmund Kente",
    "order": 18
}, {
    "cast_id": 49,
    "character": "Dalton",
    "credit_id": "584f1a94c3a3682a8d0026e7",
    "gender": 2,
    "id": 118617,
    "name": "Nicholas Woodeson",
    "order": 19
}, {
    "cast_id": 50,
    "character": "Stable Boy",
    "credit_id": "584f1f2b9251410700002be9",
    "gender": 2,
    "id": 1722006,
    "name": "Kyle Agnew",
    "order": 20
}, {
    "cast_id": 51,
    "character": "Dix the Storekeeper",
    "credit_id": "584f28aec3a3683150000214",
    "gender": 2,
    "id": 130129,
    "name": "Don Stark",
    "order": 21
}, {
    "cast_id": 52,
    "character": "Rowdy #1",
    "credit_id": "58580465c3a3683150056d0c",
    "gender": 2,
    "id": 65716,
    "name": "Josh Daugherty",
    "order": 22
}, {
    "cast_id": 53,
    "character": "Rowdy #2",
    "credit_id": "58580cd89251411a4605f517",
    "gender": 2,
    "id": 1724736,
    "name": "Jared Cyr",
    "order": 23
}, {
    "cast_id": 37,
    "character": "Stockade Guard",
    "credit_id": "54e5a58d925141529c000f89",
    "gender": 2,
    "id": 62082,
    "name": "Christopher Goodman",
    "order": 24
}, {
    "cast_id": 54,
    "character": "Sarah Carter",
    "credit_id": "585823dc925141594100c816",
    "gender": 1,
    "id": 1367241,
    "name": "Amanda Clayton",
    "order": 25
}, {
    "cast_id": 170,
    "character": "Apache #1 (as Joe Billingiere)",
    "credit_id": "595ad40c9251410bfa04831e",
    "gender": 0,
    "id": 1844319,
    "name": "Joseph Billingiere",
    "order": 26
}
]

I only need the values of "name" from this file. For this item, for example, that would be:

Taylor Kitsch, Lynn Collins, Samantha Morton, Willem Dafoe, Thomas Haden Church, Mark Strong, Ciarán Hinds, Dominic West, James Purefoy, Bryan Cranston, Polly Walker

That is, the values whose key is "name".

How can I do this?

【Comments】:

  • This is not CSV, it is JSON

Tags: python arrays json


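【Solution 1】:

What you have shown is a single JSON array, not CSV. (The file extension is irrelevant to Python.)

To parse the names out of the objects in each line's JSON array: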

import json

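# Assumes every line of the file is one complete JSON array.
with open("file.txt") as f:
    for line in f:
        # Lazily pull the "name" value out of each object on this line.
        names = (x['name'] for x in json.loads(line))
        for name in names:
            print(name)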

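【Comments】:

  • Thanks for your answer! I was able to get all the names, but the problem is that they come one after another, so I can't tell which movie they belong to. Can I arrange them by movie?
  • It's not clear which part of the data contains any information about a specific movie, but you could start by replacing the inner loop here with print(list(names))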
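
Building on the suggestion in the comment above, one way to keep each movie's names together is to collect one list per line. This is only a minimal sketch: it assumes, as the answer does, that every line holds one movie's cast as a JSON array. The helper name `names_per_line` and the sample lines are made up for illustration.

```python
import json


def names_per_line(lines):
    """Return one list of cast names per non-empty line of JSON."""
    grouped = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        grouped.append([member["name"] for member in json.loads(line)])
    return grouped


# Two hypothetical lines, each standing in for one movie's cast.
sample_lines = [
    '[{"name": "Taylor Kitsch", "order": 0}, {"name": "Lynn Collins", "order": 1}]',
    '[{"name": "Willem Dafoe", "order": 3}]',
]

for index, names in enumerate(names_per_line(sample_lines)):
    print(index, names)
```

With a real file, the open file object can be passed in place of `sample_lines`, since iterating over it yields lines in the same way.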
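【Solution 2】:

Something like this should work. The csv library turns each row into a list that you can iterate over, and the ast library converts the string into a dict/JSON object that you can index by key.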

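import ast
import csv

names = []

with open('csvFile.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        # Assumption: no header row, and the cast list is in the first column.
        cast = ast.literal_eval(row[0])
        # Keep one list of names per row.
        names.append([member['name'] for member in cast])
        # or you could print the names here.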

【Comments】:

  • Did you try this with a CSV file containing the line given in the question?
【Solution 3】:

Based on what you have provided, here is a very basic example:

>>> movie =  [{"cast_id": 5, "character": "John Carter", "credit_id": "52fe479ac3a36847f813ea75", "gender": 2, "id": 60900, "name": "Taylor Kitsch", "order": 0}, {"cast_id": 20, "character": "Dejah Thoris", "credit_id": "52fe479ac3a36847f813eab3", "gender": 1, "id": 21044, "name": "Lynn Collins", "order": 1}]
>>> for i in movie:
...     print(i["name"])
...
Taylor Kitsch
Lynn Collins

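Basically, this iterates over the dictionaries in the list and extracts the value associated with the "name" key.

More information is needed to do this for the whole CSV file, but this may help you get started.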
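
To get the comma-separated output shown in the question, the same loop can be written as a list comprehension and joined. The sketch below assumes the cast is available as raw JSON text; `raw` is a shortened copy of the data from the question, and the variable names are illustrative.

```python
import json

# A shortened copy of the cast from the question, as raw JSON text.
# The raw string keeps the \u00e1 escape for json.loads to decode.
raw = r'''[
    {"cast_id": 5, "character": "John Carter", "name": "Taylor Kitsch", "order": 0},
    {"cast_id": 20, "character": "Dejah Thoris", "name": "Lynn Collins", "order": 1},
    {"cast_id": 4, "character": "Tardos Mors", "name": "Ciar\u00e1n Hinds", "order": 6}
]'''

cast = json.loads(raw)

# Collect the "name" value of every dictionary in the list.
names = [member["name"] for member in cast]

# Join the names into a single comma-separated string.
joined = ", ".join(names)
print(joined)
```

Note that `json.loads` decodes the `\u00e1` escape, so the name comes out as "Ciarán Hinds".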

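【Comments】:

  • I have 5000 rows like this in an Excel file. How can I get the actor names for all 5000 rows?
  • Have you read the CSV file into Python yet? If not, I encourage you to look at the csv module for Python; more specifically, csv.reader and the csv.DictReader class.
  • I was just going off what the OP provided. I don't have enough karma to comment on the OP for clarification... but you're right, looking at it now, it is a JSON file, not a CSV.
  • I have a CSV file with multiple columns, from which I extracted the single cast column. That column has 5000 rows in JSON format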
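
Given the last comment, one possible approach is to read the original multi-column CSV with csv.DictReader and parse the cast cell of each row with json.loads. This is a sketch under assumptions: the column names "title" and "cast" and the sample rows are hypothetical, since the question does not show the file's header.

```python
import csv
import io
import json

# Hypothetical CSV with a "title" column and a "cast" column of JSON text.
# Inside a quoted CSV field, double quotes are escaped by doubling them.
csv_text = (
    'title,cast\n'
    'Movie A,"[{""name"": ""Taylor Kitsch""}, {""name"": ""Lynn Collins""}]"\n'
    'Movie B,"[{""name"": ""Willem Dafoe""}]"\n'
)

names_by_title = {}

# io.StringIO stands in for open("movies.csv", newline="") in this sketch.
reader = csv.DictReader(io.StringIO(csv_text))
for row in reader:
    cast = json.loads(row["cast"])  # one JSON array per row
    names_by_title[row["title"]] = [member["name"] for member in cast]

for title, names in names_by_title.items():
    print(title, "->", ", ".join(names))
```

Keying the result by another column keeps each movie's names separate, which addresses the grouping problem raised earlier in the thread.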
【Solution 4】:

You can make the request directly with a filter

{'$filter': ['name eq Lynn Collins']}
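
The answer does not say which service would accept this `$filter` parameter, so it cannot be reproduced here. If the data is already loaded in Python, a comparable name filter can be written locally. The sketch below assumes the cast is a list of dicts as in the question; `filter_by_name` is a hypothetical helper, not part of any library.

```python
def filter_by_name(cast, wanted):
    """Return the cast entries whose "name" equals `wanted`."""
    # .get() avoids a KeyError if an entry has no "name" key.
    return [member for member in cast if member.get("name") == wanted]


# Two entries taken from the question's data, trimmed for brevity.
cast = [
    {"cast_id": 5, "character": "John Carter", "name": "Taylor Kitsch"},
    {"cast_id": 20, "character": "Dejah Thoris", "name": "Lynn Collins"},
]

matches = filter_by_name(cast, "Lynn Collins")
print(matches)
```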

【Comments】:

    【Solution 5】:
    my_list = [{
        "cast_id": 5,
        "character": "John Carter",
        "credit_id": "52fe479ac3a36847f813ea75",
        "gender": 2,
        "id": 60900,
        "name": "Taylor Kitsch",
        "order": 0
    }, {
        "cast_id": 20,
        "character": "Dejah Thoris",
        "credit_id": "52fe479ac3a36847f813eab3",
        "gender": 1,
        "id": 21044,
        "name": "Lynn Collins",
        "order": 1
    }, {
        "cast_id": 7,
        "character": "Sola",
        "credit_id": "52fe479ac3a36847f813ea79",
        "gender": 1,
        "id": 2206,
        "name": "Samantha Morton",
        "order": 2
    }, {
        "cast_id": 3,
        "character": "Tars Tarkas",
        "credit_id": "52fe479ac3a36847f813ea6d",
        "gender": 2,
        "id": 5293,
        "name": "Willem Dafoe",
        "order": 3
    }, {
        "cast_id": 8,
        "character": "Tal Hajus",
        "credit_id": "52fe479ac3a36847f813ea7d",
        "gender": 2,
        "id": 19159,
        "name": "Thomas Haden Church",
        "order": 4
    }, {
        "cast_id": 2,
        "character": "Matai Shang",
        "credit_id": "52fe479ac3a36847f813ea69",
        "gender": 2,
        "id": 2983,
        "name": "Mark Strong",
        "order": 5
    }, {
        "cast_id": 4,
        "character": "Tardos Mors",
        "credit_id": "52fe479ac3a36847f813ea71",
        "gender": 2,
        "id": 8785,
        "name": "Ciar\u00e1n Hinds",
        "order": 6
    }, {
        "cast_id": 9,
        "character": "Sab Than",
        "credit_id": "52fe479ac3a36847f813ea81",
        "gender": 2,
        "id": 17287,
        "name": "Dominic West",
        "order": 7
    }, {
        "cast_id": 10,
        "character": "Kantos Kan",
        "credit_id": "52fe479ac3a36847f813ea85",
        "gender": 2,
        "id": 17648,
        "name": "James Purefoy",
        "order": 8
    }, {
        "cast_id": 11,
        "character": "Powell",
        "credit_id": "52fe479ac3a36847f813ea89",
        "gender": 2,
        "id": 17419,
        "name": "Bryan Cranston",
        "order": 9
    }, {
        "cast_id": 12,
        "character": "Sarkoja",
        "credit_id": "52fe479ac3a36847f813ea8d",
        "gender": 1,
        "id": 6416,
        "name": "Polly Walker",
        "order": 10
    }, {
        "cast_id": 13,
        "character": "Edgar Rice Burroughs",
        "credit_id": "52fe479ac3a36847f813ea91",
        "gender": 2,
        "id": 57675,
        "name": "Daryl Sabara",
        "order": 11
    }, {
        "cast_id": 14,
        "character": "Stayman #1 / Helm",
        "credit_id": "52fe479ac3a36847f813ea95",
        "gender": 2,
        "id": 89830,
        "name": "Arkie Reece",
        "order": 12
    }, {
        "cast_id": 15,
        "character": "Stayman #3",
        "credit_id": "52fe479ac3a36847f813ea99",
        "gender": 2,
        "id": 205278,
        "name": "Davood Ghadami",
        "order": 13
    }, {
        "cast_id": 16,
        "character": "Lightmaster",
        "credit_id": "52fe479ac3a36847f813ea9d",
        "gender": 1,
        "id": 218345,
        "name": "Pippa Nixon",
        "order": 14
    }, {
        "cast_id": 46,
        "character": "Thern #2",
        "credit_id": "584ef986c3a3682a940010d0",
        "gender": 2,
        "id": 1390394,
        "name": "James Embree",
        "order": 15
    }, {
        "cast_id": 77,
        "character": "Thern #1",
        "credit_id": "58c68f82c3a3684114014f58",
        "gender": 0,
        "id": 1518112,
        "name": "Philip Philmar",
        "order": 16
    }, {
        "cast_id": 47,
        "character": "Pretty Woman in NYC Doorway",
        "credit_id": "584f133992514107110024b8",
        "gender": 1,
        "id": 1721985,
        "name": "Emily Tierney",
        "order": 17
    }, {
        "cast_id": 48,
        "character": "Telegraph Clerk",
        "credit_id": "584f16d192514107000026a2",
        "gender": 2,
        "id": 1721992,
        "name": "Edmund Kente",
        "order": 18
    }, {
        "cast_id": 49,
        "character": "Dalton",
        "credit_id": "584f1a94c3a3682a8d0026e7",
        "gender": 2,
        "id": 118617,
        "name": "Nicholas Woodeson",
        "order": 19
    }, {
        "cast_id": 50,
        "character": "Stable Boy",
        "credit_id": "584f1f2b9251410700002be9",
        "gender": 2,
        "id": 1722006,
        "name": "Kyle Agnew",
        "order": 20
    }, {
        "cast_id": 51,
        "character": "Dix the Storekeeper",
        "credit_id": "584f28aec3a3683150000214",
        "gender": 2,
        "id": 130129,
        "name": "Don Stark",
        "order": 21
    }, {
        "cast_id": 52,
        "character": "Rowdy #1",
        "credit_id": "58580465c3a3683150056d0c",
        "gender": 2,
        "id": 65716,
        "name": "Josh Daugherty",
        "order": 22
    }, {
        "cast_id": 53,
        "character": "Rowdy #2",
        "credit_id": "58580cd89251411a4605f517",
        "gender": 2,
        "id": 1724736,
        "name": "Jared Cyr",
        "order": 23
    }, {
        "cast_id": 37,
        "character": "Stockade Guard",
        "credit_id": "54e5a58d925141529c000f89",
        "gender": 2,
        "id": 62082,
        "name": "Christopher Goodman",
        "order": 24
    }, {
        "cast_id": 54,
        "character": "Sarah Carter",
        "credit_id": "585823dc925141594100c816",
        "gender": 1,
        "id": 1367241,
        "name": "Amanda Clayton",
        "order": 25
    }, {
        "cast_id": 170,
        "character": "Apache #1 (as Joe Billingiere)",
        "credit_id": "595ad40c9251410bfa04831e",
        "gender": 0,
        "id": 1844319,
        "name": "Joseph Billingiere",
        "order": 26
    }
    ]
    
    my_names = [temp_dict['name'] for temp_dict in my_list]
    

    【Comments】:
