[Question title]: Nested JSON from CSV
[Posted]: 2016-03-21 08:44:24
[Question]:

I want to create a nested JSON from this CSV file (it is just a snippet):

    Datum,Position,Herkunft,Entscheidungen insgesamt,Insgesamt_monat,Asylberechtigt,Asylberechtigt monat,Asylberechtigt Prozent,Flüchtling,Flüchtling monat,Flüchting Prozent,Gewährung von subisdiärem Schutz,Gewährung monat,Prozent,Abschiebungsverbot,Abschiebungsverbot monat,Prozent,Unbegrenzte Ablehnungen,Unbegrenzte Ablehnungen monat,Prozent,Ablehnung,Ablehnung monat,Prozent,sonstige Verfahrenserledigungen,,Prozent
    2015-10-01,4,Afghanistan,4540,483,37,1,0.8,1188,139,26.2,234,33,5.2,516,61,11.4,538,63,11.9,29,3,0.6,1998,183,44
    2015-09-01,4,Afghanistan,4057,397,36,8,0.9,1049,127,25.9,201,29,5,455,46,11.2,475,22,11.7,26,3,0.6,1815,162,44.7
    2015-08-01,5,Afghanistan,3660,320,28,1,0.8,922,155,25.2,172,12,4.7,409,43,11.2,453,22,12.4,23,2,0.6,1653,85,45.2
    2015-07-01,6,Afghanistan,3340,429,27,4,0.8,767,84,23,160,28,4.8,366,53,11,431,54,12.9,21,2,0.6,1568,204,46.9
    2015-06-01,6,Afghanistan,2911,639,23,2,0.8,683,184,23.5,132,41,4.5,313,64,10.8,377,74,13,19,3,0.7,1364,271,46.9
    2015-05-01,6,Afghanistan,2272,434,21,0,0.9,499,115,22,91,16,4,249,47,11,303,42,13.3,16,1,0.7,1093,213,48.1
    2015-04-01,6,Afghanistan,1838,462,21,4,1.1,384,75,20.9,75,17,4.1,202,44,11,261,60,14.2,15,4,0.8,880,258,47.9
    2015-03-01,5,Afghanistan,1376,527,17,8,1.2,309,123,22.5,58,42,4.2,158,58,11.5,201,70,14.6,11,1,0.8,622,225,45.2
    2015-02-01,5,Afghanistan,849,431,9,9,1.1,186,81,21.9,16,12,1.9,100,42,11.8,131,65,15.4,10,4,1.2,397,218,46.8
    2015-01-01,5,Afghanistan,418,418,0,0,0,105,105,25.1,4,4,1,58,58,13.9,66,66,15.8,6,6,1.4,179,179,42.8
    2015-10-01,2,Albanien,28011,7107,0,0,0,7,4,0,23,7,0.1,18,1,0.1,864,164,3.1,24688,6250,88.1,2411,681,8.6
    2015-09-01,2,Albanien,20904,7326,0,0,0,3,0,0,16,3,0.1,17,6,0.1,700,153,3.3,18438,6657,88.2,1730,507,8.3
    2015-08-01,2,Albanien,13578,3955,0,0,0,3,0,0,13,0,0.1,11,0,0.1,547,124,4,11781,3630,86.8,1223,201,9
    2015-07-01,3,Albanien,9623,4673,0,0,0,3,0,0,13,2,0.1,11,4,0.1,423,164,4.4,8151,4275,84.7,1022,228,10.6
    2015-06-01,3,Albanien,4950,2099,0,0,0,3,0,0.1,11,8,0.2,7,0,0.1,259,75,5.2,3876,1807,78.3,794,209,16
    2015-05-01,3,Albanien,2851,1210,0,0,0,3,0,0.1,3,3,0.1,7,0,0.2,184,52,6.5,2069,1001,72.6,585,154,20.5
    2015-04-01,3,Albanien,1641,799,0,0,0,3,0,0.2,0,0,0,7,1,0.4,132,49,8,1068,581,65.1,431,168,26.3
    2015-03-01,3,Albanien,842,331,0,0,0,3,1,0.4,0,0,0,6,3,0.7,83,12,9.9,487,212,57.8,263,103,31.2
    2015-02-01,4,Albanien,511,233,0,0,0,2,2,0.4,0,0,0,3,3,0.6,71,13,13.9,275,127,53.8,160,88,31.3
    2015-01-01,4,Albanien,278,278,0,0,0,0,0,0,0,0,0,0,0,0,58,58,20.9,148,148,53.2,72,72,25.9
    2015-05-01,10,Bosnien und Herzegowina,1822,227,0,0,0,1,0,0.1,0,0,0,5,2,0.3,12,0,0.7,1538,165,84.4,266,60,14.6
    2015-04-01,9,Bosnien und Herzegowina,1595,206,0,0,0,1,0,0.1,0,0,0,3,0,0.2,12,1,0.8,1373,166,86.1,206,39,12.9
    2015-03-01,9,Bosnien und Herzegowina,1389,341,0,0,0,1,0,0.1,0,0,0,3,1,0.2,11,4,0.8,1207,276,86.9,167,60,12
    2015-02-01,10,Bosnien und Herzegowina,1048,1048,0,0,0,1,1,0.1,0,0,0,2,2,0.2,7,7,0.7,931,931,88.8,107,107,10.2
    2015-10-01,7,Eritrea,5031,1153,16,2,0.3,3979,1070,79.1,326,30,6.5,19,5,0.4,23,2,0.5,5,1,0.1,663,43,13.2
    2015-09-01,8,Eritrea,3878,702,14,1,0.4,2909,519,75,296,148,7.6,14,0,0.4,21,1,0.5,4,1,0.1,620,32,16
    2015-08-01,8,Eritrea,3176,527,13,1,0.4,2390,505,75.3,148,7,4.7,14,2,0.4,20,0,0.6,3,-1,0.1,588,13,18.5
    2015-07-01,8,Eritrea,2649,542,12,2,0.5,1885,492,71.2,141,10,5.3,12,2,0.5,20,5,0.8,4,0,0.2,575,31,21.7
    2015-10-01,10,Ungekl√§rt,2987,455,30,1,1,2249,441,75.3,2,0,0.1,2,0,0.1,27,0,0.9,268,33,9,409,-20,13.7
    2015-09-01,10,Ungekl√§rt,2532,2147,29,22,1.1,1808,1503,71.4,2,2,0.1,2,2,0.1,27,23,1.1,235,206,9.3,429,389,16.9
    2015-01-01,9,Ungekl√§rt,385,385,7,7,1.8,305,305,79.2,0,0,0,0,0,0,4,4,1,29,29,7.5,40,40,10.4

in this form:

        "Irak": {}, 
"Mazedonien": {}, 
"Serbien": {}, 
"Ungekl\u221a\u00a7rt": {
    "Insgesamt_monat": [
        "455", 
        "455", 
        "2147", 
        "385"
    ], 
    "Position": [
        "10", 
        "10", 
        "10", 
        "9"
    ], 
    "Entscheidungen insgesamt": [
        "2987", 
        "2987", 
        "2532", 
        "385"
    ], 
    "Datum": [
        "2015-10-01", 
        "2015-10-01", 
        "2015-09-01", 
        "2015-01-01"
    ], 
    "Asylberechtigt": [
        "30", 
        "30", 
        "29", 
        "7"
    ]
}, 
"Albanien": {}, 
"Afghanistan": {}, 
"Kosovo": {}, 
"Summe 1 bis 10": {}, 
"Syrien,Arabische Republik": {}, 
"Eritrea": {}, 
"Bosnien und Herzegowina": {}, 
"Summe gesamt": {}, 
"Pakistan": {}, 
"Nigeria": {}, 
"Somalia": {}

This is my code:

import csv
import json

output = {}
country =  { "Datum": [], "Position": [], "Entscheidungen insgesamt": [], "Insgesamt_monat": [], "Asylberechtigt": [] }
lastCountry = ""

with open('test.csv') as csv_file:
    for row in csv.DictReader(csv_file):

        country['Datum'].append(row['Datum'])
        country['Position'].append(row['Position'])
        country['Entscheidungen insgesamt'].append(row['Entscheidungen insgesamt'])
        country['Insgesamt_monat'].append(row['Insgesamt_monat'])
        country['Asylberechtigt'].append(row['Asylberechtigt'])

        if output.has_key(row['Herkunft']):
            output[row['Herkunft']].update(country)
        else:
            country.clear()
            country = {"Datum": [row['Datum']], "Position": [row['Position']], "Entscheidungen insgesamt": [row['Entscheidungen insgesamt']], "Insgesamt_monat": [row['Insgesamt_monat']], "Asylberechtigt": [row['Asylberechtigt']] }
            output[row['Herkunft']] = country

    print(json.dumps(output, indent=4))
#    with open('data.txt', 'w') as outfile:

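As you can see, all countries except one get no data from the CSV. What is wrong? And how do I export the JSON? At the moment I am copying the printed output into my text editor.

[Comments]:

  • What exactly is the problem?
  • Are you allowed to use external modules? I would suggest pandas, which provides to_json and might be a good fit. To export a dict as JSON, have a look at Python's json module and basic file I/O.
  • @RickyA The problem is: why does the JSON output contain data for only one 'Herkunft' ("Ungekl\u221a\u00a7rt"), while the other countries have no entries?
  • @albert How would I know whether I am allowed? I will give pandas a try.
  • Depending on your working environment, task, and so on, there may be restrictions on non-standard modules for various reasons (software security, system maintenance, etc.). If you have no such restrictions, try pandas; it is very powerful when dealing with large amounts of data...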
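
For the json-module route mentioned in the comments, a minimal sketch of writing a dict to a file and reading it back could look like this. It assumes Python 3; the sample dict and the file name `data.json` (placed in a temporary directory) are placeholders, not taken from the question.

```python
import json
import os
import tempfile

# Placeholder data in the nested shape the question asks for.
output = {"Ungeklärt": {"Datum": ["2015-10-01"], "Position": ["10"]}}

# Placeholder location: a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "data.json")

# json.dump serializes straight into the open file object.
# ensure_ascii=False writes non-ASCII characters as-is instead of \uXXXX escapes.
with open(path, "w", encoding="utf-8") as outfile:
    json.dump(output, outfile, indent=4, ensure_ascii=False)

# Reading the file back yields an equal dict.
with open(path, encoding="utf-8") as infile:
    restored = json.load(infile)

print(restored == output)  # True
```

This replaces the copy-from-terminal step; `json.dumps` plus `outfile.write` would work just as well.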

Tags: python json csv


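[Solution 1]:

The problem in your code is in the else clause. What you do there:

  1. Reset country, which throws away the row you have just appended
  2. Then update output, at which point your country is already empty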
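
It may also help to see why the other countries end up as `{}`: `dict.clear()` empties the dictionary object in place, so an entry in `output` that still refers to the same object is emptied along with it. A minimal sketch of that effect, with made-up sample values and assuming Python 3:

```python
# Minimal illustration of the aliasing effect; the values are made-up samples.
output = {}
country = {"Datum": ["2015-10-01"], "Position": ["4"]}

# Store the dict under a key. No copy is made: both names refer to one object.
output["Afghanistan"] = country

# clear() mutates that shared object in place ...
country.clear()
# ... so the entry already stored in output is now empty as well.
print(output)  # {'Afghanistan': {}}

# Rebinding the name to a brand-new dict leaves stored entries untouched.
country = {"Datum": ["2015-10-01"], "Position": ["2"]}
output["Albanien"] = country
country = {"Datum": [], "Position": []}  # new object, nothing is emptied
print(output["Albanien"])  # {'Datum': ['2015-10-01'], 'Position': ['2']}
```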

What you need to do instead:

  1. Add country to output
  2. Reset country
  3. Then update country with the current row

The order matters.

Here is the code:

import csv
import json

output = {}
country = {}

with open('test.csv') as csv_file:
    for row in csv.DictReader(csv_file):
        # A new country starts here (this relies on each country's rows being adjacent in the CSV).
        if row['Herkunft'] not in output:
            # Temporary entry (still the previous dict); replaced below once the row is appended.
            output[row['Herkunft']] = country
            # Rebind country to fresh lists instead of clearing the stored dict.
            country = {"Datum": [], "Position": [], "Entscheidungen insgesamt": [], "Insgesamt_monat": [], "Asylberechtigt": [] }

        # Append the current row to the lists of the current country.
        country['Datum'].append(row['Datum'])
        country['Position'].append(row['Position'])
        country['Entscheidungen insgesamt'].append(row['Entscheidungen insgesamt'])
        country['Insgesamt_monat'].append(row['Insgesamt_monat'])
        country['Asylberechtigt'].append(row['Asylberechtigt'])
        output[row['Herkunft']] = country

    output[row['Herkunft']] = country  # Catch the last country
    print(json.dumps(output, indent=4))

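[Comments]:

[Solution 2]:

Your indentation is wrong. Right now you open outfile and write to it for every country, so each country overwrites the output of the previous one. [Edit]: More problems. You are using the country dict in a strange way there. Here is a better version.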

    import csv
    import json
    
    output = {}
    
    with open('test.csv') as csv_file:
        for row in csv.DictReader(csv_file):
            if row['Herkunft'] in output:
                country = output[row['Herkunft']]
            else:
                country = { "Datum": [], "Position": [], "Entscheidungen insgesamt": [], "Insgesamt_monat": [], "Asylberechtigt": [] }
                output[row['Herkunft']] = country
            country['Datum'].append(row['Datum'])
            country['Position'].append(row['Position'])
            country['Entscheidungen insgesamt'].append(row['Entscheidungen insgesamt'])
            country['Insgesamt_monat'].append(row['Insgesamt_monat'])
            country['Asylberechtigt'].append(row['Asylberechtigt'])
    
    print(json.dumps(output, indent=4))
    with open('data.txt', 'w') as outfile:
        outfile.write(json.dumps(output, indent=4))
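
The same grouping can be written a little more compactly with `collections.defaultdict`, which creates the per-country dict on first access. This is only a sketch for Python 3: the function name `group_by_country` and its `key` and `fields` parameters are made up for illustration, and the column names are taken from the CSV header in the question.

```python
import csv
import io
import json
from collections import defaultdict

# Columns to collect per country (names from the question's CSV header).
FIELDS = ["Datum", "Position", "Entscheidungen insgesamt",
          "Insgesamt_monat", "Asylberechtigt"]


def group_by_country(csv_file, key="Herkunft", fields=FIELDS):
    """Collect the chosen columns into per-country lists (hypothetical helper)."""
    # A missing country is created on first access with one empty list per field.
    output = defaultdict(lambda: {field: [] for field in fields})
    for row in csv.DictReader(csv_file):
        country = output[row[key]]
        for field in fields:
            country[field].append(row[field])
    return dict(output)


# Tiny in-memory sample; the rows of a country need not be adjacent.
sample = io.StringIO(
    "Datum,Position,Herkunft,Entscheidungen insgesamt,Insgesamt_monat,Asylberechtigt\n"
    "2015-10-01,4,Afghanistan,4540,483,37\n"
    "2015-10-01,2,Albanien,28011,7107,0\n"
    "2015-09-01,4,Afghanistan,4057,397,36\n"
)
result = group_by_country(sample)
print(json.dumps(result, indent=4))
```

With a real file, something like `with open('test.csv', newline='') as f: result = group_by_country(f)` should behave the same way. Because every row is looked up by its own country, the result does not depend on the rows being sorted.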
    

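[Comments]:

  • Thanks. We can export the JSON now. But all countries except the last one ("Ungekl\u221a\u00a7rt") have no entries in the JSON and come back as an empty object, although they should look like "Ungekl\u221a\u00a7rt".
  • Did you run my updated code? I also changed how the country dict is handled, which is what caused the error you describe.
  • You could have refreshed the question before accepting a later answer with the same solution...
  • Sorry, I didn't see it.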