【Question title】:Looping through a list trying to create a dict
【Posted】:2021-06-27 17:07:51
【Question】:

I have a list of sample data:

res = { 'results': [
{'consumption': 0.025, 'interval_start': '2021-06-27T00:00:00+01:00', 'interval_end': '2021-06-27T00:30:00+01:00'},
{'consumption': 0.043, 'interval_start': '2021-06-26T23:30:00+01:00', 'interval_end': '2021-06-27T00:00:00+01:00'},
{'consumption': 0.053, 'interval_start': '2021-06-26T23:00:00+01:00', 'interval_end': '2021-06-26T23:30:00+01:00'},
{'consumption': 0.056, 'interval_start': '2021-06-26T22:30:00+01:00', 'interval_end': '2021-06-26T23:00:00+01:00'},
{'consumption': 0.031, 'interval_start': '2021-06-26T22:00:00+01:00', 'interval_end': '2021-06-26T22:30:00+01:00'},
{'consumption': 0.129, 'interval_start': '2021-06-26T21:30:00+01:00', 'interval_end': '2021-06-26T22:00:00+01:00'},
{'consumption': 0.19,  'interval_start': '2021-06-26T21:00:00+01:00', 'interval_end': '2021-06-26T21:30:00+01:00'},
{'consumption': 0.164, 'interval_start': '2021-06-26T20:30:00+01:00', 'interval_end': '2021-06-26T21:00:00+01:00'},
{'consumption': 0.145, 'interval_start': '2021-06-26T20:00:00+01:00', 'interval_end': '2021-06-26T20:30:00+01:00'},
{'consumption': 0.213, 'interval_start': '2021-06-26T19:30:00+01:00', 'interval_end': '2021-06-26T20:00:00+01:00'},
{'consumption': 0.167, 'interval_start': '2021-06-26T19:00:00+01:00', 'interval_end': '2021-06-26T19:30:00+01:00'},
{'consumption': 0.333, 'interval_start': '2021-06-26T18:30:00+01:00', 'interval_end': '2021-06-26T19:00:00+01:00'},
{'consumption': 0.133, 'interval_start': '2021-06-26T18:00:00+01:00', 'interval_end': '2021-06-26T18:30:00+01:00'},
{'consumption': 0.211, 'interval_start': '2021-06-26T17:30:00+01:00', 'interval_end': '2021-06-26T18:00:00+01:00'},
{'consumption': 0.135, 'interval_start': '2021-06-26T17:00:00+01:00', 'interval_end': '2021-06-26T17:30:00+01:00'},
{'consumption': 0.158, 'interval_start': '2021-06-26T16:30:00+01:00', 'interval_end': '2021-06-26T17:00:00+01:00'},
{'consumption': 0.073, 'interval_start': '2021-06-26T16:00:00+01:00', 'interval_end': '2021-06-26T16:30:00+01:00'},
{'consumption': 0.077, 'interval_start': '2021-06-26T15:30:00+01:00', 'interval_end': '2021-06-26T16:00:00+01:00'},
{'consumption': 0.125, 'interval_start': '2021-06-26T15:00:00+01:00', 'interval_end': '2021-06-26T15:30:00+01:00'},
{'consumption': 0.201, 'interval_start': '2021-06-26T14:30:00+01:00', 'interval_end': '2021-06-26T15:00:00+01:00'},
{'consumption': 0.043, 'interval_start': '2021-06-26T14:00:00+01:00', 'interval_end': '2021-06-26T14:30:00+01:00'},
] }

What I want to do is loop over the data above and build a dictionary data structure. An example of what I'm trying to create:

{
  "2021": {
    "06": {
      "01": [
        {
          "interval_start": "23:00",
          "interval_end": "23:30",
          "consumption": "0.021"
        },
        {
          "interval_start": "22:30",
          "interval_end": "23:00",
          "consumption": "0.021"
        }
      ],
      "02": [
        {
          "interval_start": "23:00",
          "interval_end": "23:30",
          "consumption": "0.021"
        },
        {
          "interval_start": "22:30",
          "interval_end": "23:00",
          "consumption": "0.021"
        }
      ]
    }
  }
}

The code I wrote for this is:

main_obj = {}
for i in res['results']:    
    
    date = i["interval_start"].split("T")[0].split("-")

    
    insert_obj = {
        "interval_start" : i['interval_start'],
        "interval_end": i["interval_end"],
        "consumption": i["consumption"]
        
    }
    
    main_obj[date[0]] = {}
    main_obj[date[0]][date[1]] = {}
    main_obj[date[0]][date[1]][date[2]] = []
    
    main_obj[date[0]][date[1]][date[2]].append(insert_obj)
    
     
print(main_obj)

res['results'] is the list of dicts shown above. When I print main_obj, I get:

{
    '2021': {
        '06': {
            '26': [{
                'interval_start': '2021-06-26T14:00:00+01:00',
                'interval_end': '2021-06-26T14:30:00+01:00',
                'consumption': 0.043
            }]
        }
    }
}

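My question is: why, as I loop over each dict, is it not added to the list main_obj[date[0]][date[1]][date[2]]? Also, since dict keys are unique, why do I only see 26 inserted and not 27, which is at index 0?

Any help would be much appreciated, as I've been struggling with this for a while!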

【Question comments】:

  • If you're working with larger data or have several operations to perform, you might want to look at the pandas library.

Tags: python list dictionary


【Solution 1】:

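You are overwriting any existing dict/list with unconditional assignments such as main_obj[date[0]] = {}; if date[0] has already been seen, you throw away all the data stored under it so far.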
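To see the overwrite in isolation, here is a minimal sketch. The three (year, month, day) tuples are hypothetical stand-ins for the parsed dates, in the same order as the question's data (one entry for the 27th, then entries for the 26th):

```python
main_obj = {}

# Hypothetical parsed dates: the 27th comes first, then two entries for the 26th.
dates = [("2021", "06", "27"), ("2021", "06", "26"), ("2021", "06", "26")]

for year, month, day in dates:
    main_obj[year] = {}              # discards everything stored under this year
    main_obj[year][month] = {}       # fresh, empty month dict
    main_obj[year][month][day] = []  # fresh, empty day list
    main_obj[year][month][day].append(day)

print(main_obj)  # {'2021': {'06': {'26': ['26']}}}
```

Only the last iteration survives, which is why the question's output contains a single entry for the 26th and nothing for the 27th.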

Use the setdefault method instead. (I'm not sure what PEP 8-approved line breaking would look like here.)

(main_obj
  .setdefault(date[0], {})
  .setdefault(date[1], {})
  .setdefault(date[2], [])
).append(insert_obj)
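For completeness, here is a self-contained sketch of the same setdefault chain applied to three records from the question. Trimming the times to HH:MM with datetime.fromisoformat is my assumption, based on the target structure shown in the question; consumption is left as a number rather than a string.

```python
from datetime import datetime

# Three records copied from the question's sample data.
results = [
    {'consumption': 0.025, 'interval_start': '2021-06-27T00:00:00+01:00', 'interval_end': '2021-06-27T00:30:00+01:00'},
    {'consumption': 0.043, 'interval_start': '2021-06-26T23:30:00+01:00', 'interval_end': '2021-06-27T00:00:00+01:00'},
    {'consumption': 0.053, 'interval_start': '2021-06-26T23:00:00+01:00', 'interval_end': '2021-06-26T23:30:00+01:00'},
]

main_obj = {}
for record in results:
    start = datetime.fromisoformat(record["interval_start"])
    end = datetime.fromisoformat(record["interval_end"])

    insert_obj = {
        "interval_start": start.strftime("%H:%M"),  # assumption: keep only HH:MM
        "interval_end": end.strftime("%H:%M"),
        "consumption": record["consumption"],
    }

    year, month, day = start.strftime("%Y-%m-%d").split("-")

    # setdefault returns the existing value if the key is present,
    # and only inserts the default when the key is missing.
    (main_obj
        .setdefault(year, {})
        .setdefault(month, {})
        .setdefault(day, [])
    ).append(insert_obj)

print(main_obj)
```

With these three records, the day "27" ends up with one entry and the day "26" with two, since nothing is overwritten between iterations.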

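【Discussion】:

  • A ) was missing. I really dislike explicit line continuation, and PEP 8 explicitly prefers implicit line continuation inside parentheses over backslashes.
【Solution 2】: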

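As @chepner said, the problem is that on every iteration of the loop you overwrite whatever value is already associated with a given dictionary key.

Here is a funky solution that uses functools.partial and collections.defaultdict in place of the setdefault method of regular dicts.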

from collections import defaultdict
from functools import partial
from pprint import pprint

results_list = [
    {'consumption': 0.025, 'interval_start': '2021-06-27T00:00:00+01:00', 'interval_end': '2021-06-27T00:30:00+01:00'},
    {'consumption': 0.043, 'interval_start': '2021-06-26T23:30:00+01:00', 'interval_end': '2021-06-27T00:00:00+01:00'},
    {'consumption': 0.053, 'interval_start': '2021-06-26T23:00:00+01:00', 'interval_end': '2021-06-26T23:30:00+01:00'},
    {'consumption': 0.056, 'interval_start': '2021-06-26T22:30:00+01:00', 'interval_end': '2021-06-26T23:00:00+01:00'},
    {'consumption': 0.031, 'interval_start': '2021-06-26T22:00:00+01:00', 'interval_end': '2021-06-26T22:30:00+01:00'},
    {'consumption': 0.129, 'interval_start': '2021-06-26T21:30:00+01:00', 'interval_end': '2021-06-26T22:00:00+01:00'},
    {'consumption': 0.19, 'interval_start': '2021-06-26T21:00:00+01:00', 'interval_end': '2021-06-26T21:30:00+01:00'},
    {'consumption': 0.164, 'interval_start': '2021-06-26T20:30:00+01:00', 'interval_end': '2021-06-26T21:00:00+01:00'},
    {'consumption': 0.145, 'interval_start': '2021-06-26T20:00:00+01:00', 'interval_end': '2021-06-26T20:30:00+01:00'},
    {'consumption': 0.213, 'interval_start': '2021-06-26T19:30:00+01:00', 'interval_end': '2021-06-26T20:00:00+01:00'},
    {'consumption': 0.167, 'interval_start': '2021-06-26T19:00:00+01:00', 'interval_end': '2021-06-26T19:30:00+01:00'},
    {'consumption': 0.333, 'interval_start': '2021-06-26T18:30:00+01:00', 'interval_end': '2021-06-26T19:00:00+01:00'},
    {'consumption': 0.133, 'interval_start': '2021-06-26T18:00:00+01:00', 'interval_end': '2021-06-26T18:30:00+01:00'},
    {'consumption': 0.211, 'interval_start': '2021-06-26T17:30:00+01:00', 'interval_end': '2021-06-26T18:00:00+01:00'},
    {'consumption': 0.135, 'interval_start': '2021-06-26T17:00:00+01:00', 'interval_end': '2021-06-26T17:30:00+01:00'},
    {'consumption': 0.158, 'interval_start': '2021-06-26T16:30:00+01:00', 'interval_end': '2021-06-26T17:00:00+01:00'},
    {'consumption': 0.073, 'interval_start': '2021-06-26T16:00:00+01:00', 'interval_end': '2021-06-26T16:30:00+01:00'},
    {'consumption': 0.077, 'interval_start': '2021-06-26T15:30:00+01:00', 'interval_end': '2021-06-26T16:00:00+01:00'},
    {'consumption': 0.125, 'interval_start': '2021-06-26T15:00:00+01:00', 'interval_end': '2021-06-26T15:30:00+01:00'},
    {'consumption': 0.201, 'interval_start': '2021-06-26T14:30:00+01:00', 'interval_end': '2021-06-26T15:00:00+01:00'},
    {'consumption': 0.043, 'interval_start': '2021-06-26T14:00:00+01:00', 'interval_end': '2021-06-26T14:30:00+01:00'},
]

main_obj = defaultdict(partial(defaultdict, partial(defaultdict, list)))

for i in results_list:    
    
    date = i["interval_start"].split("T")[0].split("-")    
    
    insert_obj = {
        "interval_start" : i['interval_start'],
        "interval_end": i["interval_end"],
        "consumption": i["consumption"]
        
    }

    main_obj[date[0]][date[1]][date[2]].append(insert_obj)

pprint(main_obj)  # Expected result (I think!)

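See the documentation for collections.defaultdict for more on how it works. When you instantiate a defaultdict, the function you pass in to create default values must take 0 arguments, so functools.partial is needed here: it wraps defaultdict with its argument already filled in, producing a callable that takes fewer arguments than usual. For more on functools.partial, see its documentation and the question "How does functools partial do what it does?"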
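As an alternative sketch (not part of the answer above), the same zero-argument factories can be written with lambda instead of functools.partial. The sample entries are trimmed-down placeholders, and the JSON round trip is just one way I'd assume you might get plain dicts back for printing or serialising:

```python
import json
from collections import defaultdict

# Each lambda takes no arguments, which is what defaultdict requires of its factory.
main_obj = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

# Placeholder entries; missing keys are created on first access.
main_obj["2021"]["06"]["26"].append({"consumption": 0.043})
main_obj["2021"]["06"]["27"].append({"consumption": 0.025})

# defaultdict is a dict subclass, so json can serialise it directly;
# loading the result back gives ordinary dicts and lists.
plain = json.loads(json.dumps(main_obj))
print(plain)
# {'2021': {'06': {'26': [{'consumption': 0.043}], '27': [{'consumption': 0.025}]}}}
```

One thing to watch for: merely reading a missing key from a defaultdict (for example main_obj["2022"]) creates it, so convert to a plain dict before doing lookups that should fail on absent keys.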
