Iterator class for nested dictionaries: answers

【Question title】: Iterator-class for nested dictionaries
【Posted】: 2019-10-17 01:24:30
【Question description】:

Initial situation

Let's say we have a dictionary that stores time series data in the following form:

dic = {'M15': 
        { 
            '100001': { 0: [0,1,2,...],
                        1: [0,1,2,...]
                    },
            '100002': { 0: [0,1,2,...],
                        1: [0,1,2,...]
                    },
                    ...
        },
        'H1': {
            '200001': { 0: [0,1,2,...],
                        1: [0,1,2,...]
                    },
            ...
        },
        ...
}

Now, let's assume this dictionary is stored in a class called data, as follows:

class data:

    def __init__(self, input: dict):
        self.data = input

newData = data(dic)

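Clearly, the class is meant to store time series data and return it on iteration, so that it can be processed further at some point.

My question

I want to make the class iterable, meaning that __next__ should iterate over all the data in the dictionary (the question is not about how to iterate over a nested dictionary, so please don't answer that). By data I mean only the arrays at the lowest level of the dictionary, e.g. [0,1,2,...].

Let's assume the data in the dictionary is very large: it fits in memory, but not twice. As far as I know, a list comprehension is therefore not an option, because the data would be stored in this new list in addition to the dictionary (the dictionary is still needed, and an array is not an option in this example). For completeness, this would look like: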

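class data:
    def __init__(self, input: dict):
        self.dictionary = input
        # Flatten the three levels (timeframe -> series id -> index) into one list.
        self.data  = [series_array
                      for timeframe in self.dictionary.values()
                      for series in timeframe.values()
                      for series_array in series.values()]
        self.index = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self.index >= len(self.data):
            raise StopIteration  # signal the end of iteration
        self.index += 1
        return self.data[self.index - 1]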

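Question 1:

Would the list comprehension just point to the data within the dictionary or would it really copy the data?

If it copies, I would have to use normal iteration over the dictionary, but I can't think of a way to implement that in __iter__ and __next__.

Question 2:

How would I implement this nested dictionary-loop within __iter__ and __next__?

Please note that I am looking for an answer to this specific question, not "why don't you use a generator" or "why don't you do it this way/that way".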

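【Question comments】:

"I only need the arrays at the lowest level within the dictionary": why not store only the lowest level instead of a nested dictionary?
As mentioned above, "the dictionary is still needed and an array is not an option in this example". Besides returning all the time series, I must also be able to access specific series (e.g. dic['M15']['100002'][0]).

Tags: python class dictionary iterator nested-loops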

【Solution 1】:

Question 1:

Would the list comprehension just point to the data within the dictionary or would it really copy the data?

It will hold references to the lists in the dictionary; the list contents themselves are not copied.
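
The reference behaviour can be checked directly. The following is a minimal sketch that assumes a tiny stand-in for the question's dictionary (the concrete values are made up for illustration); it compares object identity rather than equality:

```python
# Hypothetical miniature of the question's dictionary (values are made up).
dic = {
    'M15': {'100001': {0: [0, 1, 2], 1: [3, 4, 5]}},
    'H1': {'200001': {0: [6, 7, 8]}},
}

# Flatten the three levels into a list of the lowest-level lists.
flat = [
    series_array
    for timeframe in dic.values()
    for series in timeframe.values()
    for series_array in series.values()
]

# Each element is the very same object that the dictionary holds.
same_object = flat[0] is dic['M15']['100001'][0]

# Mutating through one name is therefore visible through the other.
flat[0].append(99)
visible_in_dict = dic['M15']['100001'][0][-1] == 99
```

The new list itself still costs one pointer per series, but the series data is not duplicated.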

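Question 2:

How would I implement this nested dictionary-loop within __iter__ and __next__?

You only need to return an iterator from __iter__ (as opposed to, for example, a list); in this case a generator expression over the lists is enough: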

class Data:
    def __init__(self, input: dict):
        self.dictionary = input
    def __iter__(self):
        # Lazily yield each lowest-level list; no intermediate list is built.
        return (series_array
                for timeframe in self.dictionary.values()
                for series in timeframe.values()
                for series_array in series.values())
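
Since the question asks specifically about __iter__ and __next__, here is a rough sketch of the same traversal written as an explicit iterator class. It assumes the three-level layout from the question (timeframe, then series id, then index). The name NestedSeriesIterator and the sample values are hypothetical and are not part of the original answer:

```python
class NestedSeriesIterator:
    """Hypothetical helper: walks timeframe -> series id -> index lazily."""

    def __init__(self, dictionary: dict):
        # One iterator per nesting level; the inner two start out exhausted.
        self._timeframes = iter(dictionary.values())
        self._series = iter(())
        self._arrays = iter(())

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            try:
                # Innermost level: hand out the next lowest-level list.
                return next(self._arrays)
            except StopIteration:
                pass
            try:
                # Current series exhausted: move on to the next series.
                self._arrays = iter(next(self._series).values())
                continue
            except StopIteration:
                pass
            # Current timeframe exhausted: move on to the next one.
            # Once no timeframe is left, this next() raises StopIteration,
            # which ends the iteration.
            self._series = iter(next(self._timeframes).values())


class Data:
    def __init__(self, input: dict):
        self.dictionary = input

    def __iter__(self):
        # A fresh iterator per loop, so the object can be iterated repeatedly.
        return NestedSeriesIterator(self.dictionary)


# Made-up sample data in the shape used by the question.
dic = {
    'M15': {'100001': {0: [0, 1, 2], 1: [3, 4, 5]}},
    'H1': {'200001': {0: [6, 7, 8]}},
}
new_data = Data(dic)
collected = list(new_data)
```

Keeping the iteration state in a separate object, rather than in Data itself, means two loops over the same Data instance do not interfere with each other. The generator expression above achieves the same thing with far less code.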

【Comments】:

Thanks for the quick reply! I was so focused on iterators that I forgot about the possibility of feeding the iterator with a generator... :)