Converting a CSV to JSON/dictionary and grouping on an ID

【Question title】: Converting a CSV to JSON/dictionary and grouping on an ID
【Posted】: 2020-11-20 01:24:43
【Question description】:

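I have some data, a sample of which is shown below, that I would like to convert into a Python dictionary and/or JSON.

As you can see, the IDs repeat: each ID has one value for every 15-minute timestamp.

I am trying to create a nested dictionary that uses the ID as the key and the datetime/flow pairs as the value for each key.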

ID      datetime             flow
762972  01/01/2017 00:00    -1
763753  01/01/2017 00:00    6.00E-05
763776  01/01/2017 00:00    -1
769472  01/01/2017 00:00    0.00132
793144  01/01/2017 00:00    0
799864  01/01/2017 00:00    0
812926  01/01/2017 00:00    0.00108
821553  01/01/2017 00:00    0
829800  01/01/2017 00:00    -1
830174  01/01/2017 00:00    0
762972  01/01/2017 00:15    -1
763753  01/01/2017 00:15    6.00E-05
763776  01/01/2017 00:15    -1
769472  01/01/2017 00:15    0.00048
793144  01/01/2017 00:15    0
799864  01/01/2017 00:15    6.00E-05
812926  01/01/2017 00:15    0.00024
821553  01/01/2017 00:15    0.00012
829800  01/01/2017 00:15    -1
830174  01/01/2017 00:15    0
762972  01/01/2017 00:30    -1
763753  01/01/2017 00:30    6.00E-05
763776  01/01/2017 00:30    -1
769472  01/01/2017 00:30    0.0006
793144  01/01/2017 00:30    0
799864  01/01/2017 00:30    0
812926  01/01/2017 00:30    0
821553  01/01/2017 00:30    0
829800  01/01/2017 00:30    -1
830174  01/01/2017 00:30    0

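I am aiming for the format below, in which each ID stores every datetime/flow record associated with it in a nested dictionary.

Can anyone offer any advice? I have been trying to sort the original .csv on ID first and then use a groupby function, but so far without success.

Thanks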

nested_dict = { '762972': [{'datetime': '01/01/2017 00:00', 'flow': '-1'}, {'datetime': '01/01/2017 00:15', 'flow': '-1'}, {'datetime': '01/01/2017 00:30', 'flow': '-1'}],
                '763753': [{'datetime': '01/01/2017 00:00', 'flow': '6.00E-05'}, {'datetime': '01/01/2017 00:15', 'flow': '6.00E-05'}, {'datetime': '01/01/2017 00:30', 'flow': '6.00E-05'}]
                }

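【Question comments】:

Tags: python json csv dictionary pandas-groupby

【Solution 1】:

If you group by ID, you can iterate through the groups and build your dictionary. You can then export the dictionary to JSON or CSV using the respective Python modules.

Something like: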

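import json

# df is assumed to be the CSV already loaded into a pandas DataFrame
output = {}
for name, group in df.groupby('ID'):
    # each ID appears once in groupby, so assign directly; store plain row dicts
    # rather than the DataFrame itself, which json cannot serialize
    output[str(name)] = group[['datetime', 'flow']].to_dict('records')
json.dumps(output) # Your json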

There may be a pandas-only way to do this, but I am not aware of one.
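
One possible pandas-only variant is sketched below. It is not the answerer's code: it assumes the file is comma-separated with the header ID,datetime,flow, uses a three-row inline sample in place of the real file, and reads every column as text so that the values match the strings in the desired output.

```python
import io
import json

import pandas as pd

# Inline stand-in for the real CSV (assumed comma-separated layout).
sample = io.StringIO(
    "ID,datetime,flow\n"
    "762972,01/01/2017 00:00,-1\n"
    "763753,01/01/2017 00:00,6.00E-05\n"
    "762972,01/01/2017 00:15,-1\n"
)

# dtype=str keeps IDs and flow values as text, as in the desired output.
df = pd.read_csv(sample, dtype=str)

# For each ID, turn its datetime/flow rows into a list of dicts,
# then convert the resulting Series (indexed by ID) into a plain dict.
nested_dict = (
    df.groupby("ID")[["datetime", "flow"]]
    .apply(lambda g: g.to_dict("records"))
    .to_dict()
)

as_json = json.dumps(nested_dict, indent=2)
```

Selecting the datetime and flow columns before calling apply keeps the ID out of the inner records, since it already serves as the key.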

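【Comments】:

Thanks. I was heading in a similar direction but had not worked it out. However, when I try your solution I get this error --> 'TypeError: Object of type 'DataFrame' is not JSON serializable', which seems to indicate that it does not like working with DataFrames?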
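
The TypeError quoted above is what json.dumps raises whenever a DataFrame object ends up as a dictionary value, because it only serializes built-in types such as dict, list, str and numbers. One way to avoid the problem is to skip pandas altogether. The following is a minimal sketch using only the standard library; it assumes a comma-separated file with the header ID,datetime,flow, and the inline sample stands in for the real file.

```python
import csv
import io
import json
from collections import defaultdict

# Inline stand-in for the real CSV (assumed comma-separated layout).
sample = io.StringIO(
    "ID,datetime,flow\n"
    "762972,01/01/2017 00:00,-1\n"
    "763753,01/01/2017 00:00,6.00E-05\n"
    "762972,01/01/2017 00:15,-1\n"
    "763753,01/01/2017 00:15,6.00E-05\n"
)

# Map each ID to the list of its datetime/flow records.
nested_dict = defaultdict(list)
for row in csv.DictReader(sample):
    # csv.DictReader yields every field as a string, so no conversion is needed.
    nested_dict[row["ID"]].append(
        {"datetime": row["datetime"], "flow": row["flow"]}
    )

# defaultdict is a dict subclass, so json.dumps accepts it directly.
as_json = json.dumps(nested_dict, indent=2)
```

Because the rows are appended to the matching ID as they are read, no prior sorting of the file is required. For the real data, the io.StringIO object would be replaced by a file opened with open(path, newline='').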