【Posted】:2018-09-18 14:01:04
【Question】:
I have a very large JSON object that I need to split into smaller objects, writing each of those smaller objects to a file.
Sample data
raw = '[{"id":"1","num":"2182","count":-17}{"id":"111","num":"3182","count":-202}{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]'
Desired output (in this example, the data is split in half)
output_file1.json = [{"id":"1","num":"2182","count":-17},{"id":"111","num":"3182","count":-202}]
output_file2.json = [{"id":"222","num":"4182","count":12}{"id":"33333","num":"5182","count":12}]
Current code
import pandas as pd
import itertools
import json
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return zip_longest(fillvalue=fillvalue, *args)

raw = '[{"id":"1","num":"2182","count":-17}{"id":"111","num":"3182","count":-202}{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]'

#split the data into manageable chunks + write to files
for i, group in enumerate(grouper(raw, 4)):
    with open('outputbatch_{}.json'.format(i), 'w') as outputfile:
        json.dump(list(group), outputfile)
Current output of the first file, "outputbatch_0.json"
["[", "{", "\"", "s"]
I feel like I'm making this much harder than it needs to be.
【Comments】:
- Your raw string is not valid JSON (the commas between objects are missing). Is that the case in your real data, or is it just a typo in the question?
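A note on why the current output is a list of single characters: grouper(raw, 4) iterates over the string itself, so each group holds four characters rather than four records. Below is a minimal sketch of a parse-first approach. It assumes the real data is valid JSON, so the sample is shown with the missing commas added. The names chunked and write_batches are hypothetical helpers for illustration, not part of any library.

```python
import json
import os


def chunked(items, size):
    """Yield successive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def write_batches(raw_json, size, out_dir="."):
    """Parse `raw_json` (a JSON array) and write each chunk to its own file."""
    records = json.loads(raw_json)  # parse first: a list of dicts, not a string
    paths = []
    for i, batch in enumerate(chunked(records, size), start=1):
        path = os.path.join(out_dir, "output_file{}.json".format(i))
        with open(path, "w") as outputfile:
            json.dump(batch, outputfile)
        paths.append(path)
    return paths


# Assumption: the sample records from the question, with commas added
# between the objects so that the string is valid JSON.
raw = ('[{"id":"1","num":"2182","count":-17},'
       '{"id":"111","num":"3182","count":-202},'
       '{"id":"222","num":"4182","count":12},'
       '{"id":"33333","num":"5182","count":12}]')

if __name__ == "__main__":
    # Two records per file, giving output_file1.json and output_file2.json.
    write_batches(raw, 2)
```

Plain slicing is used here instead of zip_longest so that a short final batch is written as-is, without None padding. If the real data really does lack commas between objects, json.loads will raise an error and the input would need to be repaired first.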
Tags: python json python-3.x