【发布时间】:2019-01-22 19:57:13
【问题描述】:
我正在尝试将多个 json 文件从我的 Google Drive 中的一个目录加载到一个 pandas 数据帧中。
我尝试了很多解决方案,但似乎都没有产生积极的结果。
这是我迄今为止尝试过的
path_to_json = '/path/'
json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
jsons_data = pd.DataFrame(columns=['participants','messages','active','threadtype','thread path'])
for index, js in enumerate(json_files):
with open(os.path.join(path_to_json, js)) as json_file:
json_text = json.load(json_file)
participants = json_text['participants']
messages = json_text['messages']
active = json_text['is_still_participant']
threadtype = json_text['thread_type']
threadpath = json_text['thread_path']
jsons_data.loc[index]=[participants,messages,active,threadtype,threadpath]
jsons_data
这是我收到的错误消息的完整追溯:
---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
<ipython-input-30-8385abf6a3a7> in <module>()
1 for index, js in enumerate(json_files):
2 with open(os.path.join(path_to_json, js)) as json_file:
----> 3 json_text = json.load(json_file)
4 participants = json_text['participants']
5 messages = json_text['messages']
/usr/lib/python3.6/json/__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
297 cls=cls, object_hook=object_hook,
298 parse_float=parse_float, parse_int=parse_int,
--> 299 parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
300
301
/usr/lib/python3.6/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
352 parse_int is None and parse_float is None and
353 parse_constant is None and object_pairs_hook is None and not kw):
--> 354 return _default_decoder.decode(s)
355 if cls is None:
356 cls = JSONDecoder
/usr/lib/python3.6/json/decoder.py in decode(self, s, _w)
337
338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340 end = _w(s, end).end()
341 if end != len(s):
/usr/lib/python3.6/json/decoder.py in raw_decode(self, s, idx)
355 obj, end = self.scan_once(s, idx)
356 except StopIteration as err:
--> 357 raise JSONDecodeError("Expecting value", s, err.value) from None
358 return obj, end
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
我添加了一个我试图从中读取的 json 文件的示例
json 示例:
{
participants: [
{
name: "Test 1"
},
{
name: "Person"
}
],
messages: [
{
sender_name: "Person",
timestamp_ms: 1485467319139,
content: "Hie",
type: "Generic"
}
],
title: "Test 1",
is_still_participant: true,
thread_type: "Regular",
thread_path: "inbox/xyz"
}
#second example
{
participants: [
{
name: "Clearance"
},
{
name: "Person"
}
],
messages: [
{
sender_name: "Emmanuel Sibanda",
timestamp_ms: 1212242073308,
content: "Dear",
share: {
link: "http://www.example.com/"
},
type: "Share"
}
],
title: "Clearance",
is_still_participant: true,
thread_type: "Regular",
thread_path: "inbox/Clearance"
}
【问题讨论】:
-
您可以编辑您的问题并显示文件中的示例 JSON,而不是添加指向 JSON 文件的链接。
-
@amanb 添加了 2 个 JSON 文件示例
-
这些是格式错误的 JSONS。键不是字符串,一些值也不是字符串,最终可能会出错。您应该找到一种使用JSON validator 验证它们的方法。
-
您能否分享您希望看到的数据框的示例输出?你期待什么形状?一个示例 JSONS 的某种表格表示应该是理想的!