【Posted】: 2017-06-08 11:07:11
【Question】:
I am new to pandas and Python and can't solve this problem. I have a complex nested JSON file that I want to load into a pandas DataFrame.
I am using the following code:
import json
import urllib.request
import pandas as pd
import numpy as np
from pandas.io.json import json_normalize
file_str = 'C:\\file.json'
with open(file_str, 'r', encoding="utf-8") as json_file:
    json_work = pd.read_json(json_file, typ='series', orient='columns')
for k, v in json_work.items():
    if v is None:
        json_work[k] = "N/A"
##df = pd.DataFrame.from_dict(json_work)
df = pd.io.json.json_normalize(json_work)
print(df)
As written, it gives me this error:
Traceback (most recent call last):
  File "C:/.....hack.py", line 18, in <module>
    df = pd.io.json.json_normalize(json_work)
  File "C:\Users\scoe\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\json.py", line 708, in json_normalize
    if any([isinstance(x, dict) for x in compat.itervalues(data[0])]):
  File "C:\Users\scoe\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\compat\__init__.py", line 175, in itervalues
    return iter(obj.values(**kw))
AttributeError: 'str' object has no attribute 'values'
If I swap these two lines to read:
df = pd.DataFrame.from_dict(json_work)
##df = pd.io.json.json_normalize(json_work)
the process runs successfully, but the result doesn't look like a DataFrame. The output looks like this:
---- more lines above this; it's a sample from the middle of the output ----
hrCenterName KW App Development & Maint
hrSignatureLevel 1H
hrSignatureLevelTitle Level 1 HR Signature Authority
imName @
imProvider N/A
... ...
primaryOfficePhoneExtension N/A
---- more lines after this ----
What am I doing wrong?
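One reading of the traceback: json_normalize calls .values() on data[0], and because json_work is a Series, data[0] here is a plain string rather than a dict. A minimal sketch of one possible alternative follows, assuming the file holds a single nested JSON object and that pandas 1.0 or later is installed (where json_normalize is available as pd.json_normalize). The sample string and its field layout are hypothetical stand-ins, since the real file contents are not shown.

```python
import json

import pandas as pd

# Hypothetical stand-in for the contents of file.json; the real
# structure is not shown in the question.
raw = (
    '{"hrCenterName": "KW App Development & Maint", '
    '"imProvider": null, '
    '"office": {"phone": "555-0100", "extension": null}}'
)

# Parse to a plain dict (not a pandas Series) before normalizing.
record = json.loads(raw)

# Nested keys are flattened into dotted column names, one row per record.
df = pd.json_normalize(record)

print(df.columns.tolist())
print(df.shape)
```

With a real file, json.load(json_file) inside the existing with block would take the place of json.loads(raw).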
【Comments】:
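-
Can you post the output of print(type(json_work))?
-
You can replace the loop for k, v in json_work.items(): if v is None: json_work[k] = "N/A" with a simple json_work.fillna("N/A"). But you may not even want to do that: if all you want is to format it so that it renders nicely when printed, just use df.to_string(..., formatters=...) to define custom string formatting, without needlessly wasting memory.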
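The fillna suggestion from the comment above can be sketched as follows. The Series here is a hypothetical sample that borrows a few field names from the output in the question; it is not the asker's actual data.

```python
import pandas as pd

# Hypothetical sample reusing a few field names from the question's output.
json_work = pd.Series(
    {"hrSignatureLevel": "1H", "imName": "@", "imProvider": None}
)

# One call replaces every missing value; the original Series is not modified.
filled = json_work.fillna("N/A")

print(filled.to_string())
```

Unlike the explicit `v is None` check, fillna also replaces NaN values, which may or may not be what is wanted.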
Tags: python json pandas numpy dataframe