[Posted]: 2021-05-28 10:51:48
[Question]:
I first tried normalizing the data:
df = pd.json_normalize(balance_sheet_data_qt)
Then I tried to flatten it using this answer, How to flatten a pandas dataframe with some columns as json?, but it did not seem to do anything.
json_struct = json.loads(df.to_json(orient="records"))
# df_flat = pd.json_normalize(json_struct)
I also tried How to read and normalize following json in pandas?, but ran into trouble working out which columns to use assign on.
Sample data before normalizing (this is balance_sheet_data_qt):
{'balanceSheetHistoryQuarterly': {'AAPL': [{'2020-12-26': {'totalLiab': 287830000000, 'totalStockholderEquity': 66224000000, 'otherCurrentLiab': 55899000000, 'totalAssets': 354054000000, 'commonStock': 51744000000, 'otherCurrentAssets': 13687000000, 'retainedEarnings': 14301000000, 'otherLiab': 56042000000, 'treasuryStock': 179000000, 'otherAssets': 43270000000, 'cash': 36010000000, 'totalCurrentLiabilities': 132507000000, 'shortLongTermDebt': 7762000000, 'otherStockholderEquity': 179000000, 'propertyPlantEquipment': 37933000000, 'totalCurrentAssets': 154106000000, 'longTermInvestments': 118745000000, 'netTangibleAssets': 66224000000, 'shortTermInvestments': 40816000000, 'netReceivables': 58620000000, 'longTermDebt': 99281000000, 'inventory': 4973000000, 'accountsPayable': 63846000000}}, {'2020-09-26': {'totalLiab': 258549000000, 'totalStockholderEquity': 65339000000, 'otherCurrentLiab': 47867000000, 'totalAssets': 323888000000, 'commonStock': 50779000000, 'otherCurrentAssets': 11264000000, 'retainedEarnings': 14966000000, 'otherLiab': 46108000000, 'treasuryStock': -406000000, 'otherAssets': 33952000000, 'cash': 38016000000, 'totalCurrentLiabilities': 105392000000, 'shortLongTermDebt': 8773000000, 'otherStockholderEquity': -406000000, 'propertyPlantEquipment': 45336000000, 'totalCurrentAssets': 143713000000, 'longTermInvestments': 100887000000, 'netTangibleAssets': 65339000000, 'shortTermInvestments': 52927000000, 'netReceivables': 37445000000, 'longTermDebt': 98667000000, 'inventory': 4061000000, 'accountsPayable': 42296000000}}, {'2020-06-27': {'totalLiab': 245062000000, 'totalStockholderEquity': 72282000000, 'otherCurrentLiab': 39945000000, 'totalAssets': 317344000000, 'commonStock': 48696000000, 'otherCurrentAssets': 10987000000, 'retainedEarnings': 24136000000, 'otherLiab': 47606000000, 'treasuryStock': -550000000, 'otherAssets': 32836000000, 'cash': 33383000000, 'totalCurrentLiabilities': 95318000000, 'shortLongTermDebt': 7509000000, 'otherStockholderEquity': 
-550000000, 'propertyPlantEquipment': 43851000000, 'totalCurrentAssets': 140065000000, 'longTermInvestments': 100592000000, 'netTangibleAssets': 72282000000, 'shortTermInvestments': 59642000000, 'netReceivables': 32075000000, 'longTermDebt': 94048000000, 'inventory': 3978000000, 'accountsPayable': 35325000000}}, {'2020-03-28': {'totalLiab': 241975000000, 'totalStockholderEquity': 78425000000, 'otherCurrentLiab': 42048000000, 'totalAssets': 320400000000, 'commonStock': 48032000000, 'otherCurrentAssets': 15691000000, 'retainedEarnings': 33182000000, 'otherLiab': 48745000000, 'treasuryStock': -2789000000, 'otherAssets': 33868000000, 'cash': 40174000000, 'totalCurrentLiabilities': 96094000000, 'shortLongTermDebt': 10392000000, 'otherStockholderEquity': -2789000000, 'propertyPlantEquipment': 43986000000, 'totalCurrentAssets': 143753000000, 'longTermInvestments': 98793000000, 'netTangibleAssets': 78425000000, 'shortTermInvestments': 53877000000, 'netReceivables': 30677000000, 'longTermDebt': 89086000000, 'inventory': 3334000000, 'accountsPayable': 32421000000}}]}}
Sample data after normalizing:
,balanceSheetHistoryQuarterly.AAPL
0,"[{'2020-12-26': {'totalLiab': 287830000000, 'totalStockholderEquity': 66224000000, 'otherCurrentLiab': 55899000000, 'totalAssets': 354054000000,
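The single-column result above is consistent with how pd.json_normalize behaves: it joins nested dict keys with "." to form column names, but a list value is stored as-is in one cell unless it is named through record_path. A minimal sketch on a shortened stand-in for the data (the variable name sample and the one-field payload are assumptions made for illustration):

```python
import pandas as pd

# Shortened stand-in with the same nesting as balance_sheet_data_qt
# (one quarter, one field; not the real payload).
sample = {
    "balanceSheetHistoryQuarterly": {
        "AAPL": [{"2020-12-26": {"cash": 36010000000}}]
    }
}

df = pd.json_normalize(sample)

# The two outer dict levels collapse into one dotted column name.
print(df.columns.tolist())

# The list of per-quarter dicts is left untouched inside a single cell.
print(type(df.iloc[0, 0]))
```

This is why the normalized frame has one row and one column, with the whole list of quarters sitting in that cell.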
List of columns I want:
'totalLiab'
'totalStockholderEquity'
'otherCurrentLiab'
'totalAssets'
'commonStock'
'otherCurrentAssets'
'retainedEarnings'
'otherLiab'
'treasuryStock'
'otherAssets'
'cash'
'totalCurrentLiabilities'
'shortLongTermDebt'
'otherStockholderEquity'
'propertyPlantEquipment'
'totalCurrentAssets'
'longTermInvestments'
'netTangibleAssets'
'shortTermInvestments'
'netReceivables'
'longTermDebt'
'inventory'
'accountsPayable'
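I am trying to convert this into a DataFrame/table format. I think the first row, balanceSheetHistoryQuarterly.AAPL, or the date column may be throwing it off.
Any help is appreciated.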
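One possible way to get the table described above is to skip json_normalize and index into the nested structure directly. This is only a sketch: it assumes every element of the ticker's list is a dict with a single date key, as in the sample data, and the helper name flatten_quarterly and its parameters are hypothetical, not part of pandas or of the data source.

```python
import pandas as pd

# Shortened stand-in for balance_sheet_data_qt: same nesting as the sample,
# but only two quarters and three fields.
balance_sheet_data_qt = {
    "balanceSheetHistoryQuarterly": {
        "AAPL": [
            {"2020-12-26": {"totalLiab": 287830000000,
                            "totalAssets": 354054000000,
                            "cash": 36010000000}},
            {"2020-09-26": {"totalLiab": 258549000000,
                            "totalAssets": 323888000000,
                            "cash": 38016000000}},
        ]
    }
}


def flatten_quarterly(data, ticker, section="balanceSheetHistoryQuarterly"):
    """Hypothetical helper: one row per report date, one column per field."""
    quarters = data[section][ticker]  # list of {date: {field: value}}
    # Merge the single-key dicts into one {date: {field: value}} mapping.
    by_date = {
        date: fields
        for quarter in quarters
        for date, fields in quarter.items()
    }
    # orient="index" turns the outer keys (dates) into the row index.
    df = pd.DataFrame.from_dict(by_date, orient="index")
    df.index = pd.to_datetime(df.index)
    df.index.name = "date"
    return df


df_flat = flatten_quarterly(balance_sheet_data_qt, "AAPL")
print(df_flat)
```

With the full data, each key in the wanted column list should appear as a column, and the report dates become the index rather than part of the column names.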
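[Comments]:
-
Could you please post the sample data from before normalization?
-
OK, I have added it.
-
I think the sample data in this question is corrupted; it is not valid JSON. Can you provide balance_sheet_data_qt and the columns you want?
-
OK, I have added the full balance_sheet_data_qt and the columns I want.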