【Posted】:2021-03-20 21:58:28
【Question】:
I have the following JSON file:
{'transactionDetail': {'transactionID': 'rrt-0a75e3331e9d4a100-b-se-17175-7612138-13_1571',
'transactionTimestamp': '2020-11-22T07:22:14.346Z', 'inLanguage': 'en-US', 'productID': 'aasmcu',
'productVersion': '1'}, 'inquiryDetail': {'productVersion': 'v1', 'productID': 'aasmcu', 'duns':
'6979900'}, 'organization': {'duns': '006979900', 'dunsControlStatus': {'operatingStatus':
{'description': 'Active', 'dnbCode': 9074}}, 'primaryName': 'American Express Company',
'isStandalone': False, 'primaryAddress': {'language': {}, 'addressCountry': {'name': 'United States',
'isoAlpha2Code': 'US'}, 'continentalRegion': {'name': 'North America'}, 'addressLocality': {'name':
'New York'}, 'minorTownName': None, 'addressRegion': {'name': 'New York', 'abbreviatedName': 'NY'},
'addressCounty': {'name': 'New York'}, 'postalCode': '10285-0002', 'postalCodePosition': {},
'streetNumber': None, 'streetName': None, 'streetAddress': {'line1': '200 Vesey St FL 50', 'line2':
None}, 'postOfficeBox': {}}, 'corporateLinkage': {'familytreeRolesPlayed': [{'description': 'Global
Ultimate', 'dnbCode': 12775}, {'description': 'Domestic Ultimate', 'dnbCode': 12774}, {'description':
'Parent/Headquarters', 'dnbCode': 9141}], 'hierarchyLevel': 1,
'globalUltimateFamilyTreeMembersCount': 1686}, 'dnbAssessment': {'materialChange': {'riskSegment':
{'description': 'No Change of High Probability Risk Profile', 'dnbCode': 30686},
'organizationSizeSegment': {'description': 'Business Profile Decay', 'dnbCode': 30671},
'borrowingSegment': {'description': 'Business Profile Stable', 'dnbCode': 30670}, 'spendSegment':
{'description': 'Business Profile Stable', 'dnbCode': 30670}, 'opportunityFinalSegment':
{'description': 'Stable Business', 'dnbCode': 30681}}, 'triplePlay': {'compositeRiskScore': 5,
'riskSegment': {'description': 'Promote Acqusition Targets', 'dnbCode': 30668}}}}}
{'transactionDetail': {'transactionID': 'rrt-04b146343b2275455-a-se-17594-7595335-2_1570',
'transactionTimestamp': '2020-11-22T07:22:15.115Z', 'inLanguage': 'en-US', 'productID': 'aasmcu',
'productVersion': '1'}, 'inquiryDetail': {'productVersion': 'v1', 'productID': 'aasmcu', 'duns':
'5070479'}, 'organization': {'duns': '005070479', 'dunsControlStatus': {'operatingStatus':
{'description': 'Active', 'dnbCode': 9074}}, 'primaryName': 'Caterpillar Inc.', 'isStandalone':
False, 'primaryAddress': {'language': {}, 'addressCountry': {'name': 'United States',
'isoAlpha2Code': 'US'}, 'continentalRegion': {'name': 'North America'}, 'addressLocality': {'name':
'Deerfield'}, 'minorTownName': None, 'addressRegion': {'name': 'Illinois', 'abbreviatedName': 'IL'},
'addressCounty': {'name': 'Lake'}, 'postalCode': '60015-5031', 'postalCodePosition': {},
'streetNumber': None, 'streetName': None, 'streetAddress': {'line1': '510 Lake Cook Rd Ste 100',
'line2': None}, 'postOfficeBox': {}}, 'corporateLinkage': {'familytreeRolesPlayed': [{'description':
'Global Ultimate', 'dnbCode': 12775}, {'description': 'Domestic Ultimate', 'dnbCode': 12774},
{'description': 'Parent/Headquarters', 'dnbCode': 9141}], 'hierarchyLevel': 1,
'globalUltimateFamilyTreeMembersCount': 1095}, 'dnbAssessment': {'materialChange': {'riskSegment':
{'description': 'High Probability of Improvement in Risk Profile', 'dnbCode': 30682},
'organizationSizeSegment': {'description': 'Business Profile Decay', 'dnbCode': 30671},
'borrowingSegment': {'description': 'Business Profile Decay', 'dnbCode': 30671}, 'spendSegment':
{'description': 'Business Profile Decay', 'dnbCode': 30671}, 'opportunityFinalSegment':
{'description': 'Decrease In Scale', 'dnbCode': 30680}}, 'triplePlay': {'compositeRiskScore': 6,
'riskSegment': {'description': 'Promote Acqusition Targets', 'dnbCode': 30668}}}}}
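What I need to do is normalize the JSON file. The example above shows 2 companies, but the file contains 1000. If I had only one company like this, I could flatten the JSON file as follows: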
import json
import pandas as pd
from pandas import json_normalize

with open('Material_Change_20201122.json') as f:
    d = json.load(f)  # d is a list with one dict per company
first = d[0]
transaction_detail = json_normalize(first['transactionDetail'])
transaction_detail.rename(columns={'transactionID': 'record_id'}, inplace=True)
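The problem comes when there is more than one company: I need a for loop that walks through the JSON and appends each company as a new row of the DataFrame. My logic is as follows: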
small_d = d[0:5]
transaction_detail_1 = pd.DataFrame()
for i in small_d:
    temp_df = json_normalize(i['transactionDetail'])
    temp_df.rename(columns={'transactionID': 'record_id'}, inplace=True)
    transaction_detail_1['record_id'].append(temp_df['record_id'])  # this line raises the KeyError
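But when I run it, I get the error KeyError: 'record_id'. I need to automate this because I have to apply the same kind of logic to several JSON files, some of which have 100 columns once flattened.
Thanks!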
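The KeyError most likely comes from `transaction_detail_1['record_id']` itself: the DataFrame is created empty, so it has no `record_id` column to look up, and the lookup fails before `append` is reached. Below is a minimal sketch of one way around it, assuming the parsed file is a list of dicts shaped like the two records above. The `records` list is a cut-down stand-in for `d`, not the real file.

```python
import pandas as pd

# Cut-down stand-in for the parsed file (assumption: a list with one dict per company).
records = [
    {"transactionDetail": {
        "transactionID": "rrt-0a75e3331e9d4a100-b-se-17175-7612138-13_1571",
        "transactionTimestamp": "2020-11-22T07:22:14.346Z",
        "inLanguage": "en-US", "productID": "aasmcu", "productVersion": "1"}},
    {"transactionDetail": {
        "transactionID": "rrt-04b146343b2275455-a-se-17594-7595335-2_1570",
        "transactionTimestamp": "2020-11-22T07:22:15.115Z",
        "inLanguage": "en-US", "productID": "aasmcu", "productVersion": "1"}},
]

# json_normalize accepts a list of dicts and returns one row per dict,
# so no explicit loop or per-row append is needed.
transaction_detail = pd.json_normalize([r["transactionDetail"] for r in records])

# rename returns a new DataFrame; reassigning avoids inplace=True.
transaction_detail = transaction_detail.rename(columns={"transactionID": "record_id"})

print(transaction_detail[["record_id", "transactionTimestamp"]])
```

If a loop is still preferred, collecting each `temp_df` in a plain Python list and calling `pd.concat` once at the end tends to be simpler than growing a DataFrame row by row.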
【Discussion】:
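For the files that end up with around 100 columns, it may be easier to flatten whole records in one call instead of one sub-dict at a time. The sketch below assumes the same list-of-dicts layout. The `records` list is a trimmed, illustrative copy of the sample data, and using `organization.duns` as the join key is an assumption, not something the question specifies.

```python
import pandas as pd

# Trimmed, illustrative copy of the two sample records (only a few fields kept).
records = [
    {"organization": {
        "duns": "006979900",
        "primaryName": "American Express Company",
        "primaryAddress": {"addressCountry": {"name": "United States", "isoAlpha2Code": "US"}},
        "corporateLinkage": {
            "familytreeRolesPlayed": [
                {"description": "Global Ultimate", "dnbCode": 12775},
                {"description": "Domestic Ultimate", "dnbCode": 12774},
            ],
            "hierarchyLevel": 1}}},
    {"organization": {
        "duns": "005070479",
        "primaryName": "Caterpillar Inc.",
        "primaryAddress": {"addressCountry": {"name": "United States", "isoAlpha2Code": "US"}},
        "corporateLinkage": {
            "familytreeRolesPlayed": [
                {"description": "Global Ultimate", "dnbCode": 12775},
                {"description": "Domestic Ultimate", "dnbCode": 12774},
            ],
            "hierarchyLevel": 1}}},
]

# Nested dicts become dotted column names, one row per company.
# Lists (familytreeRolesPlayed) are left as a single cell in this table.
flat = pd.json_normalize(records)

# Lists of dicts can be expanded into their own table, one row per list item,
# with the company's duns carried along as a key to join back on.
roles = pd.json_normalize(
    records,
    record_path=["organization", "corporateLinkage", "familytreeRolesPlayed"],
    meta=[["organization", "duns"]],
)

print(flat.columns.tolist())
print(roles)
```

Keeping the list-valued fields in a separate table avoids duplicating every company-level column once per role. The two tables can be merged on `organization.duns` when a single wide view is needed.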
Tags: python json python-3.x loops