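[Posted]: 2022-01-23 21:18:32
[Question]:
I have a JSON file containing nested objects, which I flatten into a pandas DataFrame. One column still contains nested JSON objects, and I am finding it hard to flatten.
I have tried many approaches; this is the one that got me furthest.
Any help is much appreciated. Thank you.
Unfortunately, I could not find a jsfiddle-like tool for Python to provide a working example.
I know that the meta parameter of json_normalize can add columns to my DataFrame. However, that approach does not work for the unflattened column, because json_normalize only works in my setup when record_path is set to 'markets', which is the main JSON object in my file. So in this setup I cannot point record_path at 'marketStats' and add the related columns through the meta parameter.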
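One possibility, sketched here under assumptions rather than verified against the full file: json_normalize also accepts the list stored under "markets" directly, in which case record_path can point at 'marketStats' and meta can carry parent fields such as id and symbol along. The `sample` dict below is a trimmed, hypothetical stand-in for the real file, and `stats_long` is just an illustrative name.

```python
import pandas as pd

# Trimmed stand-in for the real file (assumption: same shape as the data shown in this post)
sample = {
    "markets": [
        {
            "id": 335,
            "symbol": "ETHBTC",
            "marketStats": [
                {"algorithm": "original", "ratio": "50.0", "medianDrop": "-4.08"},
                {"algorithm": "hodloo", "ratio": "50.0", "medianDrop": "-3.29"},
            ],
        },
        {
            "id": 337,
            "symbol": "LTCBTC",
            "marketStats": [
                {"algorithm": "original", "ratio": "57.14", "medianDrop": "-3.28"},
                {"algorithm": "hodloo", "ratio": "75.0", "medianDrop": "-3.7"},
            ],
        },
    ]
}

# Pass the list of markets itself, so 'marketStats' becomes the record path
# and the parent fields are attached to every stats row through meta.
stats_long = pd.json_normalize(
    sample["markets"],
    record_path="marketStats",
    meta=["id", "symbol"],
)
print(stats_long)
```

This yields one row per (market, algorithm) pair, i.e. a long table rather than extra columns on the original frame.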
Goal
The goal is to turn one or all of the JSON objects inside marketStats into columns of the DataFrame.
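To make the goal concrete, here is a minimal sketch of one way such columns could be produced: normalize marketStats into a long table, pivot it so each algorithm/field pair becomes its own column, then join that back onto the per-market rows. The `sample` dict is a trimmed, hypothetical stand-in for the real file, and the `algorithm_field` column naming is my own choice. It assumes each algorithm appears at most once per market, otherwise pivot raises an error.

```python
import pandas as pd

# Trimmed stand-in for the real file (assumption: same shape as the data shown in this post)
sample = {
    "markets": [
        {
            "id": 335,
            "symbol": "ETHBTC",
            "marketStats": [
                {"algorithm": "original", "ratio": "50.0", "crackedCount": 2},
                {"algorithm": "hodloo", "ratio": "50.0", "crackedCount": 4},
            ],
        },
        {
            "id": 337,
            "symbol": "LTCBTC",
            "marketStats": [
                {"algorithm": "original", "ratio": "57.14", "crackedCount": 7},
                {"algorithm": "hodloo", "ratio": "75.0", "crackedCount": 4},
            ],
        },
    ]
}

# One row per market, without the nested list column
markets = pd.json_normalize(sample["markets"]).drop(columns="marketStats")

# One row per (market, algorithm); meta columns come back as object dtype,
# so cast id to match the integer id column in `markets`
stats = pd.json_normalize(sample["markets"], record_path="marketStats", meta=["id"])
stats["id"] = stats["id"].astype("int64")

# Pivot to wide form: the columns become a (field, algorithm) MultiIndex
wide = stats.pivot(index="id", columns="algorithm")

# Flatten the MultiIndex into names like "original_ratio"
wide.columns = [f"{algo}_{field}" for field, algo in wide.columns]

# Attach the wide stats to the per-market rows
result = markets.join(wide, on="id")
print(result)
```

Note that values such as ratio are strings in the source JSON, so they stay strings here; pd.to_numeric could convert them afterwards if numbers are needed.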
Code
import json
import pandas as pd

with open('Data/20012022.json') as file:
    data = json.load(file)

# Flatten the top-level "markets" records into one row per market
df0 = pd.json_normalize(
    data,
    record_path=['markets']
)
df0.head(3)
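If the frame has already been built and only the marketStats column is left holding lists of dicts, another route that might work is to explode that column and normalize the resulting dicts. This is only a sketch: the small `df0` below is a hypothetical stand-in for the real one, and it assumes every row holds a non-empty list (an empty list would become NaN after explode and would need handling first).

```python
import pandas as pd

# Hypothetical stand-in for the already-flattened frame,
# where marketStats still holds a list of dicts per row
df0 = pd.DataFrame(
    {
        "id": [335, 337],
        "symbol": ["ETHBTC", "LTCBTC"],
        "marketStats": [
            [
                {"algorithm": "original", "ratio": "50.0"},
                {"algorithm": "hodloo", "ratio": "50.0"},
            ],
            [
                {"algorithm": "original", "ratio": "57.14"},
                {"algorithm": "hodloo", "ratio": "75.0"},
            ],
        ],
    }
)

# One row per list element; ignore_index gives a fresh 0..n-1 index
exploded = df0.explode("marketStats", ignore_index=True)

# Each cell is now a single dict, so it can be normalized into columns
stats = pd.json_normalize(exploded["marketStats"].tolist())

# Both frames share the same 0..n-1 index, so a column-wise concat lines up
flat = pd.concat([exploded.drop(columns="marketStats"), stats], axis=1)
print(flat)
```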
Screenshot
This is a screenshot of the current table; the marketStats column contains nested JSON.
Data
Here is a snippet from the JSON file.
{
"markets": [
{
"id": 335,
"baseCurrency": "eth",
"quoteCurrency": "btc",
"exchangeName": "Binance",
"exchangeCode": "BINA",
"longName": "BTC-ETH",
"marketName": "btc-eth",
"symbol": "ETHBTC",
"volume": "40624.5823",
"quoteVolume": "3026.13646935",
"btcVolume": "3026.13646935",
"usdVolume": "127009429.050524367",
"currentPrice": 0.074681,
"latestBase": {
"id": 161774475,
"time": 1639576800,
"date": "2021-12-15T14:00:00.000+00:00",
"price": "0.077653",
"lowestPrice": "0.0729",
"bounce": "6.283",
"currentDrop": "-3.8272829124438206",
"crackedAt": "2022-01-07T03:00:00.000Z",
"respectedAt": "2022-01-15T15:00:00.000Z",
"isLowest": false
},
"marketStats": [
{
"algorithm": "original",
"ratio": "50.0",
"medianDrop": "-4.08",
"medianBounce": "5.51",
"hoursToRespected": 106,
"crackedCount": 2,
"respectedCount": 1
},
{
"algorithm": "day_trade",
"ratio": "100.0",
"medianDrop": "-6.12",
"medianBounce": "6.28",
"hoursToRespected": 204,
"crackedCount": 1,
"respectedCount": 1
},
{
"algorithm": "conservative",
"ratio": "100.0",
"medianDrop": "-6.12",
"medianBounce": "8.38",
"hoursToRespected": 204,
"crackedCount": 1,
"respectedCount": 1
},
{
"algorithm": "position",
"ratio": "50.0",
"medianDrop": "-6.12",
"medianBounce": "6.19",
"hoursToRespected": 204,
"crackedCount": 2,
"respectedCount": 1
},
{
"algorithm": "hodloo",
"ratio": "50.0",
"medianDrop": "-3.29",
"medianBounce": "0.0",
"hoursToRespected": 225,
"crackedCount": 4,
"respectedCount": 2
}
]
},
{
"id": 337,
"baseCurrency": "ltc",
"quoteCurrency": "btc",
"exchangeName": "Binance",
"exchangeCode": "BINA",
"longName": "BTC-LTC",
"marketName": "btc-ltc",
"symbol": "LTCBTC",
"volume": "68309.637",
"quoteVolume": "223.79294524",
"btcVolume": "223.79294524",
"usdVolume": "9392773.4219378968",
"currentPrice": 0.003275,
"latestBase": {
"id": 163982984,
"time": 1642374000,
"date": "2022-01-16T23:00:00.000+00:00",
"price": "0.003346",
"lowestPrice": "0.00322",
"bounce": "3.839",
"currentDrop": "-2.1219366407650926",
"crackedAt": "2022-01-18T23:00:00.000Z",
"respectedAt": null,
"isLowest": false
},
"marketStats": [
{
"algorithm": "original",
"ratio": "57.14",
"medianDrop": "-3.28",
"medianBounce": "3.84",
"hoursToRespected": 186,
"crackedCount": 7,
"respectedCount": 4
},
{
"algorithm": "day_trade",
"ratio": "0.0",
"medianDrop": "0.0",
"medianBounce": "5.68",
"hoursToRespected": 0,
"crackedCount": 1,
"respectedCount": 0
},
{
"algorithm": "conservative",
"ratio": "0.0",
"medianDrop": "0.0",
"medianBounce": "5.68",
"hoursToRespected": 0,
"crackedCount": 1,
"respectedCount": 0
},
{
"algorithm": "position",
"ratio": "0.0",
"medianDrop": "0.0",
"medianBounce": "8.16",
"hoursToRespected": 0,
"crackedCount": 1,
"respectedCount": 0
},
{
"algorithm": "hodloo",
"ratio": "75.0",
"medianDrop": "-3.7",
"medianBounce": "0.0",
"hoursToRespected": 35,
"crackedCount": 4,
"respectedCount": 3
}
]
},
{
"id": 339,
"baseCurrency": "bnb",
"quoteCurrency": "btc",
"exchangeName": "Binance",
"exchangeCode": "BINA",
"longName": "BTC-BNB",
"marketName": "btc-bnb",
"symbol": "BNBBTC",
"volume": "154576.177",
"quoteVolume": "1724.66664804",
"btcVolume": "1724.66664804",
"usdVolume": "72385673.4448901928",
"currentPrice": 0.01099,
"latestBase": {
"id": 163753765,
"time": 1642068000,
"date": "2022-01-13T10:00:00.000+00:00",
"price": "0.01093",
"lowestPrice": "0.01093",
"bounce": "3.102",
"currentDrop": "0.5489478499542543",
"crackedAt": null,
"respectedAt": null,
"isLowest": false
},
"marketStats": [
{
"algorithm": "original",
"ratio": "100.0",
"medianDrop": "-7.18",
"medianBounce": "4.34",
"hoursToRespected": 62,
"crackedCount": 2,
"respectedCount": 2
},
{
"algorithm": "day_trade",
"ratio": "100.0",
"medianDrop": "-6.19",
"medianBounce": "4.3",
"hoursToRespected": 63,
"crackedCount": 1,
"respectedCount": 1
},
{
"algorithm": "conservative",
"ratio": "66.67",
"medianDrop": "-3.15",
"medianBounce": "4.05",
"hoursToRespected": 62,
"crackedCount": 3,
"respectedCount": 2
},
{
"algorithm": "position",
"ratio": "100.0",
"medianDrop": "-3.15",
"medianBounce": "4.46",
"hoursToRespected": 60,
"crackedCount": 2,
"respectedCount": 2
},
{
"algorithm": "hodloo",
"ratio": "100.0",
"medianDrop": "-7.46",
"medianBounce": "0.0",
"hoursToRespected": 62,
"crackedCount": 5,
"respectedCount": 5
}
]
}
]
}
[Discussion]:
Tags: python json pandas dictionary json-normalize