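【Posted】: 2017-03-25 13:27:16
【Question】:
We need to convert some large files stored in Azure Data Lake Store from nested JSON to CSV. Since Azure Data Lake Analytics supports the Python modules pandas and numpy in addition to the standard modules, I believe this should be quite feasible with Python. Does anyone have Python code that does this?
Source format: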
{"Loc":"TDM","Topic":"location","LocMac":"location/fe:7a:xx:xx:xx:xx","seq":"296083773","timestamp":1488986751,"op":"OP_UPDATE","topicSeq":"46478211","sourceId":"AFBWmHSe","location":{"staEthMac":{"addr":"/xxxxx"},"staLocationX":1643.8915,"staLocationY":571.04205,"errorLevel":1076,"associated":0,"campusId":"n5THo6IINuOSVZ/cTidNVA==","buildingId":"7hY/xx==","floorId":"xxxxxxxxxxx+BYoo0A==","hashedStaEthMac":"xxxx/pMVyK4Gu9qG6w=","locAlgorithm":"ALGORITHM_ESTIMATION","unit":"FEET"},"EventProcessedUtcTime":"2017-03-08T15:35:02.3847947Z","PartitionId":3,"EventEnqueuedUtcTime":"2017-03-08T15:35:03.7510000Z","IoTHub":{"MessageId":null,"CorrelationId":null,"ConnectionDeviceId":"xxxxx","ConnectionDeviceGenerationId":"636243184116591838","EnqueuedTime":"0001-01-01T00:00:00.0000000","StreamId":null}}
Expected output:
TDM,location,location/80:7a:bf:d4:d6:50,974851970,1490004475,OP_UPDATE,151002334,xxxxxxx,gHq/1NZQ,977.7259,638.8827,490,1,n5THo6IINuOSVZ/cTidNVA==,7hY/jVh9NRqqxF6gbqT7Jw==,LV/ZiQRQMS2wwKiKTvYNBQ==,H5rrAD/jg1Fnkmo1Zmquau/Qn1U=,ALGORITHM_ESTIMATION,FEET
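For reference, the flattening itself can be sketched with only the standard `json` and `csv` modules. This is a rough sketch, not a tested Data Lake job: it assumes each input line holds one complete JSON object (as in the sample above), and the names `flatten`, `json_lines_to_csv`, `drop_prefixes` and `write_header` are made up for illustration. The expected row appears to stop after the `location` fields, so the sketch lets the caller drop the trailing keys by prefix.

```python
import csv
import io
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict with dot-joined keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def json_lines_to_csv(lines, out, drop_prefixes=(), write_header=True):
    """Write one CSV row per JSON line; columns are taken from the first record.

    drop_prefixes is a hypothetical helper parameter: any flattened key that
    starts with one of these strings is left out of the output.
    """
    prefixes = tuple(drop_prefixes)
    writer = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = {k: v for k, v in flatten(json.loads(line)).items()
               if not k.startswith(prefixes)}
        if writer is None:
            # Column order follows the key order of the first record.
            writer = csv.DictWriter(out, fieldnames=list(row),
                                    extrasaction="ignore")
            if write_header:
                writer.writeheader()
        writer.writerow(row)


if __name__ == "__main__":
    sample = ('{"Loc":"TDM","location":{"staEthMac":{"addr":"/xxxxx"},'
              '"unit":"FEET"},"PartitionId":3,"IoTHub":{"MessageId":null}}')
    buf = io.StringIO()
    json_lines_to_csv([sample], buf, drop_prefixes=("PartitionId", "IoTHub"))
    print(buf.getvalue())
```

Because records are handled one line at a time, memory use stays flat for large files. Passing `write_header=False` gives header-less rows like the expected output above. How the script gets wired into Azure Data Lake Analytics is outside this sketch.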
【Comments】:
Tags: python json csv azure azure-data-lake