【Question title】: KeyError in Pandas
【Posted】: 2021-05-23 10:59:37
【Question】:

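I am trying to read nested JSON with Pandas' json_normalize method, using one of the nested fields as the record_path. I also passed errors = 'ignore' to ignore any errors caused by missing keys, but I still get a KeyError. Can you help me work out what I am doing wrong here?

Here is the JSON -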

{
    "_id" : "31aa9894-6a43-40f9-8911-116c14c42636",
    "message" : {
        "serviceOperationName" : "/logUserEvents/event",
        "accountNumber" : "1234",
        "userId" : null,
        "market" : null,
        "extract" : {
            "request" : {
                "USER_EVENT_LOGGING" : {
                    "payload" : [ 
                        {
                            "eventType" : "audibleSummaryUsage",
                            "ntid" : "abc",
                            "accountNumber" : "Not Found",
                            "workOrderNumber" : "",
                            "data" : [ 
                                {
                                    "name" : "userAction",
                                    "value" : "DISMISSED"
                                }, 
                                {
                                    "name" : "employeeTenure",
                                    "value" : "3.9"
                                }, 
                                {
                                    "name" : "ffc",
                                    "value" : "1234"
                                }, 
                                {
                                    "name" : "ntid",
                                    "value" : "abcd"
                                }, 
                                {
                                    "name" : "isAccountView",
                                    "value" : "true"
                                }, 
                                {
                                    "name" : "userAction",
                                    "value" : "DISMISSED"
                                }, 
                                {
                                    "name" : "title",
                                    "value" : "abcd"
                                }, 
                                {
                                    "name" : "jobType",
                                    "value" : ""
                                }, 
                                {
                                    "name" : "jobClassCd",
                                    "value" : ""
                                }
                            ]
                        }
                    ]
                }
            },
            "response" : {}
        },
        "@timestamp" : "2021-02-18T05:38:48.00269Z",
        "eventKeys" : [ 
            "USER_EVENT_LOGGING"
        ],
        "requestStartTimestampText" : "2021-02-18T05:38:48.268Z"
    },
    "createdOn" : ISODate("2021-02-18T05:38:48.269Z")
}

/* 2 */
{
    "_id" : "4189da82-299d-4a9e-8f10-ddb5da9b97b5",
    "message" : {
        "serviceOperationName" : "/logUserEvents/event",
        "accountNumber" : "7890",
        "userId" : null,
        "market" : null,
        "extract" : {
            "request" : {
                "USER_EVENT_LOGGING" : {
                    "payload" : [ 
                        {
                            "eventType" : "audibleSummaryUsage",
                            "ntid" : "defg",
                            "accountNumber" : "Not Found",
                            "workOrderNumber" : "",
                            "data" : [ 
                                {
                                    "name" : "userAction",
                                    "value" : "DISMISSED"
                                }, 
                                {
                                    "name" : "userAction",
                                    "value" : "DISMISSED"
                                }, 
                                {
                                    "name" : "employeeTenure",
                                    "value" : "3.9"
                                }, 
                                {
                                    "name" : "jobType",
                                    "value" : ""
                                }, 
                                {
                                    "name" : "jobClassCd",
                                    "value" : ""
                                }, 
                                {
                                    "name" : "ntid",
                                    "value" : "dfer"
                                }, 
                                {
                                    "name" : "ffc",
                                    "value" : "3456"
                                }, 
                                {
                                    "name" : "title",
                                    "value" : "erty"
                                }, 
                                {
                                    "name" : "isAccountView",
                                    "value" : "true"
                                }
                            ]
                        }
                    ]
                }
            },
            "response" : {}
        },
        "@timestamp" : "2021-02-18T05:39:11.00659Z",
        "eventKeys" : [ 
            "USER_EVENT_LOGGING"
        ],
        "requestStartTimestampText" : "2021-02-18T05:39:11.658Z"
    },
    "createdOn" : ISODate("2021-02-18T05:39:11.659Z")
}

Here is the code -

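import pandas as pd

# mongo_client is assumed to be an already-connected pymongo MongoClient
db = mongo_client.conciselogs
col = db.logs
cursor = col.find({"message.extract.request.USER_EVENT_LOGGING.payload.eventType":"audibleSummaryUsage"})
mongo_docs = list(cursor)
# record_path is passed as a single dotted string inside a list; this is the call that raises the KeyError
df = pd.json_normalize(mongo_docs, ['message.extract.request.USER_EVENT_LOGGING.payload.data'], errors = 'ignore')
df.to_csv('sample_data0220_3.csv', index=False)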

【Comments】:

    Tags: python json pandas mongodb


    【Solution 1】:

    Your record_path argument is incorrect. It should be a list with one key per nesting level, not a single dotted string:

    df = pd.json_normalize(
        mongo_docs,
        ['message', 'extract', 'request', 'USER_EVENT_LOGGING', 'payload', 'data'], # list, not 'key.key.key'
        errors='ignore',
    )
    
    df.to_csv('sample_data0220_3.csv', index=False)
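
To try this without a MongoDB connection, here is a minimal sketch that uses a trimmed-down, made-up stand-in for the documents (the `make_doc` helper, the `doc-1`/`doc-2` ids, and the sample values are hypothetical, not from the question). It shows the list-of-keys path working and the dotted path raising KeyError even with errors='ignore'. As far as the pandas documentation describes it, errors='ignore' only covers keys listed in meta, not the record_path itself.

```python
import pandas as pd


def make_doc(doc_id, data):
    """Build a document with the same nesting as the question's JSON (hypothetical helper)."""
    return {
        "_id": doc_id,
        "message": {
            "extract": {
                "request": {
                    "USER_EVENT_LOGGING": {
                        "payload": [{"eventType": "audibleSummaryUsage", "data": data}]
                    }
                }
            }
        },
    }


docs = [
    make_doc("doc-1", [{"name": "userAction", "value": "DISMISSED"},
                       {"name": "ffc", "value": "1234"}]),
    make_doc("doc-2", [{"name": "ntid", "value": "dfer"}]),
]

# One list element per nesting level, ending at the list of records.
path = ["message", "extract", "request", "USER_EVENT_LOGGING", "payload", "data"]
df = pd.json_normalize(docs, record_path=path)
print(df)

# The dotted form is treated as one literal key, which does not exist in the documents.
try:
    pd.json_normalize(docs, record_path=[".".join(path)], errors="ignore")
    dotted_path_failed = False
except KeyError:
    dotted_path_failed = True
print("dotted path raised KeyError:", dotted_path_failed)
```
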
    

    Output:

    name,value
    userAction,DISMISSED
    employeeTenure,3.9
    ffc,1234
    ntid,abcd
    isAccountView,true
    userAction,DISMISSED
    title,abcd
    jobType,
    jobClassCd,
    userAction,DISMISSED
    userAction,DISMISSED
    employeeTenure,3.9
    jobType,
    jobClassCd,
    ntid,dfer
    ffc,3456
    title,erty
    isAccountView,true
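
The output above no longer says which document each name/value pair came from. If that matters, one option is the meta parameter of json_normalize, which copies top-level fields onto every record row. The sketch below is an illustration on made-up data (the `make_doc` helper and the `doc-1`/`doc-2` ids are hypothetical), not part of the original answer.

```python
import pandas as pd


def make_doc(doc_id, data):
    """Build a document with the same nesting as the question's JSON (hypothetical helper)."""
    return {
        "_id": doc_id,
        "message": {
            "extract": {
                "request": {
                    "USER_EVENT_LOGGING": {
                        "payload": [{"eventType": "audibleSummaryUsage", "data": data}]
                    }
                }
            }
        },
    }


docs = [
    make_doc("doc-1", [{"name": "userAction", "value": "DISMISSED"},
                       {"name": "ffc", "value": "1234"}]),
    make_doc("doc-2", [{"name": "ntid", "value": "dfer"}]),
]

path = ["message", "extract", "request", "USER_EVENT_LOGGING", "payload", "data"]

# meta=["_id"] repeats each document's top-level _id on every row extracted from it.
df = pd.json_normalize(docs, record_path=path, meta=["_id"])
print(df)
```

With the real data, the same call on mongo_docs should give a third `_id` column next to name and value, assuming every document has an `_id` field.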
    

    【Discussion】:
