【Question title】:Aggregating data from JSON in Python
【Posted】:2020-05-24 21:55:39
【Question description】:

I am trying to aggregate JSON data by the "user" key, summing the values in "rank" and appending the "link" values. Here is a sample of the JSON data:

    data = [
        {'node': {"user": 12345, "rank": 10, "text": 'random long string xxxx', "link": "www.link.com/useraaaa"}},
        {'node': {"user": 23456, "rank": 20, "text": 'random long string yyyy', "link": "www.link.com/usercccc"}},
        {'node': {"user": 23456, "rank": 5, "text": 'a very long string zzzz', "link": "www.link.com/userdddd"}},
        {'node': {"user": 12345, "rank": 20, "text": 'a very long string jjjj', "link": "www.link.com/userbbbb"}}
    ]

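I tried looping over the data to extract the "user" values into a list, then looping over the data again with if/else statements to check each user in the list and append the data I need to that user, but I don't think this is efficient. I am trying to get the following result; any suggestions?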

    agg_data = [
        {"user": 12345, "rank": 30, "text": ['random long string xxxx', 'a very long string jjjj'], "link": ["www.link.com/useraaaa", "www.link.com/userbbbb"]},
        {"user": 23456, "rank": 25, "text": ['random long string yyyy', 'a very long string zzzz'], "link": ["www.link.com/usercccc", "www.link.com/userdddd"]}
    ]

【Question comments】:

    Tags: python-3.x aggregation


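    【Solution 1】:

    This may not be the most efficient approach, but it solves the problem. It is a good fit for grouping by user with itertools.groupby: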

    from itertools import groupby
    
    data = [
        {'node': {"user": 12345, "rank": 10, "text": 'random long string xxxx', "link": "www.link.com/useraaaa"}},
        {'node': {"user": 23456, "rank": 20, "text": 'random long string yyyy', "link": "www.link.com/usercccc"}},
        {'node': {"user": 23456, "rank": 5, "text": 'a very long string zzzz', "link": "www.link.com/userdddd"}},
        {'node': {"user": 12345, "rank": 20, "text": 'a very long string jjjj', "link": "www.link.com/userbbbb"}}
    ]
    # Expected result, shown for reference only (not used below)
    agg_data = [
        {"user": 12345, "rank": 30, "text": ['random long string xxxx', 'a very long string jjjj'],
         "link": ["www.link.com/useraaaa", "www.link.com/userbbbb"]},
        {"user": 23456, "rank": 25, "text": ['random long string yyyy', 'a very long string zzzz'],
         "link": ["www.link.com/usercccc", "www.link.com/userdddd"]}
    ]
    
    
    def sum_up_keys(alist):
        """Sum integer fields and collect string fields across one user's records."""
        res = {}
        for obj in alist:
            # Skip 'user' instead of popping it, so the input data is not mutated
            for k, v in obj['node'].items():
                if k == 'user':
                    continue
                if isinstance(v, int):
                    res[k] = res.get(k, 0) + v
                elif isinstance(v, str):
                    res.setdefault(k, []).append(v)
        return res
    
    
    # groupby only merges adjacent items, so sort by the same key first
    keyfunc = lambda x: x['node']['user']
    group = groupby(sorted(data, key=keyfunc), keyfunc)
    
    objs = []
    for k, v in group:
        obj = {'user': k, **sum_up_keys(list(v))}
        objs.append(obj)
    
    print(objs)
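
Since the question asks about efficiency, here is a minimal sketch of a single-pass alternative that avoids the sort groupby needs. It is not part of the answer above: the function name `aggregate_by_user` is hypothetical, and it assumes every record has a 'node' dict with the "user", "rank", "text" and "link" keys shown in the question.

```python
data = [
    {'node': {"user": 12345, "rank": 10, "text": 'random long string xxxx', "link": "www.link.com/useraaaa"}},
    {'node': {"user": 23456, "rank": 20, "text": 'random long string yyyy', "link": "www.link.com/usercccc"}},
    {'node': {"user": 23456, "rank": 5, "text": 'a very long string zzzz', "link": "www.link.com/userdddd"}},
    {'node': {"user": 12345, "rank": 20, "text": 'a very long string jjjj', "link": "www.link.com/userbbbb"}},
]


def aggregate_by_user(records):
    """Hypothetical helper: sum 'rank' and collect 'text' and 'link' per user in one pass."""
    grouped = {}
    for record in records:
        node = record["node"]
        # Create the accumulator the first time a user is seen
        entry = grouped.setdefault(
            node["user"],
            {"user": node["user"], "rank": 0, "text": [], "link": []},
        )
        entry["rank"] += node["rank"]
        entry["text"].append(node["text"])
        entry["link"].append(node["link"])
    # Dicts preserve insertion order, so users appear in first-seen order
    return list(grouped.values())


agg_data = aggregate_by_user(data)
print(agg_data)
```

This visits each record once and does not modify `data`. Unlike the groupby version, the field names are hard-coded, so it would need adjusting if the records carried other keys.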
    
