【Question Title】: Python: Making a dictionary with items of two columns
【Posted】: 2018-08-14 21:52:13
【Question】:

I have a dataset like this:

user_id           time_location
  13            (2018-02-02, 190)
  12            (2018-06-02, 194)
  13            (2018-06-02, 194)
  16            (2018-02-02, 190)
  17            (2018-02-02, 190)
  11            (2018-05-02, 198)
  19            (2018-02-02, 190)
  15            (2018-05-02, 198)
  15            (2018-06-02, 194)

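I want a list of dictionaries in which each key is an item from the "time_location" column and the value is the list of user_id values for that key. Here is a sample output:

List = [{(2018-02-02, 190): [13, 16, 17, 19]}, {(2018-06-02, 194): [12, 13, 15]}, {(2018-05-02, 198): [11, 15]}]

Can anyone help me?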

【Comments】:

    Tags: python list pandas dictionary group-by


    【Solution 1】:

    One approach is to use collections.defaultdict:

    from collections import defaultdict

    import pandas as pd
    
    df = pd.DataFrame({'user_id': [13, 12, 13, 16, 17, 11, 19, 15, 15],
                       'time_location': [('2018-02-02', 190), ('2018-06-02', 194),
                                         ('2018-06-02', 194), ('2018-02-02', 190),
                                         ('2018-02-02', 190), ('2018-05-02', 198),
                                         ('2018-02-02', 190), ('2018-05-02', 198),
                                         ('2018-06-02', 194)]})
    
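    # Map each time_location tuple to the list of user_ids seen with it
    d = defaultdict(list)
    
    for idx, row in df.iterrows():
        d[row['time_location']].append(row['user_id'])
    
    # Split into one single-key dict per time_location
    d = [{k: v} for k, v in d.items()]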
    
    # [{('2018-02-02', 190): [13, 16, 17, 19]},
    #  {('2018-06-02', 194): [12, 13, 15]},
    #  {('2018-05-02', 198): [11, 15]}]
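
The same grouping can be done without `iterrows()`, which builds a Series for every row. The following is a minimal sketch, not the answerer's code: it iterates the two columns in parallel with `zip`. The names `grouped` and `result` are chosen here for illustration only.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame({
    'user_id': [13, 12, 13, 16, 17, 11, 19, 15, 15],
    'time_location': [('2018-02-02', 190), ('2018-06-02', 194),
                      ('2018-06-02', 194), ('2018-02-02', 190),
                      ('2018-02-02', 190), ('2018-05-02', 198),
                      ('2018-02-02', 190), ('2018-05-02', 198),
                      ('2018-06-02', 194)],
})

grouped = defaultdict(list)

# Walk both columns in parallel instead of materialising each row.
# .tolist() converts the NumPy integers to plain Python ints.
for key, user_id in zip(df['time_location'], df['user_id'].tolist()):
    grouped[key].append(user_id)

# One single-key dict per distinct time_location, in first-seen order.
result = [{key: ids} for key, ids in grouped.items()]
print(result)
```

Because a dict keeps insertion order, the keys come out in the order they first appear in the column.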
    

    【Comments】:

      【Solution 2】:

      One approach is to use df.groupby():

      import pandas as pd

      df = pd.DataFrame({'user_id': [13, 12, 13, 16, 17, 11, 19, 15, 15],
                         'time_location': [('2018-02-02', 190), ('2018-06-02', 194),
                                           ('2018-06-02', 194), ('2018-02-02', 190),
                                           ('2018-02-02', 190), ('2018-05-02', 198),
                                           ('2018-02-02', 190), ('2018-05-02', 198),
                                           ('2018-06-02', 194)]})
      
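      # Collect the user_ids of each group into a list; groupby sorts the keys by default
      d = df.groupby('time_location')['user_id'].apply(list).to_dict()
      # Split into one single-key dict per time_location
      d = [{k: v} for k, v in d.items()]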
      
      # [{('2018-02-02', 190): [13, 16, 17, 19]},
      #  {('2018-05-02', 198): [11, 15]},
      #  {('2018-06-02', 194): [12, 13, 15]}]
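
The output above is ordered by key because `groupby` sorts group keys by default. If the order of first appearance matters, passing `sort=False` should preserve it. This is a small sketch under that assumption, with `grouped` and `result` as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'user_id': [13, 12, 13, 16, 17, 11, 19, 15, 15],
    'time_location': [('2018-02-02', 190), ('2018-06-02', 194),
                      ('2018-06-02', 194), ('2018-02-02', 190),
                      ('2018-02-02', 190), ('2018-05-02', 198),
                      ('2018-02-02', 190), ('2018-05-02', 198),
                      ('2018-06-02', 194)],
})

# sort=False keeps groups in the order their keys first occur in the column.
grouped = df.groupby('time_location', sort=False)['user_id'].apply(list)

# Series.items() yields (key, list_of_user_ids) pairs.
result = [{key: ids} for key, ids in grouped.items()]
print(result)
```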
      

      【Comments】:

        【Solution 3】:

        Try this:

        listOfDict = []
        for pair in your_dataset:  # assumes your data is a list of (user_id, time_location) lists/tuples
            listOfDict.append({pair[1]: pair[0]})
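
Note that this loop produces one dictionary per row, so a key that occurs several times ends up in several dictionaries. A minimal sketch of a grouped variant in plain Python follows. It assumes `your_dataset` is a list of `(user_id, time_location)` pairs; the sample rows are copied from the question, and `grouped` is an illustrative name.

```python
# Assumed shape: a list of (user_id, time_location) pairs, taken from the question.
your_dataset = [
    (13, ('2018-02-02', 190)),
    (12, ('2018-06-02', 194)),
    (13, ('2018-06-02', 194)),
    (16, ('2018-02-02', 190)),
    (17, ('2018-02-02', 190)),
    (11, ('2018-05-02', 198)),
    (19, ('2018-02-02', 190)),
    (15, ('2018-05-02', 198)),
    (15, ('2018-06-02', 194)),
]

grouped = {}
for user_id, time_location in your_dataset:
    # setdefault creates the list the first time a key is seen.
    grouped.setdefault(time_location, []).append(user_id)

# One single-key dict per distinct time_location.
listOfDict = [{key: ids} for key, ids in grouped.items()]
print(listOfDict)
```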
        

        【Comments】:

          【Solution 4】:

          You can also use groupby, which returns a single dictionary rather than a list of dictionaries:

          df.groupby("time_location")["user_id"].apply(list).to_dict()
          

          【Comments】:
