【Question title】: How to merge two data frames with repeated values in pandas
【Posted】: 2015-12-03 02:58:26
【Question】:

I have two data frames in pandas:

  dilevery_time   dispatch_time  source_lat  source_long  Address   name
0 21:39:37.265    21:47:37.265   -73.955741    40.3422     Dmart    John
0 21:39:37.265    21:47:37.265   -73.955741    40.3422     Dmart    John

The other one is:

  chef_name   dish_name   dish_price   dish_quantity   ratings
0   xyz        Chicken      120            1             4
1   abc        Paneer       100            2             3 

I want to join these two data frames in pandas. I tried a join, but it is not allowed because the first data frame has repeated values.

So I did this instead:

pd.concat([df1, df2], join='inner', axis=1)

But this gives me the following output:

   dilevery_time  dispatch_time  source_long   Address  name  chef_name  
0  21:39:37.265   21:47:37.265    -73.955741    Dmart   John   xyz
0  21:39:37.265   21:47:37.265    -73.955741    Dmart   John   xyz

  dish_name   dish_price    dish_quantity    ratings
0  Chicken      120             1                4
0  Chicken      120             1                4

I want it in this format:

   dilevery_time  dispatch_time  source_long   Address  name  chef_name  
0  21:39:37.265   21:47:37.265    -73.955741    Dmart   John   xyz
0  21:39:37.265   21:47:37.265    -73.955741    Dmart   John   abc

  dish_name   dish_price    dish_quantity    ratings
0  Chicken      120             1                4
0  Paneer       100             2                3

How can I do this in pandas?

【Comments】:

    Tags: python pandas dataframe


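    【Solution 1】:

    This happens because the first data frame has the index label 0 twice. You can use the reset_index method to give it a unique index, and then you get your result: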

    In [9]: df
    Out[9]: 
      chef_name dish_name  dish_price  dish_quantity  ratings
    0       xyz   Chicken         120              1        4
    1       abc    Paneer         100              2        3
    
    In [10]: df1
    Out[10]: 
      chef_name dish_name  dish_price  dish_quantity  ratings
    0       xyz   Chicken         120              1        4
    1       abc    Paneer         100              2        3
    
    df1.reset_index(drop=True, inplace=True)
    
    In [11]: pd.concat([df1, df2], join='inner', axis=1)
    Out[11]: 
      chef_name dish_name  dish_price  dish_quantity  ratings dilevery_time  \
    0       xyz   Chicken         120              1        4  21:39:37.265   
    1       abc    Paneer         100              2        3  21:39:37.265   
    
      dispatch_time  source_lat  source_long Address  name  
    0  21:47:37.265  -73.955741      40.3422   Dmart  John  
    1  21:47:37.265  -73.955741      40.3422   Dmart  John  
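
    The same idea as a self-contained sketch. It rebuilds the two frames from the question by hand (the values are copied from the tables above, and the variable names `left`, `right`, and `result` are only illustrative), resets both indexes, and then concatenates column-wise. It assumes the rows are meant to be paired by position, which is what the desired output in the question suggests.

```python
import pandas as pd

# Delivery frame from the question: both rows carry the index label 0.
df1 = pd.DataFrame(
    {
        "dilevery_time": ["21:39:37.265", "21:39:37.265"],
        "dispatch_time": ["21:47:37.265", "21:47:37.265"],
        "source_lat": [-73.955741, -73.955741],
        "source_long": [40.3422, 40.3422],
        "Address": ["Dmart", "Dmart"],
        "name": ["John", "John"],
    },
    index=[0, 0],
)

# Dish frame from the question, with the default index 0, 1.
df2 = pd.DataFrame(
    {
        "chef_name": ["xyz", "abc"],
        "dish_name": ["Chicken", "Paneer"],
        "dish_price": [120, 100],
        "dish_quantity": [1, 2],
        "ratings": [4, 3],
    }
)

# Give both frames a fresh 0..n-1 index so rows are paired by position
# instead of by the duplicated label.
left = df1.reset_index(drop=True)
right = df2.reset_index(drop=True)

# Column-wise concatenation now aligns row 0 with row 0 and row 1 with row 1.
result = pd.concat([left, right], axis=1)
print(result[["name", "chef_name", "dish_name"]])
```

    Using `reset_index(drop=True)` without `inplace=True` leaves `df1` and `df2` untouched. The result is indexed 0 and 1 rather than 0 and 0; if the original labels matter, assigning `result.index = df1.index` afterwards should restore them.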
    

    【Comments】:
