【Question title】: Joining two DataFrames based on index and column
【Posted】: 2017-05-17 13:47:39
【Question】:

I have the following two DataFrames:

df1:

            Id
date          
2014-03-13   1
2014-03-14   2
2014-03-15   1

df2:

            Id people  value
date                        
2014-03-13   1      A   -3.0
2014-03-13   1      B   -6.0
2014-03-13   4      C   -3.2
2014-03-14   1      A   -3.1
2014-03-14   2      B   -5.0
2014-03-14   2      C   -3.4
2014-03-14   7      D   -6.2
2014-03-14   8      E   -3.2
2014-03-15   1      A   -3.2
2014-03-15   3      B   -5.9

What I want to do is merge these two DataFrames on Id, with the rows also matched on the index (date).

The expected result is:

            Id people  value
date                        
2014-03-13   1      A   -3.0
2014-03-13   1      B   -6.0
2014-03-14   2      B   -5.0
2014-03-14   2      C   -3.4
2014-03-15   1      A   -3.2

I have tried hard with merge and join, but without success.

The code to generate the input is:

import pandas as pd

dates = [pd.to_datetime('2014-03-13', format='%Y-%m-%d'), pd.to_datetime('2014-03-14', format='%Y-%m-%d'), pd.to_datetime('2014-03-15', format='%Y-%m-%d')]
Ids = [1,2,1]
df1 = pd.DataFrame({'Id': pd.Series(Ids, index=dates)})
df1.index.name = 'date'

dates = [pd.to_datetime('2014-03-13', format='%Y-%m-%d'), pd.to_datetime('2014-03-13', format='%Y-%m-%d'),
         pd.to_datetime('2014-03-13', format='%Y-%m-%d'), pd.to_datetime('2014-03-14', format='%Y-%m-%d'),
         pd.to_datetime('2014-03-14', format='%Y-%m-%d'),pd.to_datetime('2014-03-14', format='%Y-%m-%d'),
         pd.to_datetime('2014-03-14', format='%Y-%m-%d'), pd.to_datetime('2014-03-14', format='%Y-%m-%d'), 
         pd.to_datetime('2014-03-15', format='%Y-%m-%d'), pd.to_datetime('2014-03-15', format='%Y-%m-%d')]
Ids = [1,1,4,1,2,2,7,8,1,3]
peoples = ['A','B','C','A','B','C','D','E','A','B']
values = [-3,-6,-3.2,-3.1,-5,-3.4,-6.2,-3.2,-3.2,-5.9]
df2 = pd.DataFrame({'Id': pd.Series(Ids, index=dates),
                    'people': pd.Series(peoples, index=dates),
                    'value': pd.Series(values, index=dates)})
df2.index.name = 'date'

【Question comments】:

    Tags: python join dataframe merge


    【Solution 1】:

    The simplest approach is merge + reset_index:

    df = pd.merge(df1.reset_index(), df2.reset_index(), on=['date','Id']).set_index('date')
    print(df)
                Id people  value
    date                        
    2014-03-13   1      A   -3.0
    2014-03-13   1      B   -6.0
    2014-03-14   2      B   -5.0
    2014-03-14   2      C   -3.4
    2014-03-15   1      A   -3.2
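
For convenience, here is a self-contained sketch of the same merge + reset_index idea. It rebuilds the question's data in a more compact way (the compact construction and the name `merged` are my own choices, not part of the original answer) and relies on the default inner merge, so only rows whose (date, Id) pair appears in both frames survive.

```python
import pandas as pd

# Same data as in the question, built more compactly (assumed equivalent).
df1 = pd.DataFrame(
    {"Id": [1, 2, 1]},
    index=pd.DatetimeIndex(["2014-03-13", "2014-03-14", "2014-03-15"], name="date"),
)
df2 = pd.DataFrame(
    {
        "Id": [1, 1, 4, 1, 2, 2, 7, 8, 1, 3],
        "people": list("ABCABCDEAB"),
        "value": [-3, -6, -3.2, -3.1, -5, -3.4, -6.2, -3.2, -3.2, -5.9],
    },
    index=pd.DatetimeIndex(
        ["2014-03-13"] * 3 + ["2014-03-14"] * 5 + ["2014-03-15"] * 2, name="date"
    ),
)

# Move 'date' from the index into a column so it can serve as a merge key,
# inner-merge on both keys, then put 'date' back as the index.
merged = (
    pd.merge(df1.reset_index(), df2.reset_index(), on=["date", "Id"])
    .set_index("date")
)

print(merged)
```

Because `how` is left at its default of `'inner'`, dates or Ids that appear in only one of the frames are dropped, which is what the expected result in the question asks for.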
    

    Alternatively, append Id to the index of both frames and merge on the index:

    df = pd.merge(df1.set_index('Id', append=True), 
                 df2.set_index('Id', append=True), 
                 left_index=True, 
                 right_index=True)
    print(df)
                  people  value
    date       Id              
    2014-03-13 1       A   -3.0
               1       B   -6.0
    2014-03-14 2       B   -5.0
               2       C   -3.4
    2014-03-15 1       A   -3.2
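
The index-based variant leaves Id inside a two-level index, whereas the expected result in the question has Id as an ordinary column. A minimal sketch of one way to get back to that shape follows; the extra `reset_index('Id')` step and the names `left`, `right`, `joined` and `flat` are my additions, not part of the original answer.

```python
import pandas as pd

# Same data as in the question, built more compactly (assumed equivalent).
df1 = pd.DataFrame(
    {"Id": [1, 2, 1]},
    index=pd.DatetimeIndex(["2014-03-13", "2014-03-14", "2014-03-15"], name="date"),
)
df2 = pd.DataFrame(
    {
        "Id": [1, 1, 4, 1, 2, 2, 7, 8, 1, 3],
        "people": list("ABCABCDEAB"),
        "value": [-3, -6, -3.2, -3.1, -5, -3.4, -6.2, -3.2, -3.2, -5.9],
    },
    index=pd.DatetimeIndex(
        ["2014-03-13"] * 3 + ["2014-03-14"] * 5 + ["2014-03-15"] * 2, name="date"
    ),
)

# Append 'Id' to the existing 'date' index on both sides.
# 'left' ends up with no columns at all, only the (date, Id) index.
left = df1.set_index("Id", append=True)
right = df2.set_index("Id", append=True)

# Inner merge on the full (date, Id) index.
joined = pd.merge(left, right, left_index=True, right_index=True)

# Move only the 'Id' level back into a column; 'date' stays as the index.
flat = joined.reset_index("Id")

print(flat)
```

Which variant to prefer is mostly a matter of taste: the index-based merge avoids round-tripping `date` through a column, at the cost of one extra step to flatten the result.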
    

    【Comments】:
