【Question title】:Python pandas: How to find the difference between two DataFrames based on a single column
【Posted】:2022-11-22 04:39:54
【Question description】:

I have two DataFrames:

import pandas as pd

df1 = pd.DataFrame({
    'Date':['2013-11-24','2013-11-24','2013-11-25','2013-11-25'],
    'Fruit':['Banana','Orange','Apple','Celery'],
    'Num':[22.1,8.6,7.6,10.2],
    'Color':['Yellow','Orange','Green','Green'],
    })
print(df1)
         Date   Fruit   Num   Color
0  2013-11-24  Banana  22.1  Yellow
1  2013-11-24  Orange   8.6  Orange
2  2013-11-25   Apple   7.6   Green
3  2013-11-25  Celery  10.2   Green

df2 = pd.DataFrame({
    'Date':['2013-11-25','2013-11-25','2013-11-25','2013-11-25','2013-11-25','2013-11-25'],
    'Fruit':['Banana','Orange','Apple','Celery','X','Y'],
    'Num':[22.1,8.6,7.6,10.2,22.1,8.6],
    'Color':['Yellow','Orange','Green','Green','Red','Orange'],
    })
print(df2)
         Date   Fruit   Num   Color
0  2013-11-25  Banana  22.1  Yellow
1  2013-11-25  Orange   8.6  Orange
2  2013-11-25   Apple   7.6   Green
3  2013-11-25  Celery  10.2   Green
4  2013-11-25       X  22.1     Red
5  2013-11-25       Y   8.6  Orange

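I am trying to find the difference between these two DataFrames based on the Fruit column.

This is what I am doing now, but I am not getting the expected output: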

mapped_df = pd.concat([df1,df2],ignore_index=True).drop_duplicates(keep=False)
print(mapped_df)

Expected output:

         Date Fruit   Num   Color
8  2013-11-25     X  22.1     Red
9  2013-11-25     Y   8.6  Orange

Tags: python pandas dataframe


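【Solution 1】:

You can use a negated isin. It keeps the rows of df2 whose Fruit value does not appear anywhere in df1['Fruit']: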

output = df2.loc[~df2['Fruit'].isin(df1['Fruit'])]
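
For reference, here is a self-contained sketch of that approach using the data from the question. The attempt in the question returns extra rows because drop_duplicates compares every column by default, so rows such as Banana that differ only in Date are not treated as duplicates. The second part of the sketch is not from the answer: it is an alternative that restricts the question's own attempt to the Fruit column via subset, and it assumes Fruit is unique within each frame. The names only_in_df2 and sym_diff are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['2013-11-24', '2013-11-24', '2013-11-25', '2013-11-25'],
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery'],
    'Num': [22.1, 8.6, 7.6, 10.2],
    'Color': ['Yellow', 'Orange', 'Green', 'Green'],
})
df2 = pd.DataFrame({
    'Date': ['2013-11-25'] * 6,
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery', 'X', 'Y'],
    'Num': [22.1, 8.6, 7.6, 10.2, 22.1, 8.6],
    'Color': ['Yellow', 'Orange', 'Green', 'Green', 'Red', 'Orange'],
})

# Negated isin: rows of df2 whose Fruit never appears in df1.
# The result keeps df2's original index labels (4 and 5).
only_in_df2 = df2.loc[~df2['Fruit'].isin(df1['Fruit'])]
print(only_in_df2)

# Alternative (assumption: Fruit is unique within each frame).
# Compare on Fruit only, and drop every Fruit that occurs more than once.
# This is a symmetric difference, so a fruit found only in df1 would
# also be returned. The index labels come from the concatenated frame (8 and 9).
sym_diff = (
    pd.concat([df1, df2], ignore_index=True)
      .drop_duplicates(subset='Fruit', keep=False)
)
print(sym_diff)
```

The isin version is directional (only rows of df2 are returned) and tolerates repeated fruits inside either frame, which is why it is usually the safer choice. Append .reset_index(drop=True) if the original index labels are not needed.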
