【Posted】:2018-03-25 03:07:10
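【Question】:
I have two pandas DataFrames that I want to merge. They have different columns and overlapping indexes. I want to merge them while keeping the order of the index unchanged.
DataFrame (d1)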
                                Dec 16  Dec 15
Balance Sheet
NON-CURRENT LIABILITIES            NaN     NaN   <-- all-NaN row
Other Long Term Liabilities       8.37    9.30
Long Term Provisions             13.53   12.74   <-- not present in d2
Total Non-Current Liabilities    21.90   22.04
CURRENT LIABILITIES                NaN     NaN   <-- all-NaN row
Trade Payables                   32.49   24.26
DataFrame (d2)
                                Dec 11  Dec 10
Balance Sheet
NON-CURRENT LIABILITIES            NaN     NaN
Deferred Tax Liabilities [Net]    0.00    7.40   <-- not present in d1
Other Long Term Liabilities      14.13    0.00
Total Non-Current Liabilities    14.13    7.40
CURRENT LIABILITIES                NaN     NaN
Trade Payables                   77.35   60.40
I tried the following approaches to merge these DataFrames, but none of them produced the result I want.
d1.merge(d2, how='left', left_index=True, right_index=True)
d1.merge(d2, how='outer', left_index=True, right_index=True)
pd.merge_ordered(d1, d2, left_on=['Dec 16'], right_on=['Dec 11'])
pd.concat([d1.merge(d2, how='left', left_index=True, right_index=True), d1.merge(d2, how='right', left_index=True, right_index=True)]).drop_duplicates(subset='Dec 16', keep='last')
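To make the attempts above reproducible, here is a minimal sketch that rebuilds d1 and d2 from the printed output and runs the first two merges. It assumes that "Balance Sheet" is the name of the index and that the values are plain floats; neither is confirmed beyond what the printout shows.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Rebuilt from the printed frames; the index name "Balance Sheet" is an assumption.
d1 = pd.DataFrame(
    {"Dec 16": [nan, 8.37, 13.53, 21.90, nan, 32.49],
     "Dec 15": [nan, 9.30, 12.74, 22.04, nan, 24.26]},
    index=pd.Index(
        ["NON-CURRENT LIABILITIES", "Other Long Term Liabilities",
         "Long Term Provisions", "Total Non-Current Liabilities",
         "CURRENT LIABILITIES", "Trade Payables"],
        name="Balance Sheet"),
)
d2 = pd.DataFrame(
    {"Dec 11": [nan, 0.00, 14.13, 14.13, nan, 77.35],
     "Dec 10": [nan, 7.40, 0.00, 7.40, nan, 60.40]},
    index=pd.Index(
        ["NON-CURRENT LIABILITIES", "Deferred Tax Liabilities [Net]",
         "Other Long Term Liabilities", "Total Non-Current Liabilities",
         "CURRENT LIABILITIES", "Trade Payables"],
        name="Balance Sheet"),
)

left = d1.merge(d2, how="left", left_index=True, right_index=True)
outer = d1.merge(d2, how="outer", left_index=True, right_index=True)

# The left join keeps d1's row order but has no row for labels found only in d2.
print("Deferred Tax Liabilities [Net]" in left.index)

# The outer join contains every label, but the rows are not in d1's order.
print(list(outer.index))
```

The left join loses the d2-only row, and the outer join keeps all rows but reorders them, so neither matches the desired layout on its own.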
I would like the resulting DataFrame to look like this:
                                Dec 16  Dec 15  Dec 11  Dec 10
Balance Sheet
NON-CURRENT LIABILITIES            NaN     NaN     NaN     NaN
Deferred Tax Liabilities [Net]     NaN     NaN    0.00    7.40   <-- from d2
Other Long Term Liabilities       8.37    9.30   14.13    0.00   <-- d1 + d2 merged
Long Term Provisions             13.53   12.74     NaN     NaN   <-- from d1
Total Non-Current Liabilities    21.90   22.04   14.13    7.40   <-- d1 + d2 merged
CURRENT LIABILITIES                NaN     NaN     NaN     NaN
Trade Payables                   32.49   24.26   77.35   60.40
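Note that the overall order matters (for example, the all-NaN rows must appear in the same order as in the inputs), but the order of the merged index labels between two all-NaN rows does not. The columns of d1 should also come before the columns of d2.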
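The ordering requirement can be made concrete with a sketch like the one below. It is only one possible approach, not a confirmed solution: it treats every all-NaN row as the start of a section, outer-joins on the index, and then reindexes section by section. The helper names `section_ids` and `merge_by_section` are hypothetical. The sketch assumes that index labels are unique, that both frames contain the same all-NaN header rows in the same order, and that "Balance Sheet" is the index name.

```python
import numpy as np
import pandas as pd

nan = np.nan
name = "Balance Sheet"  # assumed index name

d1 = pd.DataFrame(
    [[nan, nan], [8.37, 9.30], [13.53, 12.74],
     [21.90, 22.04], [nan, nan], [32.49, 24.26]],
    columns=["Dec 16", "Dec 15"],
    index=pd.Index(
        ["NON-CURRENT LIABILITIES", "Other Long Term Liabilities",
         "Long Term Provisions", "Total Non-Current Liabilities",
         "CURRENT LIABILITIES", "Trade Payables"], name=name),
)
d2 = pd.DataFrame(
    [[nan, nan], [0.00, 7.40], [14.13, 0.00],
     [14.13, 7.40], [nan, nan], [77.35, 60.40]],
    columns=["Dec 11", "Dec 10"],
    index=pd.Index(
        ["NON-CURRENT LIABILITIES", "Deferred Tax Liabilities [Net]",
         "Other Long Term Liabilities", "Total Non-Current Liabilities",
         "CURRENT LIABILITIES", "Trade Payables"], name=name),
)


def section_ids(df):
    """Number the sections: every all-NaN row starts a new one."""
    return df.isna().all(axis=1).cumsum()


def merge_by_section(left, right):
    """Hypothetical helper: outer-join on the index, then restore section order."""
    s_left, s_right = section_ids(left), section_ids(right)
    order = []
    for sec in sorted(set(s_left) | set(s_right)):
        left_labels = list(s_left[s_left == sec].index)
        # Labels that exist only in `right` go after left's labels of that section.
        extra = [lab for lab in s_right[s_right == sec].index
                 if lab not in left_labels]
        order.extend(left_labels + extra)
    # join() puts left's columns first; reindex() applies the section-wise order.
    return left.join(right, how="outer").reindex(order)


merged = merge_by_section(d1, d2)
print(merged)
```

With this sketch the d2-only row "Deferred Tax Liabilities [Net]" lands at the end of its section rather than directly under the section header, which is acceptable here because the order inside a section is not important.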
【Discussion】:
Tags: python-3.x pandas join merge