【Question title】: Python > Pandas > Summing columns in different data frames which have the same column names and the same index values, but not the same index length
【Posted】: 2021-08-24 20:17:33
【Question description】:

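I have two data frames as shown below. I want to add df2 to df1 and overwrite df1 with the result. The column names match in both data frames and the index holds the same kind of values, but df2 is smaller and does not contain all of the rows (index values). What is the best way to do this? "Buckets" is the index on both data frames.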

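【Comments】:

Try joining the two data frames with df1 = pd.merge(df1, df2, on='Buckets', how='left'), since df1 has more rows (or try an outer join, depending on your data), then add the two EUR columns (probably EUR_x + EUR_y) into a separate column.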

Tags: python pandas

【Solution 1】:

There is no need to merge; we can rely on pandas' built-in data alignment on the index:

df1.set_index("Buckets")\
   .add(df2.set_index("Buckets"), fill_value=0)\
   .reset_index()

Output:

  Buckets    EUR
0     20Y  200.0
1     25Y  200.0
2     30Y  200.0
3     35Y  200.0
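
For a version that can be run end to end, here is a minimal sketch. The sample frames are an assumption: the question's original data did not survive as text, so the values are reconstructed from the EUR_x and EUR_y columns printed in Solution 2.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this thread.
df1 = pd.DataFrame({"Buckets": ["20Y", "25Y", "30Y", "35Y"],
                    "EUR": [100, 200, 200, 400]})
df2 = pd.DataFrame({"Buckets": ["20Y", "35Y"],
                    "EUR": [100, -200]})

# Align both frames on Buckets; fill_value=0 treats buckets that are
# missing from df2 as 0 rather than producing NaN.
df1 = (
    df1.set_index("Buckets")
       .add(df2.set_index("Buckets"), fill_value=0)
       .reset_index()
)
print(df1)
```

Assigning the result back to df1 is what overwrites it with the sum, as the question asks.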

Note: if Buckets is already the index, you can omit set_index and simply do df1.add(df2, fill_value=0).
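
To illustrate the index-based case, and why fill_value matters, here is a small sketch. The data is again assumed (reconstructed from the outputs in this thread), and the variable name plain is only for illustration.

```python
import pandas as pd

# Assumed sample data with Buckets already set as the index on both frames.
df1 = pd.DataFrame({"EUR": [100, 200, 200, 400]},
                   index=pd.Index(["20Y", "25Y", "30Y", "35Y"], name="Buckets"))
df2 = pd.DataFrame({"EUR": [100, -200]},
                   index=pd.Index(["20Y", "35Y"], name="Buckets"))

# A plain + aligns on the index too, but buckets missing from df2 become NaN.
plain = df1 + df2
print(plain)

# With fill_value=0 the missing buckets count as 0, so df1's value is kept.
df1 = df1.add(df2, fill_value=0)
print(df1)
```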

【Discussion】:

【Solution 2】:

Try this (use a left or an outer join, depending on your data):

df1 = pd.merge(df1, df2, on=['Buckets'], how='left').set_index(['Buckets']).sum(axis=1).reset_index()
# .set_index(['Buckets']) is optional for you, since Buckets is already the index (as you mentioned)
# Output. You may have to rename column 0 to EUR afterwards.
  Buckets      0
0     20Y  200.0
1     25Y  200.0
2     30Y  200.0
3     35Y  200.0
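
The rename mentioned in the comment can be folded into the same chain. A minimal sketch, assuming the sample data reconstructed from this thread's outputs: Series.reset_index accepts a name argument that labels the value column, so the result comes back as EUR instead of 0.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this thread.
df1 = pd.DataFrame({"Buckets": ["20Y", "25Y", "30Y", "35Y"],
                    "EUR": [100, 200, 200, 400]})
df2 = pd.DataFrame({"Buckets": ["20Y", "35Y"],
                    "EUR": [100, -200]})

df1 = (
    pd.merge(df1, df2, on="Buckets", how="left")  # yields EUR_x and EUR_y
      .set_index("Buckets")
      .sum(axis=1)                # row-wise sum; NaN is skipped by default
      .reset_index(name="EUR")    # name the summed column EUR instead of 0
)
print(df1)
```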

Or try this:

df1 = pd.merge(df1, df2, on=['Buckets'], how='left')
# You will have two EUR columns (since both df1 and df2 have one), suffixed _x and _y
df1['EUR_y'] = df1['EUR_y'].fillna(0)  # NaN would otherwise propagate into the sum
df1['EUR'] = df1['EUR_x'] + df1['EUR_y']
# Output
>>> df1
  Buckets  EUR_x  EUR_y    EUR
0     20Y    100  100.0  200.0
1     25Y    200    0.0  200.0
2     30Y    200    0.0  200.0
3     35Y    400 -200.0  200.0
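
If df1 should end up with only Buckets and EUR, as in the question, the helper columns can be dropped afterwards. A minimal sketch under the same assumed sample data; the intermediate name merged is hypothetical.

```python
import pandas as pd

# Assumed sample data, reconstructed from the EUR_x / EUR_y columns above.
df1 = pd.DataFrame({"Buckets": ["20Y", "25Y", "30Y", "35Y"],
                    "EUR": [100, 200, 200, 400]})
df2 = pd.DataFrame({"Buckets": ["20Y", "35Y"],
                    "EUR": [100, -200]})

merged = pd.merge(df1, df2, on="Buckets", how="left")
# Sum the two EUR columns, counting buckets missing from df2 as 0.
merged["EUR"] = merged["EUR_x"] + merged["EUR_y"].fillna(0)
# Drop the helper columns so df1 keeps its original two-column shape.
df1 = merged.drop(columns=["EUR_x", "EUR_y"])
print(df1)
```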

【Discussion】: