Combining the respective columns from 2 separate DataFrames using pandas

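【Question title】: Combining the respective columns from 2 separate DataFrames using pandas
【Posted】: 2020-08-07 19:57:06
【Question description】:

I have 2 large DataFrames with the same set of columns but different values. I need to combine the values in the respective columns (A and B here, possibly more in the actual data) into a single value in the same column (see the required output below). I have a fast way of doing this with np.vectorize and df.to_numpy(), but I am looking for a way to do it strictly with pandas. The criteria here are first readability of code, then time complexity.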

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A':[1,2,3,4,5], 'B':[5,4,3,2,1]})
print(df1)

and

df2 = pd.DataFrame({'A':[10,20,30,40,50], 'B':[50,40,30,20,10]})
print(df2)

I have a very fast approach:

#This function might change into something more complex
def conc(a,b):
    return str(a)+'_'+str(b)

conc_v = np.vectorize(conc)

required = pd.DataFrame(conc_v(df1.to_numpy(), df2.to_numpy()), columns=df1.columns)
print(required)

#Required Output
      A     B
0  1_10  5_50
1  2_20  4_40
2  3_30  3_30
3  4_40  2_20
4  5_50  1_10

I am looking for an alternative way to solve this (strictly pandas).

【Question discussion】:

You have a fast solution, but you are looking for a more readable one?
With pandas, something more readable might be df1.astype(str) + '_' + df2.astype(str)?
@LarsSkaug Exactly.
@Ben.T Yes, that's great! Thanks!
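
The expression suggested in the comments can be sketched as a complete script. This is a minimal illustration using the sample frames from the question; the variable name `combined` is chosen here for the example and is not part of the original post. It relies on both frames sharing the same index and column labels, since pandas aligns on labels before adding.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [5, 4, 3, 2, 1]})
df2 = pd.DataFrame({'A': [10, 20, 30, 40, 50], 'B': [50, 40, 30, 20, 10]})

# Convert every cell to a string, then concatenate element-wise.
# The scalar '_' is broadcast to every cell of the frame.
combined = df1.astype(str) + '_' + df2.astype(str)
print(combined)
```

If the two frames had different labels, the unmatched cells would come out as missing values rather than raising an error, so it may be worth checking the labels first.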

Tags: python-3.x pandas numpy dataframe

【Solution 1】:

"The criteria here are first readability of code"

Another simple way is to use add and radd:

df1.astype(str).add(df2.astype(str).radd('-'))

     A     B
0  1-10  5-50
1  2-20  4-40
2  3-30  3-30
3  4-40  2-20
4  5-50  1-10
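
Note that the answer above joins the values with '-', while the required output in the question uses '_'. The sketch below breaks the one-liner into two steps with the underscore separator. The names `sep`, `prefixed`, and `result` are introduced here purely for illustration. `radd` is the reversed addition, so `frame.radd(sep)` computes `sep + frame`, which puts the separator in front of each value.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [5, 4, 3, 2, 1]})
df2 = pd.DataFrame({'A': [10, 20, 30, 40, 50], 'B': [50, 40, 30, 20, 10]})

sep = '_'  # separator from the question's required output

# Step 1: reversed add prepends the separator, e.g. '10' becomes '_10'.
prefixed = df2.astype(str).radd(sep)

# Step 2: element-wise add joins the left values with the prefixed right values.
result = df1.astype(str).add(prefixed)
print(result)
```

Splitting the chain this way makes it easier to see where a more involved combining rule could be substituted, as long as it can be expressed with column-wise pandas operations.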

【Discussion】: