Get column dynamically in loc based on condition: answers

【Question title】: Get column dynamically in loc based on condition
【Posted】: 2021-11-15 07:41:44
【Question description】:

I have a data frame:

df.head():
col_a col_b Month 1 Month 2 Month 3 Month 4 Month 5 Month 6
10      2
20      6
44      3
55      1 
86      4
67      5

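What I want to do: assign each value of col_a to the month column indicated by col_b. For example, the first value of col_a (10) should go into Month 2, because its col_b value is 2. Likewise, col_a = 67 should go into Month 5.

Expected output: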

col_a  col_b  Month 1  Month 2  Month 3  Month 4  Month 5  Month 6
   10      2                10
   20      6                                                    20
   44      3                         44
   55      1       55
   86      4                                  86
   67      5                                           67

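I can do this by iterating over each row, extracting the value from col_b, matching the appropriate month column with a regular expression, and then assigning the value. Since I have a large number of rows (3000+), this takes time. Can someone suggest a better way?

PS: the dtype is str, not int.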
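
The question does not include code to build its data, so the following is only a minimal sketch of a reproducible frame. It assumes the values are stored as strings (per the PS) and that the empty month cells are empty strings; the name `month_cols` is illustrative and not from the original post.

```python
import pandas as pd

# Hypothetical helper list of the six month column names.
month_cols = [f"Month {i}" for i in range(1, 7)]

# Sample values copied from the question, kept as strings (dtype is str per the PS).
df = pd.DataFrame({
    "col_a": ["10", "20", "44", "55", "86", "67"],
    "col_b": ["2", "6", "3", "1", "4", "5"],
})

# Assumption: the month columns start out blank (empty strings).
for col in month_cols:
    df[col] = ""

print(df.head())
```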

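【Question comments】:

Please provide a minimal reproducible example; for pandas, see here: stackoverflow.com/questions/20109391/…
This is an unusual data manipulation; col_b seems to provide a more sensible data format. Why do you want to do this?

Tags: python python-3.x pandas dataframe pandas-loc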

【Solution 1】:

Try pivot, then update:

df.update(df[['col_a','col_b']].pivot(columns='col_b',values='col_a').add_prefix('Month '))

Output:

df
Out[234]: 
   col_a  col_b  Month 1  Month 2  Month 3  Month 4  Month 5  Month 6
0     10      2     NaN    10.0     NaN     NaN     NaN     NaN
1     20      6     NaN     NaN     NaN     NaN     NaN    20.0
2     44      3     NaN     NaN    44.0     NaN     NaN     NaN
3     55      1    55.0     NaN     NaN     NaN     NaN     NaN
4     86      4     NaN     NaN     NaN    86.0     NaN     NaN
5     67      5     NaN     NaN     NaN     NaN    67.0     NaN
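
For a self-contained run of the pivot-then-update idea, the sketch below rebuilds the sample frame with integer values (to match the output shown above) and NaN-filled month columns. The names `month_cols` and `wide` are illustrative and not part of the answer.

```python
import pandas as pd

# Hypothetical helper list of the six month column names.
month_cols = [f"Month {i}" for i in range(1, 7)]

# Sample data from the question, as integers here to mirror the answer's output.
df = pd.DataFrame({
    "col_a": [10, 20, 44, 55, 86, 67],
    "col_b": [2, 6, 3, 1, 4, 5],
})
# Add the month columns; reindex fills the new columns with NaN.
df = df.reindex(columns=["col_a", "col_b"] + month_cols)

# Spread col_a into one column per distinct col_b value, keeping the row index,
# then rename the columns 1..6 to "Month 1".."Month 6".
wide = (
    df[["col_a", "col_b"]]
    .pivot(columns="col_b", values="col_a")
    .add_prefix("Month ")
)

# update aligns on index and column labels and copies only the non-NaN cells.
df.update(wide)
print(df)
```

Because pivot keeps the original row index and update aligns on both index and column labels, each col_a value lands in exactly one month column without an explicit loop over rows.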

【Discussion】: