Pandas - conditionally concat two columns: answers

【Question title】: Pandas - conditionally concat two columns
【Posted】: 2019-05-13 20:15:51
【Question】:

Given a DataFrame

Patient ID     Instructions    ID Replaced
   1                N/A           ID123
   2                              ID124
   3                              ID125
   4                xyz           ID126
   5                xyz           ID127
   6                              ID128
   7                Replacement   ID129
   8                Replace       ID130
   9                replaced      ID131
   10               xyz           ID132

How can I create a new column that concatenates Instructions with ID Replaced whenever the substring replac is found in Instructions?

Patient ID  Instructions    ID Replaced     Comments
    1           N/A            ID123    
    2                          ID124    
    3                          ID125    
    4           xyz            ID126    
    5           xyz            ID127    
    6                          ID128    
    7           Replacement    ID129    Replacement | ID129
    8           Replace        ID130    Replace | ID130
    9           Replaced       ID131    Replaced | ID131
    10          xyz            ID132

I tried the following, but the Comments column comes out completely empty:

mani_df['Comments'] = ""
# if instructions contains 'replac' , concat with ID replaced 
if "replace" in df['Instructions']:
    df['Comments'] = df['Instructions'].str.cat(df['ID Replaced'], sep = " | ")

I also tried a boolean mask, but it returns False for the first two rows:

mask = mani_df['Special Handling Directions'].str.contains('replac')

    Out[55]: 
    0    False
    1    False
    2      NaN
    3      NaN

【Comments】:

if "replace" in df['Instructions']: "replace" is not a value in the df['Instructions'] Series, so the if clause is never executed. You need something like df['Instructions'].str.contains('replace'), which gives you a boolean mask.
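A minimal sketch of the point made in this comment, using a small made-up Series rather than the asker's real data. Note that `in` applied to a Series tests the index labels, not the values, which is why the `if` branch never runs. Missing values also explain the NaN entries in the mask shown in the question: `str.contains` propagates them unless `na=False` is passed.

```python
import pandas as pd

# Hypothetical sample data; the None stands in for a blank cell.
s = pd.Series(["N/A", None, "Replacement", "xyz"], name="Instructions")

# `in` on a Series checks the index labels (0, 1, 2, 3), not the values.
value_found = "Replacement" in s   # False: no such index label
label_found = 2 in s               # True: 2 is an index label

# str.contains tests every value; na=False maps missing entries to False
# instead of NaN, so the result is a clean boolean mask.
mask = s.str.contains("replac", case=False, na=False)
print(mask.tolist())
```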

Tags: python pandas concat

【Solution 1】:

You can use str.contains with case=False, then concatenate using pandas indexing:

mask = df.Instructions.str.contains('Replace', case=False).fillna(False)

df['Comments'] = df.loc[mask, 'Instructions'] + ' | ' + df['ID Replaced']

Of course, you can call fillna at the end to get empty strings (which is what your expected output looks like):

df.fillna('')

which yields

    Patient ID  Instructions    ID Replaced Comments
0   1                           ID123   
1   2                           ID124       
2   3                           ID125       
3   4           xyz             ID126   
4   5           xyz             ID127   
5   6                           ID128       
6   7           Replacement     ID129       Replacement | ID129
7   8           Replace         ID130       Replace | ID130
8   9           replaced        ID131       replaced | ID131
9   10          xyz             ID132
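For anyone who wants to reproduce the result, here is a self-contained sketch of the same approach. The DataFrame is rebuilt by hand from the question's table, and it assumes the blank and N/A cells are missing values (represented here as None), which is a guess about how the original data was loaded.

```python
import pandas as pd

# Rebuilt from the question's table; None marks the blank / N/A cells
# (an assumption about how the original data was parsed).
df = pd.DataFrame({
    "Patient ID": range(1, 11),
    "Instructions": [None, None, None, "xyz", "xyz", None,
                     "Replacement", "Replace", "replaced", "xyz"],
    "ID Replaced": [f"ID{n}" for n in range(123, 133)],
})

# Case-insensitive substring test; missing values count as "no match".
mask = df["Instructions"].str.contains("replac", case=False, na=False)

# Start with empty comments, then fill only the matching rows.
df["Comments"] = ""
df.loc[mask, "Comments"] = (
    df.loc[mask, "Instructions"] + " | " + df.loc[mask, "ID Replaced"]
)

print(df.fillna(""))
```

Initialising Comments to an empty string first means no trailing fillna is needed for that column; the final `fillna("")` only tidies the display of the missing Instructions values.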

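【Discussion】:

As a follow-up question, is there a simple way to turn this into an if/else? Specifically, if mask is False, just copy the value from Instructions into Comments for that row?
Never mind, I ended up writing the mask to a column, creating a copy of the Comments column, and comparing the two. Thanks, 阿甘!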
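One possible way to express the if/else variant asked about above is `Series.where`, which keeps a value where the condition is True and substitutes a fallback elsewhere. This is only a sketch on a small hypothetical frame, not what the asker ended up doing.

```python
import pandas as pd

# Hypothetical four-row sample; None stands in for a blank cell.
df = pd.DataFrame({
    "Instructions": ["xyz", None, "Replace", "replaced"],
    "ID Replaced": ["ID126", "ID128", "ID130", "ID131"],
})

mask = df["Instructions"].str.contains("replac", case=False, na=False)
combined = df["Instructions"] + " | " + df["ID Replaced"]

# "if" branch: keep the combined string where mask is True.
# "else" branch: fall back to the plain Instructions value.
df["Comments"] = combined.where(mask, df["Instructions"]).fillna("")

print(df["Comments"].tolist())
```

The trailing `fillna("")` covers rows where Instructions itself is missing, so Comments never contains NaN.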