Change substring in a column from a dict (Python Pandas): answers

【Question title】: Change substring in a column from a dict (Python Pandas)
【Posted】: 2019-12-11 15:50:06
【Question description】:

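I have 2 dataframes loaded from 2 Excel sheets (in the same file). I want to replace the name of each molecule in the first sheet with the "official ID" from the database stored in the second sheet.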

[Screenshot: first dataframe]

[Screenshot: second dataframe]

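import pandas as pd

# First sheet: the reactions
reactions = pd.read_excel("/Users/Python/reactions.xlsx")

# Second sheet ('METS'): the molecules
molecules = pd.read_excel("/Users/Python/reactions.xlsx", sheet_name='METS')

# Build the lookup dict from the second sheet
d = molecules.set_index('MOLID')['MOLNAME'].to_dict()

# Does not work: the molecule names in EQUATION stay unchanged
reactions['EQUATION'] = reactions['EQUATION'].str.replace(r'\d+', '', regex=True).replace(d)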

The old/new molecule names are in a dictionary, which I also built from the second sheet:

d

It looks like this:

{....'glucose[c]': 'glc_D', 'glucose[s]': 'glc_D', 'glucose[x]': 'glc_D', ....}

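In the first dataframe, the column whose molecule names I want to change is called EQUATION, and its values look like: "ATP[c] + glucose[c] => ADP[c] + glucose 6 phosphate[c]". I tried the code above to make the change. It raises no error, but the molecules in my dataframe are not changed.

Thanks for your time.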

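【Question discussion】:

This should work: reactions['EQUATION'].str.replace('\d+','').map(d)
Hi John, it would be better if you showed us something similar to your data: minimal reproducible example
I also tried .map, but it does not work: every row of the column becomes NaN.
Hi @Datanovice, I have just uploaded two screenshots of the dataframes.
Could you show us your expected input and output as text? We need to copy your data into our IDE to test what works and what does not.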
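
The behaviour described in the question and in these comments can be reproduced on a small made-up sample. The sketch below is only an illustration: the series `equations` and the ID `'atp'` are assumptions, not the asker's real data. It shows that both `Series.replace` with a dict and `Series.map` compare each whole cell value against the dict keys, so a key that is only a substring of the cell never matches.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's description.
d = {'glucose[c]': 'glc_D', 'ATP[c]': 'atp'}
equations = pd.Series(['ATP[c] + glucose[c] => ADP[c]', 'glucose[c]'])

# Series.replace with a dict matches whole cell values against the keys,
# so only the second row (which is exactly 'glucose[c]') is changed.
replaced = equations.replace(d)

# Series.map looks up the whole cell value as a key; a miss becomes NaN.
mapped = equations.map(d)

print(replaced.tolist())
print(mapped.tolist())
```

This would explain why `.replace(d)` leaves the equations untouched and why `.map(d)` turns them into NaN.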

Tags: python excel pandas

【Solution 1】:

How to replace multiple substrings in a Pandas series using a dictionary?

I think you can do it if you adapt your code to something like this:

reactions['EQUATION'] = reactions['EQUATION'].apply(lambda x: ' '.join(d.get(i, i) for i in x.split()))
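
As a rough check of this approach, here is a minimal sketch on made-up data. The dataframe contents and the IDs `'atp'`, `'adp'` and `'g6p'` are assumptions for illustration only. It also shows one limitation: splitting on whitespace cannot match a molecule name that itself contains spaces. A possible alternative, which is not part of the original answer, is a single regular expression built from the escaped dict keys, with longer keys tried first.

```python
import re
import pandas as pd

# Hypothetical lookup dict and data; the IDs are invented for this sketch.
d = {
    'glucose[c]': 'glc_D',
    'glucose 6 phosphate[c]': 'g6p',
    'ATP[c]': 'atp',
    'ADP[c]': 'adp',
}
reactions = pd.DataFrame(
    {'EQUATION': ['ATP[c] + glucose[c] => ADP[c] + glucose 6 phosphate[c]']}
)

# Token approach from the answer: split on whitespace, look up each token,
# and keep the token unchanged when it is not a key of d.
by_token = reactions['EQUATION'].apply(
    lambda x: ' '.join(d.get(tok, tok) for tok in x.split())
)

# Regex alternative (an assumption, not the answer's method): escape every key
# because '[' and ']' are regex metacharacters, and try longer keys first.
keys = sorted(d, key=len, reverse=True)
pattern = re.compile('|'.join(re.escape(k) for k in keys))
by_regex = reactions['EQUATION'].str.replace(
    pattern, lambda m: d[m.group(0)], regex=True
)

print(by_token.iloc[0])
print(by_regex.iloc[0])
```

The token version leaves "glucose 6 phosphate[c]" untouched because no single token equals that key, while the regex version replaces it. Note that the regex version can also match a key that appears inside a longer name that is not in the dict, so it is worth checking the results against the real data.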

【Discussion】: