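【Posted】:2019-03-31 06:28:19
【Question】:
I am trying to replace every name from a list wherever it appears in one column of a dataframe (column C):
List of names (a small sample; the full list is too large to show):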
Jack
Liam
John
Ethan
George
...
Small dataframe example:
A B C
French house Phone <phone_numbers>
English house email <adresse_mail>
French apartment my name is Liam
French house Hello George
English apartment Ethan, my phone is <phone_numbers>
My script:
import re
import pandas as pd
from pandas import Series

df = pd.read_excel('data_frame.xlsx')
data = Series.to_string(df['C'])

first_names = open('names_list.txt', 'r')
names_read = first_names.readlines()

def names(data):
    names_regex = re.compile(r'\b%s\b' % r'\b|\b'.join(map(re.escape, names_read)))
    replace_names = names_regex.sub('<name>', data)
    return replace_names

no_names = names(data)
print(no_names)
As output, I get my entire dataframe back without any modification...
I expected:
C
Phone <phone_numbers>
email <adresse_mail>
my name is <name>
Hello <name>
<name>, my phone is <phone_numbers>
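One thing that may explain the missing replacements: readlines() keeps the trailing newline on every entry, so each alternative in the pattern is a name followed by a line break, and a name followed by a comma or a space (as in "Ethan, my phone is ...") cannot match. Below is a minimal sketch of the same regex approach with the names stripped first. It assumes the names file holds one name per line; raw_lines and column_c are hypothetical stand-ins for the file contents and for column C, so the snippet runs without the Excel file.

```python
import re

# Hypothetical stand-in for names_read = first_names.readlines();
# note that every entry still carries its trailing newline.
raw_lines = ["Jack\n", "Liam\n", "John\n", "Ethan\n", "George\n"]

# Strip surrounding whitespace and skip blank lines before building the pattern.
name_list = [line.strip() for line in raw_lines if line.strip()]

# A single alternation wrapped in one pair of word boundaries.
names_regex = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, name_list)))


def replace_names(text):
    """Replace every listed first name in text with the <name> placeholder."""
    return names_regex.sub("<name>", text)


# Hypothetical stand-in for the values of df['C'] from the example above.
column_c = [
    "Phone <phone_numbers>",
    "email <adresse_mail>",
    "my name is Liam",
    "Hello George",
    "Ethan, my phone is <phone_numbers>",
]

cleaned = [replace_names(value) for value in column_c]
for value in cleaned:
    print(value)
```

With the real dataframe, the same compiled pattern could be applied per row instead of to one long string, for example df['C'] = df['C'].str.replace(names_regex, '<name>', regex=True), which keeps the result as a column rather than as the text produced by to_string.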
【Discussion】:
Tags: python pandas list dataframe replace