How to remove a column string value from another column string value? Answers

【Question title】: How to remove a column string value from another column string value?
【Posted】: 2018-12-29 17:15:27
【Question】:

I have a dataframe like this:

df:
col1                      col2
blue water bottle        blue
red wine glass           red
green cup                green

I want to create another column that holds the value of col1 with the value of col2 removed. For example, the new column col3 would be:

water bottle
wine glass
cup

I tried this code:

df.apply(lambda x: x['col1'].replace(x['col2'], ''), axis=1)

But I get the following error:

AttributeError: ("'NoneType' object has no attribute 'replace'", 'occurred at index 0')

How can I do this?

【Comments on the question】:

df["col3"] = df.apply(lambda x: x["col1"].replace(x["col2"], ""), axis=1) should work. My guess is that you have None values somewhere. You can check with df[df.col1.isnull()]
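The check suggested in the comment can be sketched as follows. The sample data is an assumption made up to reproduce the error (the question's own table contains no missing values), so treat this as an illustration rather than the asker's actual frame.

```python
import pandas as pd

# Hypothetical data: the second row has None in col1, which triggers the error.
df = pd.DataFrame({
    "col1": ["blue water bottle", None, "green cup"],
    "col2": ["blue", "red", "green"],
})

# Rows where col1 is missing; these are the rows on which .replace() fails.
missing = df[df["col1"].isnull()]
print(missing)

# Per-column count of missing values, to see whether col2 is affected too.
null_counts = df[["col1", "col2"]].isnull().sum()
print(null_counts)
```

If `missing` is empty and the error persists, the None values are probably in a different column than the one being checked.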

Tags: python pandas dataframe

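【Solution 1】:

The cause is that "col1" is None in some rows of the dataframe. You need to handle those cases, for example by assigning an empty string to col3: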

df["col3"] = df.apply(
    lambda x: "" if pd.isnull(x["col1"]) else x["col1"].replace(x["col2"], ""),
    axis=1
)
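The one-liner above only guards against a missing col1. If col2 can also be missing, str.replace raises a TypeError instead. A minimal sketch that covers both cases is below; the helper name remove_substring and the sample data are made up for illustration, and returning col1 unchanged when col2 is missing is just one possible choice.

```python
import pandas as pd

# Hypothetical data with a missing value in each column.
df = pd.DataFrame({
    "col1": ["blue water bottle", None, "green cup"],
    "col2": ["blue", "red", None],
})

def remove_substring(row):
    """Return col1 with col2 removed (hypothetical helper)."""
    text, sub = row["col1"], row["col2"]
    if pd.isnull(text):
        return ""        # nothing to clean
    if pd.isnull(sub):
        return text      # nothing to remove, keep col1 as it is
    # strip() drops the space left behind where the removed word was
    return text.replace(sub, "").strip()

df["col3"] = df.apply(remove_substring, axis=1)
print(df)
```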

【Solution 2】:

Use:

df[['col1', 'col2']].apply(lambda x: x['col1'].replace(x['col2'], ''), axis=1)

Output

0     water bottle
1       wine glass
2              cup
dtype: object
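One caveat with plain str.replace: it removes every occurrence of the substring, including occurrences inside longer words, and it leaves the surrounding whitespace behind. If only whole words should go, a sketch along these lines may help. The helper name remove_word and the sample rows are assumptions for illustration, not part of the original answer.

```python
import re
import pandas as pd

# Hypothetical data: "colored" contains "red" but should be left intact.
df = pd.DataFrame({
    "col1": ["red wine glass", "red colored cup"],
    "col2": ["red", "red"],
})

def remove_word(text, word):
    """Remove whole-word occurrences of word from text (hypothetical helper)."""
    # \b limits the match to whole words; re.escape protects regex metacharacters.
    without = re.sub(rf"\b{re.escape(word)}\b", "", text)
    # Collapse the whitespace left behind by the removal.
    return " ".join(without.split())

df["col3"] = [remove_word(t, w) for t, w in zip(df["col1"], df["col2"])]
print(df["col3"])
```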

【Solution 3】:

Drop the rows with NaN entries before applying your lambda:

df[['col1', 'col2']].dropna().apply(lambda x: x['col1'].replace(x['col2'], ''), axis=1)
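Note that the result of this approach has fewer rows than the original frame. Assigning it back as a column still works because pandas aligns on the index, and the dropped rows end up as NaN. A small sketch with made-up data (the None row is an assumption to reproduce the problem):

```python
import pandas as pd

# Hypothetical data: row 1 has a missing col1.
df = pd.DataFrame({
    "col1": ["blue water bottle", None, "green cup"],
    "col2": ["blue", "red", "green"],
})

cleaned = (
    df[["col1", "col2"]]
    .dropna()  # drops row 1
    .apply(lambda x: x["col1"].replace(x["col2"], "").strip(), axis=1)
)

# Assignment aligns on the index, so row 1 gets NaN in col3.
df["col3"] = cleaned
print(df)
```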

【Solution 4】:

Here is one way (a vectorized approach would of course make a better answer):

import pandas as pd

df = pd.DataFrame()
df['col'] = ['blue water bottle', 'red wine glass', 'green cup']
df['col2'] = ['blue', 'red', 'green']
df['col3'] = ['', '', '']
for idx, row in df.iterrows():
    # iterrows() may yield copies, so write the result back through df.at
    df.at[idx, 'col3'] = row['col'].replace(row['col2'], '').strip()
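As far as I know, pandas has no string method that takes a different pattern per row, so a fully vectorized version is not straightforward. A common middle ground is a list comprehension over the two columns, which avoids building a Series per row the way iterrows() and apply(axis=1) do. A minimal sketch, reusing this answer's column names and assuming no missing values:

```python
import pandas as pd

df = pd.DataFrame({
    "col": ["blue water bottle", "red wine glass", "green cup"],
    "col2": ["blue", "red", "green"],
})

# Pair up the two columns element by element and clean each string.
df["col3"] = [
    text.replace(sub, "").strip()
    for text, sub in zip(df["col"], df["col2"])
]
print(df)
```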
