How to extract the word before a certain word in a Python string

【Question title】: How to extract the word before a certain word in a Python string
【Posted】: 2021-10-14 12:09:52
【Question】:

Suppose I have a DataFrame df with a column of Python strings:

  string
0 this house has 3 beds inside 
1 this is a house with 2 beds in it
2 the house has 4 beds

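I want to extract how many beds each house has. I figured a good way to do this would be to find the item that comes right before beds.

While trying to solve this, I noticed that strings are indexed by character. That means I have to turn each string into a list of words with str.split(' ').

Then I can find the index of 'beds' in each list and return the element at the previous index. I tried a list comprehension and df.iterrows() for this, but couldn't work out the right approach. My desired output is: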

  string                            beds
0 this house has 3 beds inside        3
1 this is a house with 2 beds in it   2
2 the house has 4 beds                4

【Comments】:

Tags: python pandas

【Solution 1】:

See: efficient way to get words before and after substring in text (python)

In your case, you can do this:

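for index, row in df.iterrows():
    # Write back through df.loc; assigning to row only changes a temporary copy.
    # Note that [-1] keeps just the last character before 'bed'.
    df.loc[index, 'beds'] = row['string'].partition('bed')[0].strip()[-1]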

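The partition function splits the string at the first occurrence of the given word and returns a tuple of three parts: the text before it, the word itself, and the text after it. The strip function is only there to remove surrounding whitespace. If all goes well, the number you are looking for sits at the end of the tuple's first element, hence the [0]; the [-1] then picks the last character of that text.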
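To make the partition behaviour concrete, here is a small sketch on one of the sample strings from the question. The second string and the variable names are only illustrative. It also shows split()[-1] as a possible alternative to [-1], on the assumption that bed counts may have more than one digit.

```python
# Demonstrates str.partition on one of the sample strings from the question.
s = "this house has 3 beds inside"

# partition returns (text before, separator, text after)
before, sep, after = s.partition("bed")

last_char = before.strip()[-1]   # last character only, as in the answer
last_word = before.split()[-1]   # whole last word, also fine for counts like "12"

# When the separator is missing, partition does not raise: it returns the
# whole string followed by two empty strings (illustrative input).
missing = "a flat with no rooms listed".partition("bed")
```

Because a missing separator does not raise an exception, the [-1] step would silently return the last character of the whole string in that case.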

Broken down for better readability, the code above becomes:

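for index, row in df.iterrows():
    split_str = row['string'].partition('bed')   # (before, 'bed', after)
    word_before_bed = split_str[0].strip()       # text before 'bed', whitespace trimmed
    number_of_beds = word_before_bed[-1]         # last character, so single-digit counts only
    df.loc[index, 'beds'] = number_of_beds       # write into df; assigning to row would not stick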

print(df.head())

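The printed df will show three columns: the index, string, and beds.

Note: this is a quick "hack". There is no error checking in the loop. You should add some, because you never know whether the word "bed" actually appears in a given row.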
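One possible way to add that error checking is sketched below. It uses the split-and-index idea from the question rather than partition; the helper name word_before and the fourth sample row are hypothetical, added only to show the missing-word case.

```python
import pandas as pd

def word_before(text, target="beds"):
    """Return the word immediately before `target`, or None if there is none."""
    words = text.split()
    if target not in words:
        return None                      # target word does not occur in this row
    i = words.index(target)              # position of the first occurrence
    return words[i - 1] if i > 0 else None

df = pd.DataFrame({"string": [
    "this house has 3 beds inside",
    "this is a house with 2 beds in it",
    "the house has 4 beds",
    "no bedroom info here",              # hypothetical row without the word 'beds'
]})

# Apply the helper to every string; rows without 'beds' get a missing value.
df["beds"] = df["string"].apply(word_before)
```

Matching whole words means "bedroom" is not mistaken for "beds", at the cost of missing variants such as "bed" or "beds," with trailing punctuation.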

【Comments】:

Couldn't this just be a vectorized operation instead of looping over the DataFrame?
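A vectorized version along the lines this comment suggests might look like the sketch below. It is not from the original answer: it assumes the count is the whitespace-separated token right before the word "beds", and the regular expression is one possible choice.

```python
import pandas as pd

df = pd.DataFrame({"string": [
    "this house has 3 beds inside",
    "this is a house with 2 beds in it",
    "the house has 4 beds",
]})

# (\S+) captures the run of non-whitespace characters right before "beds".
# expand=False returns a Series instead of a one-column DataFrame;
# rows with no match would get a missing value.
df["beds"] = df["string"].str.extract(r"(\S+)\s+beds\b", expand=False)

# Optional: convert the extracted text to numbers, turning non-numeric
# tokens into NaN instead of raising.
beds_num = pd.to_numeric(df["beds"], errors="coerce")
```

This avoids the explicit loop and handles rows without a match without extra checks.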