Grab first word in string after '\id': answers

【Question title】: Grab first word in string after '\id'
【Posted】: 2012-07-13 14:24:09
【Question】:

How do I grab the first word after '\id ' in a string?

String:

'\id hello some random text that can be anything'

Python

for line in lines_in:
    if line.startswith('\id '):
        book = line.replace('\id ', '').lower().rstrip()

What I get

book = 'hello some random text that can be anything'

What I want

book = 'hello'

【Comments】:

Tags: python regex

【Solution 1】:

One option:

words = line.split()
try:
    word = words[words.index("\id") + 1]
except ValueError:
    pass    # no whitespace-delimited "\id" in the string
except IndexError:
    pass    # "\id" at the end of the string

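【Comments】:

I'd suggest giving word a default value by changing the except clause to except (ValueError, IndexError): word = ''
@xhainingx: I don't know what the OP wants to do in the different error conditions, so I just pointed them out.
Yeah, I wasn't correcting you, just suggesting one possible way to handle it, since this doesn't look like the kind of question you'd get from someone well-versed in Python.
I prefer this one, since it's better to ask forgiveness than permission.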
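
The default-value suggestion from the comments can be sketched as a small helper. The function name `word_after` and its `marker` and `default` parameters are hypothetical, introduced only for illustration; this is one way to fold both error cases together, not necessarily what the OP needs.

```python
def word_after(line, marker="\\id", default=""):
    """Return the whitespace-delimited word following `marker`, or `default`."""
    words = line.split()
    try:
        return words[words.index(marker) + 1]
    except (ValueError, IndexError):
        # ValueError: no whitespace-delimited marker in the line
        # IndexError: the marker is the last word in the line
        return default


print(word_after("\\id hello some random text"))  # hello
print(word_after("no marker here"))               # prints an empty line
```

If the two failure modes should be handled differently, keep the separate except clauses from the answer instead.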

【Solution 2】:

>>> import re
>>> text = r'\id hello some random text that can be anything'
>>> match = re.search(r'\\id (\w+)', text)
>>> if match:
...     print(match.group(1))
...
hello

A more complete version, which allows any amount of whitespace (including none) after '\id':

re.search(r'\\id\s*(\w+)', text)
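
To make the difference between the two patterns concrete, here is a small comparison on a few made-up inputs. The names `strict`, `lenient` and `first_group` are hypothetical helpers for this sketch only.

```python
import re

strict = re.compile(r"\\id (\w+)")     # exactly one space after \id
lenient = re.compile(r"\\id\s*(\w+)")  # zero or more whitespace characters


def first_group(pattern, text):
    """Return the captured word, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


# One space, several spaces, and no space after the marker
for sample in (r"\id hello world", r"\id   hello world", r"\idhello world"):
    print(repr(sample), first_group(strict, sample), first_group(lenient, sample))
```

The strict pattern only matches the first sample; the lenient one captures `hello` in all three.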

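【Comments】:

@jamylak -- apparently we had the same idea. I'd suggest changing your regex to r'\\id\s*(\w+)' so that it handles multiple (or no) spaces.
@mgilson The OP said it works as is, but that's your solution anyway. Even though I'm out of votes for today, I'll upvote it.
@jamylak I was thinking of removing the regex part from my solution rather than from yours. You beat me to it anyway, and since you have more upvotes (and the accept :^p) it will be more visible to the community.
@mgilson Your regex is a more complete version of mine, so you should get the accepted answer, although really SvenMarnach should get it, since his isn't a regex.
@mgilson I've never done this before, but may I change it to community wiki and add your solution?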

【Solution 3】:

You don't need a regex; you can just do:

book.split(' ')[0]

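But there are many ways to do this.

【Comments】:

【Solution 4】:

A regex is a good fit if there doesn't have to be a space between "\id" and the word. (If the space is guaranteed, use the split solution):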

import re
match = re.search(r'\\id\s*(\w+)', yourstring)
if match:
    print(match.group(1))

Or another way (without a regex):

head, sep, tail = yourstring.partition(r'\id')
first_word = tail.split()[0]  # tail starts right after '\id', so index 0 is the first word

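【Comments】:

If there's only one id, you should use str.partition instead.
@jamylak -- changed. Is there a reason to prefer partition over split? I suppose it helps with unpacking, since you know exactly what you'll get back, but the same could be said of .split('\id', 1). Is partition faster?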
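
On the partition-versus-split question, one practical difference is what happens when the separator is missing: `str.partition` always returns a 3-tuple, while `str.split(sep, 1)` returns a one-element list, so unpacking it into two names fails. A minimal sketch follows; the helper name `first_word_after` is hypothetical, and the sketch says nothing about relative speed.

```python
def first_word_after(text, marker="\\id"):
    """Return the first word after `marker`, or None if there is none."""
    head, sep, tail = text.partition(marker)  # always three items
    if not sep:
        return None  # marker not found: sep and tail are empty strings
    words = tail.split()
    return words[0] if words else None


print(first_word_after("\\id hello some random text"))  # hello
print("no marker".partition("\\id"))  # ('no marker', '', '')
print("no marker".split("\\id", 1))   # ['no marker']
```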

【Solution 5】:

Try calling str.split(' ') on your string book; it splits on spaces and gives you a list of words. Then just do book = newList[0].

So: book = book.split(' ')[0]
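
One caveat with `split(' ')`: it splits on every single space, so leading or repeated spaces produce empty strings, whereas `split()` with no argument splits on runs of whitespace and drops the empty pieces. A short illustration, using made-up sample strings:

```python
book = "hello some random text"
print(book.split(" ")[0])  # hello
print(book.split()[0])     # hello

padded = "  hello   some"  # leading and repeated spaces
print(padded.split(" "))   # ['', '', 'hello', '', '', 'some']
print(padded.split())      # ['hello', 'some']
```

In the OP's code the `'\id '` prefix has already been removed, so either form works as long as exactly one space follows the marker.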

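【Comments】:

【Solution 6】:

Since you've already checked that the line starts with "\id ", just split the string to get a list of words. If you want the next word, take element #1: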

>>> line="\id hello some random text that can be anything"
>>> line.split()
['\\id', 'hello', 'some', 'random', 'text', 'that', 'can', 'be', 'anything']
    #0      #1  ...

So your code should become something like this:

for line in lines_in:
    if line.startswith(r'\id '):
        book = line.split()[1]
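
A runnable version of the loop, with made-up sample data standing in for `lines_in`. The length check and the `books` list are added assumptions: the check skips a line that starts with `\id ` but has nothing after it, where `line.split()[1]` would raise `IndexError`.

```python
# Hypothetical sample input; real data would come from a file
lines_in = [
    "\\id hello some random text that can be anything\n",
    "some other line\n",  # does not start with the marker; ignored
    "\\id \n",            # marker with nothing after it
]

books = []
for line in lines_in:
    if line.startswith("\\id "):
        words = line.split()
        if len(words) > 1:  # guard against a bare marker
            books.append(words[1].lower())

print(books)  # ['hello']
```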

【Comments】: