How to get everything before and up to a certain substring within a string? Answers

【Question title】: How to get everything before and up to a certain substring within a string?
【Posted】: 2017-11-16 14:03:05
【Question description】:

How do I return a string up to and including a certain substring?

def get_header(s):
    '''(str) -> str
    Return the start of the given string up to and including
    </head>.'''
    return (s.split('</head>')[0])

This is what I have so far, but I don't know how to get everything before "</head>" and include the tag itself.

For example:

s ="hello python world </head> , i'm a beginner "
get_header(s)

This should return:

"hello python world "<"/head">"   # without the quotation marks around the <

【Comments】:

Please post sample input and the expected output.

Tags: python string python-3.x

【Solution 1】:

Your code should work, but the result won't include "</head>", so just append it at the end:

def get_header(s):
    '''(str) -> str
    Return the start of the given string up to and including
    </head>.'''
    return s.split('</head>')[0] + "</head>"
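
One caveat with appending the tag unconditionally: if the input does not contain "</head>" at all, the function returns the whole string with "</head>" tacked on. A minimal sketch of an alternative using the built-in str.partition is shown here. The marker parameter and the choice to return None when the tag is missing are assumptions made for illustration, not part of the original answer.

```python
def get_header(s, marker="</head>"):
    """Return s up to and including the first marker, or None if absent."""
    # str.partition splits at the first occurrence and keeps the separator.
    before, sep, _after = s.partition(marker)
    # When the marker is not found, sep is the empty string.
    return before + sep if sep else None


s = "hello python world </head> , i'm a beginner "
print(get_header(s))            # hello python world </head>
print(get_header("no marker"))  # None
```

Because partition hands back the separator only when it was actually found, no separate membership check is needed.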

【Discussion】:

【Solution 2】:

Matching a "regular expression" (or regex) against a string is fairly easy with Python's re module.

Here is how to use it to do what you want:

import re

def get_header(s):
    """(str) -> str
    Return the start of the given string up to and including </head>.
    """
    matches = re.search(r".*</head>", s)
    return matches.group(0) if matches else None

s = "hello python world </head> , i'm a beginner "
print(get_header(s))  # -> hello python world </head>
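
Two properties of the pattern above are worth knowing. The `.*` is greedy, so on a line with several "</head>" occurrences it matches through the last one, and by default `.` does not match newlines, so a header spanning several lines is not captured from its start. The following is a hedged variant, assuming you want the text up to the first "</head>" even across line breaks. The function name get_header_first and the sample text are made up for this illustration.

```python
import re


def get_header_first(s):
    """Return s up to and including the first </head>, or None if absent."""
    # .*? is non-greedy: it stops at the first </head> it can reach.
    # re.DOTALL lets . match newline characters as well.
    match = re.search(r".*?</head>", s, flags=re.DOTALL)
    return match.group(0) if match else None


text = "<head>\n<title>t</title>\n</head> body </head> tail"
print(repr(get_header_first(text)))  # '<head>\n<title>t</title>\n</head>'
```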

【Discussion】:

【Solution 3】:

more_itertools is a third-party library that implements the split_after tool. Install it with:

> pip install more_itertools

Given

import more_itertools as mit


s = "hello python world </head> , i'm a beginner "

Code

pred = lambda x: x == "</head>"
" ".join(next(mit.split_after(s.split(), pred)))
# 'hello python world </head>'

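The string is split on whitespace into "words". The resulting list of words is then split after every word that satisfies the predicate. The first group is joined back together with single spaces.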
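
The same word-by-word idea can be sketched with the standard library only, which also makes its limits visible: the tag must stand alone as a whitespace-separated word, and runs of whitespace are collapsed to single spaces in the result. The function name words_through is hypothetical, and this is an illustration of the approach, not how more_itertools implements split_after.

```python
def words_through(s, marker="</head>"):
    """Join the whitespace-separated words of s up to and including marker."""
    kept = []
    for word in s.split():
        kept.append(word)
        if word == marker:
            # Stop right after the first word equal to the marker.
            break
    return " ".join(kept)


s = "hello python world </head> , i'm a beginner "
print(words_through(s))                 # hello python world </head>
# The tag is glued to the next word here, so it never matches on its own
# and every word is kept.
print(words_through("a   b </head>c"))  # a b </head>c
```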

【Discussion】: