Split a string by a list of character positions: answers

【Question title】: Split a string by list of character positions
【Posted】: 2025-07-14 11:10:01
【Question description】:

Suppose you have a string:

text = "coding in python is a lot of fun"

and a list of character positions:

positions = [(0,6),(10,16),(29,32)]

These are ranges that cover certain words in the text, namely coding, python and fun.

Using the character positions, how can you split the text on those words to get this output:

['coding','in','python','is a lot of','fun']

This is just an example, but it should work for any string and any list of character positions.

I am not looking for this:

[text[i:j] for i,j in positions]

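【Question discussion】:

Do you know that you can use something like text[0:5] (a "string slice")?
Because the goal is to split the text on the words in the ranges given by the character positions. The goal is not to build a list of the text inside each range, which would return 3 strings, one for each of the 3 character-position ranges. It is as if I wanted text.split(on a list of character ranges).
What do the character positions have to do with the string? Judging by the expected output, they look unrelated.
The character positions give the start and end positions of the words "coding", "python" and "fun" in text.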
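
To make the discussion above concrete, here is a small check (a sketch added for illustration, not part of the original thread) of how Python's end-exclusive slices relate to the positions. The names short_slice and covered are made up for this example.

```python
text = "coding in python is a lot of fun"
positions = [(0, 6), (10, 16), (29, 32)]

# Slice ends are exclusive, so text[0:5] stops before the "g".
short_slice = text[0:5]

# Each (start, end) pair selects one whole word. This is the list the
# asker does NOT want, because the text between the ranges is lost.
covered = [text[i:j] for i, j in positions]

print(short_slice)  # codin
print(covered)      # ['coding', 'python', 'fun']
```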

Tags: python string

【Solution 1】:

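The following code works as expected:

text = "coding in python is a lot of fun"
positions = [(0, 6), (10, 16), (29, 32)]
textList = []
lastIndex = 0
for start, end in positions:
    # Text between the previous range and this one, with surrounding spaces removed
    gap = text[lastIndex:start].strip()
    if gap:
        textList.append(gap)
    # The word covered by the current range
    textList.append(text[start:end])
    lastIndex = end
# Whatever is left after the last range
tail = text[lastIndex:].strip()
if tail:
    textList.append(tail)
print(textList)

Output: ['coding', 'in', 'python', 'is a lot of', 'fun']

Note: the segments between the ranges are trimmed with strip(); remove those calls if you want to keep the surrounding spaces.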
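
Since the question asks for something that works with any string and any list of positions, the same loop could be wrapped in a reusable function. This is only a sketch: split_by_positions and its strip parameter are hypothetical names, and it assumes the ranges do not overlap.

```python
def split_by_positions(text, positions, strip=True):
    """Split text into the given ranges and the gaps between them.

    Hypothetical helper: positions is assumed to be a list of
    non-overlapping (start, end) pairs with exclusive ends.
    """
    parts = []
    last_end = 0
    for start, end in sorted(positions):
        # Text between the previous range and this one.
        gap = text[last_end:start]
        if strip:
            gap = gap.strip()
        if gap:
            parts.append(gap)
        # The range itself is always kept verbatim.
        parts.append(text[start:end])
        last_end = end
    # Text after the last range, if any.
    tail = text[last_end:]
    if strip:
        tail = tail.strip()
    if tail:
        parts.append(tail)
    return parts


result = split_by_positions(
    "coding in python is a lot of fun",
    [(0, 6), (10, 16), (29, 32)],
)
print(result)  # ['coding', 'in', 'python', 'is a lot of', 'fun']
```

Passing strip=False keeps the spaces around the in-between segments, and text before the first range or after the last one is kept rather than dropped.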

【Discussion】:

【Solution 2】:

I would flatten positions to [0,6,10,16,29,32] and then do something like this:

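text = "coding in python is a lot of fun"
positions = [0, 6, 10, 16, 29, 32]  # the flattened (start, end) pairs
# None as the final end index slices to the end of text; -1 would drop the last character
positions.append(None)
prev_positions = [0] + positions
words = []
for begin, end in zip(prev_positions, positions):
    words.append(text[begin:end])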

This exact code produces ['', 'coding', ' in ', 'python', ' is a lot of ', 'fun', ''], so it needs some extra work to remove the spaces and the empty strings.
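
The flattening step is only described in words above. One possible way to do it (an assumption, the answer does not say how) is itertools.chain.from_iterable; the names flat, bounds and pieces are made up for this sketch.

```python
from itertools import chain

text = "coding in python is a lot of fun"
positions = [(0, 6), (10, 16), (29, 32)]

# Flatten [(0, 6), (10, 16), (29, 32)] into [0, 6, 10, 16, 29, 32].
flat = list(chain.from_iterable(positions))

# Boundaries: start of text, every range edge, then None for "end of text".
bounds = [0] + flat + [None]

# Slice between each pair of neighbouring boundaries.
pieces = [text[begin:end] for begin, end in zip(bounds, bounds[1:])]

print(flat)    # [0, 6, 10, 16, 29, 32]
print(pieces)  # ['', 'coding', ' in ', 'python', ' is a lot of ', 'fun', '']
```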

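【Discussion】:

Change it to positions.append(None), then do [word.strip() for word in words if word.strip()], and it seems to work for a bunch of different test cases.
@Data Why do you recommend None? I added -1 because 'fun' might not be at the end of the sentence, so after slicing it out we still need to slice the rest of the sentence (text[fun_end:-1]).
@Data, I have double-checked and you are right. The -1 was my mistake: it omits the last character, whereas None does not.
With -1, the final character is sometimes dropped.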
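
The difference between -1 and None only shows up when the text continues past the last range. A minimal sketch, using a made-up sentence and a hypothetical helper split_at that takes the final boundary as a parameter:

```python
text = "coding in python is fun!"
flat = [0, 6, 10, 16]  # flattened ranges for "coding" and "python"


def split_at(text, flat, sentinel):
    # Hypothetical helper: slice between neighbouring boundaries,
    # ending the last slice at the given sentinel.
    bounds = [0] + flat + [sentinel]
    pieces = [text[b:e] for b, e in zip(bounds, bounds[1:])]
    # Clean-up suggested in the comments: strip and drop empty pieces.
    return [p.strip() for p in pieces if p.strip()]


with_minus_one = split_at(text, flat, -1)
with_none = split_at(text, flat, None)

print(with_minus_one)  # ['coding', 'in', 'python', 'is fun']
print(with_none)       # ['coding', 'in', 'python', 'is fun!']
```

An end index of -1 means "up to but not including the last character", so the trailing "!" is lost, while None means "to the end of the string".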