Python: delete all characters before the first letter in a string

【Question title】: Python: delete all characters before the first letter in a string
【Posted】: 2017-09-21 11:30:35
【Question】:

After a thorough search, I could find how to delete all characters before a specific letter, but not before *any* letter.

I am trying to turn a string from this:

"             This is a sentence. #contains symbol and whitespace

into this:

This is a sentence. #No symbols or whitespace

I tried the code below, but I still end up with strings like the first example.

for ch in ['\"', '[', ']', '*', '_', '-']:
     if ch in sen1:
         sen1 = sen1.replace(ch,"")

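Not only does this fail, for some unknown reason, to remove the double quote in the example, it also cannot be used for the leading whitespace, because it would remove every space in the string.

Thank you in advance.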

【Comments on the question】:

Tags: python string

【Solution 1】:

To remove every character before the first letter, not just whitespace, do the following:

# s is your string
pos = len(s)                  # fallback: no letter found, result is ''
for i, x in enumerate(s):
    if x.isalpha():           # True if it's a letter
        pos = i               # position of the first letter
        break

new_str = s[pos:]

【Comments】:

【Solution 2】:

Strip all leading whitespace and punctuation:

>>> import string
>>> text = '"             This is a sentence. #contains symbol and whitespace'
>>> text.lstrip(string.punctuation + string.whitespace)
'This is a sentence. #contains symbol and whitespace'

Alternatively, find the position of the first character that is an ASCII letter. For example:

>>> pos = next(i for i, x in enumerate(text) if x in string.ascii_letters)
>>> text[pos:]
'This is a sentence. #contains symbol and whitespace'

【Comments】:

【Solution 3】:

import re
s = "  sthis is a sentence"

r = re.compile(r'.*?([a-zA-Z].*)')

print(r.findall(s)[0])   # Python 3 print; raises IndexError if s has no letter
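
The pattern above relies on `.`, which does not match newlines by default, so the captured text stops at the end of the first line that contains a letter. A hedged alternative, assuming an empty string is the desired result when there is no letter, is to search for the first letter and slice from there. The names `FIRST_LETTER` and `from_first_letter` are made up for this sketch.

```python
import re

FIRST_LETTER = re.compile(r'[a-zA-Z]')

def from_first_letter(s):
    # re.search scans the whole string, newlines included, for the first ASCII letter.
    m = FIRST_LETTER.search(s)
    # Assumption: return '' when the string contains no letter at all.
    return s[m.start():] if m else ''

print(from_first_letter("  sthis is a sentence"))
print(from_first_letter(" \n 12 line one\nline two"))
```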

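【Comments】:

I found your regex very useful for what I am trying to do. Thanks, from 4 years in the future! :)

【Solution 4】:

This is a very basic version; that is, it uses syntax that Python beginners can follow easily.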

your_string = "1324 $$ '!'     '' # this is a sentence."
while len(your_string) > 0 and your_string[0] not in "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz":
    your_string = your_string[1:]
print(your_string)

#prints "this is a sentence."

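Pros: simple, no imports needed

Cons: if you are comfortable with list comprehensions, you can avoid the while loop. Also, the string you compare against could be written more simply with a regular expression.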
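
A variation in the same beginner-friendly spirit, offered only as a sketch: each `your_string[1:]` copies the rest of the string, so this version keeps an index and slices once at the end, and it uses `string.ascii_letters` in place of the hand-typed alphabet. The function name `strip_to_first_letter` is illustrative.

```python
import string

def strip_to_first_letter(text):
    i = 0
    # Advance past every character that is not an ASCII letter.
    while i < len(text) and text[i] not in string.ascii_letters:
        i += 1
    # Slice once at the end instead of copying the string on each step.
    return text[i:]

print(strip_to_first_letter("1324 $$ '!'     '' # this is a sentence."))
```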

【Comments】:

【Solution 5】:

Remove everything up to the first alphabetic character.

import itertools as it


s = "      -  .] *    This is a sentence. #contains symbol and whitespace"
"".join(it.dropwhile(lambda x: not x.isalpha(), s))
# 'This is a sentence. #contains symbol and whitespace'

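Alternatively, iterate over the string and test whether each character is in a blacklist. If it is, strip that character; otherwise short-circuit and return.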

def lstrip(s, blacklist=" "):
    for c in s:
        if c in blacklist:
            s = s.lstrip(c)
            continue
        return s
    return s    # every character was blacklisted (or s was empty)

lstrip(s, blacklist='\"[]*_-. ')
# 'This is a sentence. #contains symbol and whitespace'
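
Note that the built-in `str.lstrip` already accepts a set of characters to strip, so the blacklist loop can probably be condensed to a single call. A minimal sketch, reusing the sample string and blacklist from this answer (the variable names `blacklist` and `result` are only for illustration):

```python
s = "      -  .] *    This is a sentence. #contains symbol and whitespace"
blacklist = '"[]*_-. '

# str.lstrip treats its argument as a set of characters and removes
# the leading run made up of any of them.
result = s.lstrip(blacklist)
print(result)
```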

【Comments】:

【Solution 6】:

You can use re.sub:

import re
text = "             This is a sentence. #contains symbol and whitespace"

re.sub("[^a-zA-Z]+", " ", text)

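re.sub(pattern, replacement string, string to search)

【Comments】:

This actually removes non-letter characters throughout the whole string, not only the ones before the first letter, which is what the post title asks for.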
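
One way to address the comment above, as a sketch rather than the answerer's intended code, is to anchor the pattern at the start of the string with `^` and replace with an empty string, so only the leading run is affected. The sample `text` here includes the leading double quote from the question; `cleaned` is an illustrative name.

```python
import re

text = '"             This is a sentence. #contains symbol and whitespace'

# ^ anchors the match to the start of the string, so only the leading
# run of non-letters is removed; the rest of the text is left untouched.
cleaned = re.sub(r'^[^a-zA-Z]+', '', text)
print(cleaned)
```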