Removing Punctuation python: answers

【Question title】: Removing Punctuation python
【Posted】: 2012-11-20 01:36:15
【Question description】:

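Before you post the link to "best way to remove punctuation", please read on.

I'm creating a madLib game that replaces words in a paragraph with nouns, adverbs, verbs, adjectives, and so on. It should take a random word from a separate file and print it into the paragraph at the appropriate place, i.e. the verb running would be placed wherever the paragraph says VERB. The only problem I'm having is that I can't do this when there is punctuation next to the word I want to replace, such as VERB, or VERB!

My question is how to replace all of these values while keeping the punctuation.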

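【Comments on the question】:

Do the blanks have to be formatted as UPPERCASE_PART_OF_SPEECH? If you can, try changing them to "{part-of-speech}" (i.e. This is a {verb}!). Python has a very versatile string formatter that is well suited to this kind of thing.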
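To make the comment above concrete, here is a minimal sketch of the str.format approach. The word lists and the fill_template helper are made up for illustration; in the question the words would come from a separate file. Note that a field name that appears twice gets the same word both times, which may or may not be what you want.

```python
import random

# Hypothetical word lists; the question reads these from a separate file.
WORDS = {
    "verb": ["running", "jumping", "swimming"],
    "noun": ["donkey", "panda"],
}

def fill_template(template, words, rng=random):
    """Fill each {kind} field in template with one random word of that kind."""
    chosen = {kind: rng.choice(options) for kind, options in words.items()}
    # Punctuation next to a field is plain text, so it is left untouched.
    return template.format(**chosen)

print(fill_template("This is a {verb}! Whoa, a {noun}?", WORDS))
```

Because the placeholders are delimited by braces, adjacent commas or exclamation marks never interfere with the substitution.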

Tags: python python-3.x

【Solution 1】:

noun1="Donkey"
print("This should print a %s here"%(noun1))

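Basically, you can take your input variables and treat them as in this example.

【Comments】:

【Solution 2】:

Not sure about your use case, but would replace with the count argument set to 1 work?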

>>> test = 'This is a VERB! Whoa, a VERB? Yeah, a VERB!#$%'
>>> test.replace('VERB', 'running', 1)
'This is a running! Whoa, a VERB? Yeah, a VERB!#$%'
>>> test.replace('VERB', 'running', 1).replace('VERB', 'swimming', 1).replace('VERB', 'sleeping', 1)
'This is a running! Whoa, a swimming? Yeah, a sleeping!#$%'

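Of course, you will have to adjust this for the number of repetitions, but it should handle punctuation just fine.

Following @mgilson's suggestion below, you can get rid of the long chain of calls to replace like this: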

In [13]: from functools import reduce  # reduce is not a builtin in Python 3

In [14]: s = 'This is a VERB! Whoa, a VERB? Yeah, a VERB!#$%'

In [15]: verbs = ['running', 'jumping', 'swimming']

In [16]: reduce(lambda x, y: x.replace('VERB', y, 1), verbs, s)
Out[16]: 'This is a running! Whoa, a jumping? Yeah, a swimming!#$%'

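This uses the reduce function to run replace on the main string, using the values in verbs as the replacements. The last argument to reduce is the string itself, which acts as the initial value: it holds the result of the replacements after each iteration (and starts out as the "normal" string).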
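If reduce feels opaque, the same idea can be written as an explicit loop. This is only a sketch: fill_blanks and its parameters are names invented here, and it draws a fresh random word for every occurrence so you do not need to know the number of blanks in advance.

```python
import random

def fill_blanks(text, placeholder, candidates, rng=random):
    """Replace each occurrence of placeholder, left to right, with a random candidate."""
    # Count the blanks up front so the loop always terminates.
    # Assumption: no candidate word contains the placeholder itself.
    for _ in range(text.count(placeholder)):
        text = text.replace(placeholder, rng.choice(candidates), 1)
    return text

s = 'This is a VERB! Whoa, a VERB? Yeah, a VERB!#$%'
print(fill_blanks(s, 'VERB', ['running', 'jumping', 'swimming']))
```

Each pass replaces only the first remaining VERB, which is exactly what the lambda inside reduce does.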

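【Comments】:

Argh! I was just typing this up (+1). You could even use reduce to get rid of test.replace().replace().replace()... :)
@mgilson Haha, I was channeling my inner mgilson.
@mgilson I can safely say that what you said got me trying to understand reduce better. Updated with what I think works; as always, thanks for pushing my thinking :)
@mgilson I think I owe you royalties: this is the third time today I have used this reduce construct. Thanks again for getting me to think outside the box :)

【Solution 3】:

Use the sub function from the re module. Capture the character that follows the word, then replace the word with the new word and append the captured punctuation using a backreference: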

>>> import re
>>> s = "VERB,"
>>> print(re.sub(r'VERB([\,\!\?\;\.]?)', r'newword\1', s))
newword,

You can extend the character class [\,\!\?\;\.] to include whatever punctuation you expect to encounter; this is just an example.
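A possible variation, sketched under my own assumptions rather than taken from the answer: re.sub only replaces the text that matched, so if the pattern matches just the placeholder, neighbouring punctuation is never touched and no capture group is needed. The word lists and the fill_placeholders name below are hypothetical.

```python
import random
import re

# Hypothetical word lists keyed by placeholder; the question loads them from a file.
MADLIB_WORDS = {"VERB": ["running", "jumping"], "NOUN": ["donkey", "panda"]}

def fill_placeholders(text, words, rng=random):
    """Replace every whole-word placeholder with a random word of that kind."""
    # \b is a word boundary, so "VERB," and "VERB!" match but "VERBOSE" does not.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, words)) + r")\b")
    # The function is called once per match, so each blank gets its own word.
    return pattern.sub(lambda m: rng.choice(words[m.group(1)]), text)

print(fill_placeholders("Go VERB, you silly NOUN! VERB?", MADLIB_WORDS))
```

Passing a function as the replacement also avoids having to list every punctuation character in a character class.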

【Comments】:

【Solution 4】:

The sub function will solve your problem nicely:

from re import sub, IGNORECASE

contents = 'The !ADJECTIVE! panda walked to the !NOUN? and then *VERB!. A nearby <NOUN> was unaffected by these events.'
adj = input('Enter an adjective: ')
noun1 = input('Enter a noun: ')
verb = input('Enter a verb: ')
noun2 = input('Enter a noun: ')

# Replace one placeholder at a time and print the text after each step.
contents = sub(r'adjective', adj, contents, count=1, flags=IGNORECASE)
print(contents)
contents = sub(r'noun', noun1, contents, count=1, flags=IGNORECASE)
print(contents)
contents = sub(r'verb', verb, contents, count=1, flags=IGNORECASE)
print(contents)
contents = sub(r'noun', noun2, contents, count=1, flags=IGNORECASE)
print(contents)

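The sub function takes five arguments: re.sub(the pattern to search for, the replacement string, the string in which to make the replacement, count, i.e. how many occurrences to replace, and flags, where IGNORECASE matches regardless of case). The code outputs: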

Enter an adjective: silly
Enter a noun: chandelier
Enter a verb: screamed
Enter a noun: pickup truck
The !silly! panda walked to the !NOUN? and then *VERB!. A nearby <NOUN> was unaffected by these events.
The !silly! panda walked to the !chandelier? and then *VERB!. A nearby <NOUN> was unaffected by these events.
The !silly! panda walked to the !chandelier? and then *screamed!. A nearby <NOUN> was unaffected by these events.
The !silly! panda walked to the !chandelier? and then *screamed!. A nearby <pickup truck> was unaffected by these events.

The punctuation is unaffected by these events. Hope this helps.

【Comments】:

【Solution 5】:

string.punctuation contains the following characters:

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

You can use the translate and maketrans functions to map punctuation characters to nothing (that is, delete them):

import string

'This, is. A test! VERB! and VERB,'.translate(str.maketrans('', '', string.punctuation))

Output:

'This is A test VERB and VERB'
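Since the question asks to keep the punctuation, one possible adaptation is to strip it only when comparing each word against the placeholder, then put it back. This is a rough sketch with helper names of my own choosing; it assumes words are separated by single spaces and only handles punctuation that trails the word, so something like (VERB) would not match.

```python
import string

def split_trailing_punctuation(token):
    """Return (core, trailing), where trailing is the punctuation at the end of token."""
    core = token.rstrip(string.punctuation)
    return core, token[len(core):]

def replace_word(text, placeholder, new_word):
    """Replace placeholder tokens with new_word, re-attaching trailing punctuation."""
    out = []
    for token in text.split(" "):
        core, trailing = split_trailing_punctuation(token)
        out.append(new_word + trailing if core == placeholder else token)
    return " ".join(out)

print(replace_word('This, is. A test! VERB! and VERB,', 'VERB', 'running'))
# prints: This, is. A test! running! and running,
```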

【Comments】: