[Posted]: 2017-04-18 15:05:57
[Question]:
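For my programming class, I have to write a function that matches the following description:
The parameter is a tweet. The function should return a list containing all the hashtags in the tweet, in the order they appear in the tweet. Each hashtag in the returned list should have its initial hash symbol removed, and the hashtags should be unique. (If a tweet uses the same hashtag twice, it is included in the list only once. The order of the hashtags should match the order of the first occurrence of each tag in the tweet.)
I'm not sure how to make a hashtag end when punctuation is encountered (see the second doctest example). My current code doesn't output anything: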
def extract(start, tweet):
    """ (str, str) -> list of str
    Return a list of strings containing all words that start with a specified character.
    >>> extract('@', "Make America Great Again, vote @RealDonaldTrump")
    ['RealDonaldTrump']
    >>> extract('#', "Vote Hillary! #ImWithHer #TrumpsNotMyPresident")
    ['ImWithHer', 'TrumpsNotMyPresident']
    """
    words = tweet.split()
    return [word[1:] for word in words if word[0] == start]
def strip_punctuation(s):
    """ (str) -> str
    Return a string, stripped of its punctuation.
    >>> strip_punctuation("Trump's in the lead... damn!")
    'Trumps in the lead damn'
    """
    return ''.join(c for c in s if c not in '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
def extract_hashtags(tweet):
    """ (str) -> list of str
    Return a list of strings containing all unique hashtags in a tweet.
    Outputted in order of appearance.
    >>> extract_hashtags("I stand with Trump! #MakeAmericaGreatAgain #MAGA #TrumpTrain")
    ['MakeAmericaGreatAgain', 'MAGA', 'TrumpTrain']
    >>> extract_hashtags('NEVER TRUMP. I'm with HER. Does #this! work?')
    ['this']
    """
    hashtags = extract('#', tweet)
    no_duplicates = []
    for item in hashtags:
        if item not in no_duplicates and item.isalnum():
            no_duplicates.append(item)
    result = []
    for hash in no_duplicates:
        for char in hash:
            if char.isalnum() == False and char != '#':
                hash == hash[:char.index()]
                result.append()
    return result
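At this point I'm pretty lost; any help would be appreciated. Thanks in advance.
Note: We are not allowed to use regular expressions or import any modules.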
[Comments]:
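- Well.. if a hashtag needs to end at punctuation, and there aren't that many punctuation characters, why not check whether the next character is punctuation?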
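One way to read the suggestion above, as a rough sketch rather than a definitive answer: walk through each word that starts with '#' and stop collecting characters as soon as the next one is not alphanumeric. The name extract_hashtags_sketch is made up for illustration, and the code assumes a hashtag ends at the first character for which str.isalnum() is false (so an underscore would also end it). It uses no regular expressions and no imports.

```python
def extract_hashtags_sketch(tweet):
    """ (str) -> list of str
    Hypothetical sketch: return the unique hashtags in tweet, in order of
    first appearance, without the leading '#'.
    Assumption: a hashtag ends at the first non-alphanumeric character.
    """
    result = []
    for word in tweet.split():
        # split() never yields empty strings, so word[0] is safe here.
        if word[0] != '#':
            continue
        tag = ''
        for char in word[1:]:
            if not char.isalnum():
                # The next character is punctuation (or another '#'): stop.
                break
            tag += char
        # Skip empty tags (a lone '#') and tags that were already seen.
        if tag and tag not in result:
            result.append(tag)
    return result
```

With this approach the second doctest input, "Does #this! work?", would give 'this' because the scan stops at the '!'. Checking `tag not in result` before appending keeps the first-occurrence order without a separate de-duplication pass.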