【Question title】: For loop outputting duplicates
【Posted】: 2017-10-25 10:41:48
【Question description】:
a = {'1330': ('John', 'Gold', '1330'), "0001":('Matt', 'Wade', '0001'), '2112': ('Bob', 'Smith', '2112')}
com = {'6':['John Gold, getting no points', 'Matt played in this game? Didn\'t notice him','Love this shot!']}
comments_table = []

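What I want this replace function to do is take the people's names that appear in the strings in com (dict) and replace them with their unique code, which is looked up in a (dict) and substituted via a regular expression. Replacing the name with the code works, but appending the new string (with the code instead of the name) to the result list is where I am going wrong.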

import re
# fuzz is not imported in the post; it is assumed to come from a fuzzy-matching
# library such as fuzzywuzzy (from fuzzywuzzy import fuzz).

def replace_first_name():
    for k, v in a.items():
        for z, y in com.items():
            for item in y:
                firstname = a[k][0]
                lastname = a[k][1]
                full_name = firstname + ' ' + lastname
                if firstname in item:
                    if full_name in item:
                        t = re.compile(re.escape(full_name), re.IGNORECASE)
                        comment = t.sub(a[k][2], item)
                        print('1')
                        comments_table.append({
                            'post_id': z, 'comment': comment
                        })
                        continue

                    else:

                        t = re.compile(re.escape(firstname), re.IGNORECASE)
                        comment = t.sub(a[k][2], item)
                        print('2')
                        comments_table.append({
                            'post_id': z, 'comment': comment
                        })
                else:
                    print('3')
                    if fuzz.ratio(item, item) > 90:
                        comments_table.append({
                            'post_id': z, 'comment': item
                        })
                    else:
                        pass

The problem is the output, which looks like this:

[{'comment': '1330, getting no points', 'post_id': '6'}, {'comment': "Matt played in this game? Didn't notice him", 'post_id': '6'}, {'comment': 'Love this shot!', 'post_id': '6'}, {'comment': 'John Gold, getting no points', 'post_id': '6'}, {'comment': "Matt played in this game? Didn't notice him", 'post_id': '6'}, {'comment': 'Love this shot!', 'post_id': '6'}, {'comment': 'John Gold, getting no points', 'post_id': '6'}, {'comment': "0001 played in this game? Didn't notice him", 'post_id': '6'}, {'comment': 'Love this shot!', 'post_id': '6'}]

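I don't want comments that have already had their name replaced with the number to go into the final list again. So my expected output looks like this: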

[{'comment': '1330, getting no points', 'post_id': '6'}, {'comment': '0001,played in this game? Didn\'t notice him', 'post_id': '6'}, {'comment': 'Love this shot', 'post_id': '6'}]

I have looked into using an iterator by turning y into an iter_list, but I did not get anywhere. Any help would be appreciated. Thanks!

【Question comments】:

    Tags: python list loops if-statement iterator


    【Solution 1】:

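    Not sure why you are doing a regex substitution, since you already check with in whether the first name / full name is present. I'm also not sure what fuzz.ratio(item, item) in case 3 is supposed to do, but you can do a simple/naive replacement like this: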

    #!/usr/bin/python

    def replace_names(authors, com):
        """Return one dict per comment, replacing the first matching author's name with their id."""
        res = []
        # Comments are iterated in the outer loops, so each one is appended exactly once.
        for post_id, comments in com.items():
            for comment in comments:
                for author_id, author in authors.items():
                    first_name, last_name = author[0], author[1]
                    full_name = first_name + ' ' + last_name
                    # Try the full name first so 'John Gold' does not become '1330 Gold'.
                    if full_name in comment:
                        comment = comment.replace(full_name, author_id)
                        break
                    elif first_name in comment:
                        comment = comment.replace(first_name, author_id)
                        break
                res.append({'post_id': post_id, 'comment': comment})
        return res
    
    a = {'1330': ('John', 'Gold', '1330'), "0001":('Matt', 'Wade', '0001'), '2112': ('Bob', 'Smith', '2112')}
    com = {'6':['John Gold, getting no points', 'Matt played in this game? Didn\'t notice him','Love this shot!']}
    for comment in replace_names(a, com):
        print(comment)  # print() with a single argument works on both Python 2 and 3
    

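    This produces the following output (shown as printed by Python 2; on Python 3.7+ the keys appear in insertion order, with post_id first):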

    {'comment': '1330, getting no points', 'post_id': '6'}
    {'comment': "0001 played in this game? Didn't notice him", 'post_id': '6'}
    {'comment': 'Love this shot!', 'post_id': '6'}
    

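    It is a bit tricky to understand the intent of the original code, but (one of) the reasons you get duplicates is that you process the authors in the outer loop, which means you process every comment once per author. By swapping the loops you make sure each comment is processed only once.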
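
    To make the effect of loop order concrete, here is a minimal sketch. The data and the two function names (authors_outer, comments_outer) are made up for illustration and are not taken from the question; the sketch only does plain full-name replacement.

```python
# Illustrative data only (hypothetical, smaller than the question's).
authors = {'1330': ('John', 'Gold'), '0001': ('Matt', 'Wade')}
comments = ['John Gold, getting no points', 'Love this shot!']


def authors_outer(authors, comments):
    """Author loop outermost: every comment is appended once per author."""
    out = []
    for author_id, (first, last) in authors.items():
        for comment in comments:
            out.append(comment.replace(first + ' ' + last, author_id))
    return out


def comments_outer(authors, comments):
    """Comment loop outermost: every comment is appended exactly once."""
    out = []
    for comment in comments:
        for author_id, (first, last) in authors.items():
            comment = comment.replace(first + ' ' + last, author_id)
        out.append(comment)
    return out


print(len(authors_outer(authors, comments)))   # 4 entries: 2 authors x 2 comments
print(len(comments_outer(authors, comments)))  # 2 entries: one per comment
```

    With the author loop outermost, the result grows with the number of authors, which matches the nine entries (3 authors x 3 comments) in the question's output.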

    You may also have meant to use break where you have continue, but I'm not entirely sure I understand how your original code is supposed to work.
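
    As a reminder of the difference, the sketch below records which names a loop visits when it uses break versus continue. The function names and data are hypothetical. Note that in the question's code the continue appears to sit at the end of its branch, so it likely has no effect there at all.

```python
def visit_with_break(names, text):
    """Stop looking at further names as soon as one is found in text."""
    visited = []
    for name in names:
        visited.append(name)
        if name in text:
            break  # leaves the loop entirely
    return visited


def visit_with_continue(names, text):
    """continue only skips to the next iteration; the loop keeps going."""
    visited = []
    for name in names:
        visited.append(name)
        if name in text:
            continue  # nothing follows in the body, so this changes nothing
    return visited


names = ['John', 'Matt', 'Bob']
text = 'John Gold, getting no points'
print(visit_with_break(names, text))     # ['John']
print(visit_with_continue(names, text))  # ['John', 'Matt', 'Bob']
```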

    The use of global variables is also a bit confusing.
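
    One way a module-level list can cause surprises, sketched with hypothetical names (results, collect_global, collect_local): anything appended to it survives between calls, so calling the function twice doubles the entries, whereas a function that builds and returns its own list starts fresh every time.

```python
results = []  # module-level list, playing the role of comments_table


def collect_global(items):
    """Appends to the shared list; entries accumulate across calls."""
    for item in items:
        results.append(item)


def collect_local(items):
    """Builds and returns a new list on every call."""
    out = []
    for item in items:
        out.append(item)
    return out


collect_global(['a', 'b'])
collect_global(['a', 'b'])
print(results)                    # ['a', 'b', 'a', 'b']
print(collect_local(['a', 'b']))  # ['a', 'b']
```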

    【Comments】:
