【Question title】: python map each word to its own text
【Posted】: 2022-01-01 13:28:10
【Question description】:

I have a list of words like this:

 word_list=[{"word": "python",
    "repeted": 4},
    {"word": "awsome",
    "repeted": 3},
    {"word": "frameworks",
    "repeted": 2},
    {"word": "programing",
    "repeted": 2},
    {"word": "stackoverflow",
    "repeted": 2},
    {"word": "work",
    "repeted": 1},
    {"word": "error",
    "repeted": 1},
    {"word": "teach",
    "repeted": 1}
    ]

It was built from another list, a list of notes:

note_list = [{"note_id":1,
"note_txt":"A curated list of awesome Python frameworks"},
{"note_id":2,
"note_txt":"what is awesome Python frameworks"},
{"note_id":3,
"note_txt":"awesome Python is good to wok with it"},
{"note_id":4,
"note_txt":"use stackoverflow to lern programing with python is awsome"},
{"note_id":5,
"note_txt":"error in programing is good to learn"},
{"note_id":6,
"note_txt":"stackoverflow is very useful to share our knoloedge"},
{"note_id":7,
"note_txt":"teach, work"},
  ]

I would like to know how to map each word to the notes it appears in:

maped_list=[{"word": "python",
        "notes_ids": [1,2,3,4]},
        {"word": "awsome",
        "notes_ids": [1,2,3]},
        {"word": "frameworks",
        "notes_ids": [1,2]},
        {"word": "programing",
        "notes_ids": [4,5]},
        {"word": "stackoverflow",
        "notes_ids": [4,6]},
        {"word": "work",
        "notes_ids": [7]},
        {"word": "error",
        "notes_ids": [5]},
        {"word": "teach",
        "notes_ids": [7]}
        ]

My work:

import re

# I started by collecting all the note texts into one list
notes_test = []
for note in note_list:
    notes_test.append(note['note_txt'])
# count the repetitions of each word
dict = {}
for sentence in notes_test:
    for word in re.split(r'\s', sentence):  # split on whitespace
        try:
            dict[word] += 1
        except KeyError:
            dict[word] = 1
# turn the counts into a list of {"word": ..., "repeted": ...} entries
word_list = []
for key in dict.keys():
    word = {}
    word['word'] = key
    word['repeted'] = dict[key]
    word_list.append(word)

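My questions:

  1. How can I map the word list and the note list to get the mapped list?
  2. What do you think of my code quality? Any remarks are welcome.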

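【Question comments】:

  • The way you asked this is confusing. I think what you mean is: "I have a list of notes, and I need to compute the frequency of each word together with the list of notes it appears in". Is that right?
  • Yes, something like that. I have managed to compute the frequency, but I am stuck on the list of notes.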

Tags: python dictionary mapping


【Solution 1】:

You can use a list comprehension:

mapped_list = [{"word": w_dict["word"],
                "notes_ids": [n_dict["note_id"] for n_dict in note_list
                              if w_dict["word"].lower() in n_dict["note_txt"].lower()]
                } for w_dict in word_list]

The result is:

[{'word': 'python', 'notes_ids': [1, 2, 3, 4]},
 {'word': 'awsome', 'notes_ids': [4]},
 {'word': 'frameworks', 'notes_ids': [1, 2]},
 {'word': 'programing', 'notes_ids': [4, 5]},
 {'word': 'stackoverflow', 'notes_ids': [4, 6]},
 {'word': 'work', 'notes_ids': [1, 2, 7]},
 {'word': 'error', 'notes_ids': [5]},
 {'word': 'teach', 'notes_ids': [7]}]
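
Note that `in` applied to two strings is a substring test, which is why 'work' also picks up notes 1 and 2 (it occurs inside 'frameworks'). If whole-word matching is wanted instead, one option is to split each note into words first. The following is only a sketch under that assumption: the helper name `tokenize` and the `\w+` pattern are illustrative choices that do not come from the answer above, and the data is shortened.

```python
import re

word_list = [{"word": "python", "repeted": 4},
             {"word": "work", "repeted": 1},
             {"word": "teach", "repeted": 1}]

note_list = [{"note_id": 1, "note_txt": "A curated list of awesome Python frameworks"},
             {"note_id": 3, "note_txt": "awesome Python is good to wok with it"},
             {"note_id": 7, "note_txt": "teach, work"}]


def tokenize(text):
    """Hypothetical helper: lowercase the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


# Tokenize every note once, so each lookup is a set membership test.
note_tokens = [(n["note_id"], tokenize(n["note_txt"])) for n in note_list]

mapped_list = [{"word": w["word"],
                "notes_ids": [note_id for note_id, tokens in note_tokens
                              if w["word"].lower() in tokens]}
               for w in word_list]

print(mapped_list)
```

With this variant 'work' is matched only in note 7, because 'frameworks' is a different token.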

【Comments】:

  • Thank you. Could you explain it in more detail?
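【Solution 2】:
  1. Try building maped_list while you build the dictionary, adding each word's index as you iterate.
  2. Do not use dict as a variable name. It is not a reserved keyword, but it is the name of Python's built-in type for creating dicts, as in dict(), and assigning to it shadows that built-in. Also, your input contains no whitespace other than spaces, so you can use sentence.split(). Another thing you can do is convert all words to lowercase, so that capitalization makes no difference.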
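
The advice in point 2 could look roughly like this. It is a sketch that assumes `collections.Counter` from the standard library is acceptable for the counting step, which the answer itself does not mention; the name `word_counts` and the shortened `notes` list are illustrative.

```python
from collections import Counter

notes = ["awesome Python is good to wok with it",
         "error in programing is good to learn"]

word_counts = Counter()
for sentence in notes:
    # str.split() with no argument splits on any run of whitespace;
    # lower() makes "Python" and "python" count as the same word.
    word_counts.update(sentence.lower().split())

# Same shape as the word_list in the question.
word_list = [{"word": word, "repeted": count}
             for word, count in word_counts.items()]
```

Because the counter is not called `dict`, the built-in stays available in the rest of the script.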

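【Comments】:

  • Thanks for your answer. I am fine with number 2, but I could not follow the first point.
  • Try creating an empty maped_list before the loop. When you add a word to dict, add it to maped_list as well. For example, when you are at note id 2 and the first word is 'what', after adding it to dict, also add it to maped_list and then append the note_id to its notes_ids entry. If my answer helped you, please upvote it; I am trying to build my reputation.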
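
One possible reading of this suggestion is to record the note id at the same moment a word is counted. The sketch below assumes lowercase matching and plain whitespace splitting; the names `counts` and `note_ids_by_word` are hypothetical, and a dictionary keyed by word is used as the intermediate structure instead of appending to the list directly.

```python
note_list = [{"note_id": 5, "note_txt": "error in programing is good to learn"},
             {"note_id": 6, "note_txt": "stackoverflow is very useful to share our knoloedge"}]

counts = {}            # word -> number of occurrences
note_ids_by_word = {}  # word -> ids of the notes containing it

for note in note_list:
    for word in note["note_txt"].lower().split():
        counts[word] = counts.get(word, 0) + 1
        ids = note_ids_by_word.setdefault(word, [])
        # Avoid listing the same note twice when a word repeats within it.
        if note["note_id"] not in ids:
            ids.append(note["note_id"])

maped_list = [{"word": word, "notes_ids": ids}
              for word, ids in note_ids_by_word.items()]
```

Plain `split()` leaves punctuation attached, so a note such as "teach, work" would yield the token 'teach,'; stripping punctuation first would be needed to handle that case.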