How to get the semantic meaning of a word/phrase: answers

【Question title】: How to get the semantic meaning of a word/phrase
【Posted】: 2020-10-09 02:53:57
【Question description】:

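Hi all, I need some help. I'm working on a project in which I have to find the semantic meaning of a word/phrase. For example, "hi", "hello" and "good morning" should all return "greeting", and so on. Any suggestions? Thanks in advance.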

【Question comments】:

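Welcome to Stack Overflow. What have you tried so far?
I think you should search some more on Python's NLTK (Natural Language Toolkit). I'm sure you will find it there; I've done this before and I'll post an update.
This is not semantics but pragmatics: the function of a phrase in a conversation. In principle this is a hard problem unless you work in a restricted domain. I would say it is impossible in general, because there is no scheme that encodes the meaning of a phrase (semantics) or its function (pragmatics).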

Tags: python nlp

【Solution 1】:

Your question is a bit vague, but here are two ideas that might help:

1. WordNet

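WordNet is a lexical database that provides synonyms, categorizations and, to some extent, the "semantics" of English words. The database can be explored through its web interface, or used from Python via NLTK.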

Example:

# One-time setup if the corpus is missing: import nltk; nltk.download('wordnet')
from nltk.corpus import wordnet as wn

# Get all possible meanings of a word, e.g. "welcome" has two meanings as a noun, three as a verb and one as an adjective
wn.synsets('welcome')
# output: [Synset('welcome.n.01'), Synset('welcome.n.02'), Synset('welcome.v.01'), Synset('welcome.v.02'), Synset('welcome.v.03'), Synset('welcome.a.01')]

# get the definition of one of these meanings:
wn.synset('welcome.n.02').definition()
# output: 'a greeting or reception'

# get the hypernym of the specific meaning, i.e. the more abstract category it belongs to
wn.synset('welcome.n.02').hypernyms()
# output: [Synset('greeting.n.01')]
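
The hypernym lookup above can be turned into a yes/no check by walking up the hypernym chain until a target category is reached. The following is a minimal sketch and not part of the original answer: the helper name `has_hypernym` and the `max_depth` cut-off are assumptions made for illustration. It relies only on the `name()` and `hypernyms()` methods that NLTK's `Synset` objects provide.

```python
from collections import deque


def has_hypernym(synset, target_name, max_depth=10):
    """Return True if `target_name` is the synset itself or one of its ancestors.

    `synset` is expected to behave like an NLTK Synset, i.e. to offer
    `name()` and `hypernyms()`. The function name and `max_depth` are
    illustrative choices, not part of NLTK.
    """
    queue = deque([(synset, 0)])
    seen = set()
    while queue:
        current, depth = queue.popleft()
        if current.name() == target_name:
            return True
        # Stop expanding at the depth limit or when a node was already visited.
        if depth >= max_depth or current.name() in seen:
            continue
        seen.add(current.name())
        for parent in current.hypernyms():
            queue.append((parent, depth + 1))
    return False


# Possible usage with NLTK (requires the WordNet corpus to be downloaded):
# from nltk.corpus import wordnet as wn
# has_hypernym(wn.synset('welcome.n.02'), 'greeting.n.01')
```

Given the output shown above, where `greeting.n.01` is the direct hypernym of `welcome.n.02`, the commented call should return True. A word usually has several synsets, so in practice you would probably run the check over every entry returned by `wn.synsets(word)`.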

2. Zero-shot classification

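Hugging Face Transformers and zero-shot classification: you can also use a pretrained deep learning model to classify your text. In this case, you need to manually create labels for all the different meanings you are looking for in your texts, for example ["greeting", "insult", "congratulation"]. You can then use the deep learning model to predict which label ("semantics" in a broad sense) fits your text best.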

Example:

# pip install transformers==3.1.0  # pip install in terminal
from transformers import pipeline

classifier = pipeline("zero-shot-classification")

sequence = "Hi, I welcome you to this event"
candidate_labels = ["greeting", "insult", "congratulation"]

classifier(sequence, candidate_labels)

# output: {'sequence': 'Hi, I welcome you to this event',
# 'labels': ['greeting', 'congratulation', 'insult'],
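# 'scores': [0.9001138210296631, 0.09858417510986328, 0.001302019809372723]}

=> Each of your labels receives a score, and the label with the highest score is taken as the "semantics" of your text.

There is an interactive web application that shows what the library does without any coding, and a Jupyter notebook that demonstrates how to use it in Python. You can copy and paste the code from the notebook.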
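
Picking the winning label out of the result dictionary can be sketched as follows. This is only an illustration: the helper name `best_label` and the `min_score` threshold of 0.5 are assumptions, not part of the Transformers library. The threshold lets you return nothing when the model is not confident about any label.

```python
def best_label(result, min_score=0.5):
    """Return the highest-scoring label, or None if no score reaches `min_score`.

    `result` is a dictionary with parallel "labels" and "scores" lists,
    shaped like the zero-shot example output above.
    """
    label, score = max(
        zip(result["labels"], result["scores"]),
        key=lambda pair: pair[1],
    )
    return label if score >= min_score else None


# Result dictionary copied from the example output above.
result = {
    "sequence": "Hi, I welcome you to this event",
    "labels": ["greeting", "congratulation", "insult"],
    "scores": [0.9001138210296631, 0.09858417510986328, 0.001302019809372723],
}

print(best_label(result))  # greeting
```

Raising `min_score` makes the helper stricter: with the scores above, a threshold of 0.95 would yield None instead of "greeting".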

【Comments】:

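Hello, I'm not the OP, but I wanted to comment to say thank you. I'm currently trying to implement some kind of semantic analysis to identify whether a sentence is analytic, contradictory, or synthetic. Zero-shot learning and WordNet, along with the PropBank corpus for annotation, are all good starting points. Do you have any suggestions on how to identify these sentences, @Moritz?
What are the classes "analytic, contradictory, synthetic" based on? Are they based on specific literature and/or a training dataset, or are they categories you came up with for your use case?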

【Solution 2】:

You haven't shown any effort to write your own code, but here is a small example.

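# Known greeting phrases, all lowercase
words = ['hello', 'hi', 'good morning']
x = input('Word here: ')
# Normalize case and surrounding whitespace before the lookup
if x.strip().lower() in words:
    print('Regards')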
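
The same lookup idea can be extended to several categories with a dictionary. The sketch below is an illustration only: the table `PHRASE_CATEGORIES`, its "farewell" entries and the function name `categorize` are hypothetical and not from the original answer.

```python
# Hypothetical phrase-to-category table; extend it for your own domain.
PHRASE_CATEGORIES = {
    "greeting": {"hello", "hi", "good morning"},
    "farewell": {"bye", "goodbye", "see you"},
}


def categorize(text):
    """Return the category whose phrase set contains the normalized text, else None."""
    # Lowercase, collapse repeated whitespace, and drop surrounding punctuation.
    normalized = " ".join(text.lower().split()).strip("!.,? ")
    for category, phrases in PHRASE_CATEGORIES.items():
        if normalized in phrases:
            return category
    return None


print(categorize("  Good   Morning! "))  # greeting
print(categorize("what time is it"))     # None
```

An exact-match table like this only covers phrases you list by hand, which is why it works best in a restricted domain.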

【Comments】: