【Question title】: How to amend this code slightly to produce correct word cloud in R?
【Posted】: 2021-01-08 14:31:38
【Question】:

Suppose we have a data frame (df) containing comments (each row is one comment):

comment
Amazing job
Terrible work

We also have a dictionary (dict) of positive and negative words:

positive negative
amazing  terrible

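I am trying to create two word clouds: one of the positive comments in df and one of the negative comments in df. To do so, I tried the following code, but ran into a problem. Can anyone suggest a solution?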

library("quanteda")

corpus_example <- corpus(df)
head(corpus_example)

Output:

text1:
"Amazing job"

text2:
"Terrible work"

Next, create the dfm:

comments_dfm <- dfm(corpus_example, dictionary = dict)
head(comments_dfm)

Output:
      positive negative
text1 1        0
text2 0        1

That is, it shows how many positive and negative words (according to dict) occur in text1 and text2. text1 is considered positive and text2 is considered negative.

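Finally, I tried to create a word cloud with textplot_wordcloud(comments_dfm), but this just returns a word cloud containing the column headers of comments_dfm, i.e. the words positive and negative. Instead, I want two word clouds: one containing Amazing job (because it is considered a positive comment) and another containing Terrible work (because it is a negative comment).

Does anyone know how to solve this?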

【Comments】:

Tags: r sentiment-analysis word-cloud

【Answer 1】:

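A few things:

The reason positive and negative appear is that you have "mapped" Amazing job and Terrible work to those respective categories. A dictionary maps the original text onto categories so that the data can be interpreted in a meaningful way (for example, analysing word frequencies to gauge sentiment).
However, I don't think you need quanteda at all. See the positive word cloud example below.
Since you want to keep the phrases, use table; see Creating "word" cloud of phrases, not individual words in R.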
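
To make the first point concrete, here is a rough base-R illustration of what a dictionary lookup does conceptually. It is only a sketch and not quanteda's actual implementation; the object names comments, tokens and counts are made up for this example, and the tokenisation (lower-casing and splitting on whitespace) is a simplifying assumption.

```r
# Two example comments and a tiny dictionary, as in the question
comments <- c(text1 = "Amazing job", text2 = "Terrible work")
dict <- list(positive = "amazing", negative = "terrible")

# Simplified tokenisation: lower-case, then split on whitespace
tokens <- strsplit(tolower(comments), "\\s+")

# For each comment, count how many of its words fall into each category
counts <- t(sapply(tokens, function(words) {
  sapply(dict, function(entries) sum(words %in% entries))
}))

print(counts)
```

The resulting matrix has only the columns positive and negative. The original words are gone, which is why a word cloud built from such an object can only show those two labels.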

library(wordcloud)

df <- data.frame(comment = c("Amazing job",
                             "Terrible work",
                             "Great job",
                             "Great job",
                             "Great job",
                             "Great job",
                             "Fantastic job",
                             "Fantastic job",
                             "Fantastic job",
                             "Amazing job",
                             "Amazing job",
                             "Terrible work",
                             "Terrible work"))

dict <- list(
  Positive = c("Amazing","Great","Fantastic"),
  Negative = c("Terrible","Bad","Suck")
)

# finds positive comments / negative comments depending on input
find_matches <- function(comments,dictionary){
  comments[grepl(paste(dictionary,collapse = "|"),
                 comments,
                 ignore.case = TRUE)]
}

# Since you want phrases, using table
positive_table <- table(find_matches(df$comment, dict$Positive))
wordcloud::wordcloud(
  names(positive_table),
  as.numeric(positive_table),
  scale = c(2, 1),
  min.freq = 3,
  max.words = 100,
  random.order = TRUE
)
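
The question asked for two clouds, so the negative one can presumably be built the same way. The following is a minimal sketch that repeats the sample data and the find_matches helper so that it runs on its own. Two things are assumptions of this sketch rather than part of the answer: min.freq is lowered to 1 because the sample data contains only a single negative phrase, and the plotting call is guarded so the block still runs when the wordcloud package is not installed.

```r
# Sample data and dictionary repeated from the answer (same counts per phrase)
df <- data.frame(
  comment = c(rep("Amazing job", 3), rep("Great job", 4),
              rep("Fantastic job", 3), rep("Terrible work", 3)),
  stringsAsFactors = FALSE
)

dict <- list(
  Positive = c("Amazing", "Great", "Fantastic"),
  Negative = c("Terrible", "Bad", "Suck")
)

# Returns the comments that contain at least one of the dictionary words
find_matches <- function(comments, dictionary) {
  comments[grepl(paste(dictionary, collapse = "|"),
                 comments,
                 ignore.case = TRUE)]
}

# Tabulate whole phrases so the cloud shows phrases rather than single words
negative_table <- table(find_matches(df$comment, dict$Negative))
print(negative_table)

# Only plot when the wordcloud package is available
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(
    names(negative_table),
    as.numeric(negative_table),
    scale = c(2, 1),
    min.freq = 1,      # assumption: lowered so a single phrase is still drawn
    max.words = 100,
    random.order = TRUE
  )
}
```

Note that find_matches matches substrings case-insensitively, so a dictionary entry such as "Bad" would also match a comment containing "badge"; wrapping the entries in word boundaries (\\b) would be one way to tighten this if it matters for your data.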

【Comments】: