R sentiment analysis applied to a whole column: answers

【Question Title】: R sentiment analysis applied to a whole column
【Posted】: 2020-03-10 11:53:00
【Question】:

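I have a data frame of tweets. A given tweet can contain several sentences. When I use the sentiment() function from sentimentr, it returns one score per sentence, like this: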

sentiment(as.character(tweets$text[1]))$sentiment
>>> [1] 0.2474874 0.0000000

But if I want a single score for the whole tweet, I can approximate that by taking the mean of the sentence scores:

mean(sentiment(as.character(tweets$text[1]))$sentiment)
>>>[1] 0.1237437

So I figured I could apply the same logic to the whole data frame:

tweets$sentiment <- mean(sentiment(as.character((tweets$text)))$sentiment)

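But... this returns the same value for every tweet, because mean() collapses the sentence scores of all tweets into a single number, which is then recycled down the column. If I drop mean(), I get NULL, because there are too many sentences/scores to unpack into one value per row.

How can I assign one value to each row of the data frame?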

【Comments】:

Tags: r dataframe sentimentr

【Solution 1】:

We can use sapply to apply the sentiment function to each text separately, taking the mean of the sentence scores for that text.

library(sentimentr)

tweets$text <- as.character(tweets$text)
tweets$sentiment_score <- sapply(tweets$text, function(x) 
                             mean(sentiment(x)$sentiment))
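
As a possible alternative to calling sentiment() once per row, the sketch below scores the whole column in one call and then averages by element_id, the column sentiment() uses to record which input element each sentence came from. The small tweets data frame here is made-up stand-in data, not the asker's; treat this as a minimal sketch rather than the definitive approach.

```r
library(sentimentr)

# Hypothetical stand-in for the asker's `tweets` data frame.
tweets <- data.frame(
  text = c("I love this. It is great!",
           "This is terrible.",
           "Not sure. Maybe it works? We will see."),
  stringsAsFactors = FALSE
)

# Split and score every sentence of every tweet in a single call.
scores <- sentiment(get_sentences(tweets$text))

# element_id is the index of the source tweet, so a grouped mean
# yields one value per row of `tweets`, in row order.
tweets$sentiment_score <- as.numeric(
  tapply(scores$sentiment, scores$element_id, mean)
)
```

On a large column this avoids repeating the sentence-splitting setup for every row, while the per-tweet values should match the sapply version.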

【Comments】:

Works like a charm. I'm very new to R, thanks!

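【Solution 2】:

If you prefer the sentimentr/tidy way, you can do the following. get_sentences() splits each tweet into sentences. Then you use sentiment_by(). Here I use id as the grouping variable and get the average sentiment score for each tweet.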

library(sentimentr)  # get_sentences(), sentiment_by()
library(magrittr)    # %$% exposition pipe
library(dplyr)       # tibble(), mutate()

mytweets <- tibble(id = 1:3,
                   mytext = c("do you like it?  But I hate really bad dogs",
                              "I think the sentimentr package is great. But I need to learn how to use it",
                              "Do you like data science? I do!"))

mutate(mytweets,
      sentence_split = get_sentences(mytext)) %$%
sentiment_by(sentence_split, list(id))

   id word_count        sd ave_sentiment
1:  1         10 1.4974654    -0.8088680
2:  2         16 0.2906334     0.3944911
3:  3          7 0.1581139     0.1220192
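
The result above is a separate summary table. To store the scores as a column of the original data, one option is sketched below. It assumes sentiment_by() with no grouping variable returns one row per input element, in input order (grouped by element_id). Note that the default averaging function of sentiment_by() is average_downweighted_zero, not a plain mean, so the sketch passes average_mean to stay in line with the mean() used in the question. The column name ave_sentiment on mytweets is just an illustrative choice.

```r
library(sentimentr)
library(dplyr)

mytweets <- tibble(id = 1:3,
                   mytext = c("do you like it?  But I hate really bad dogs",
                              "I think the sentimentr package is great. But I need to learn how to use it",
                              "Do you like data science? I do!"))

# No `by` argument: scores are aggregated per input element (element_id).
# average_mean requests a plain arithmetic mean of the sentence scores.
by_tweet <- sentiment_by(get_sentences(mytweets$mytext),
                         averaging.function = average_mean)

# Attach the per-tweet score as a new column (assumes input order is kept).
mytweets <- mutate(mytweets, ave_sentiment = by_tweet$ave_sentiment)
```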

【Comments】: