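【Posted】: 2018-09-24 15:00:34
【Question】:
I found a very nice R script that computes a sentiment score for each sentence, available at: sentiment.R. I would like to know how to replace this part: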
# split into words. str_split is in the stringr package
word.list = str_split(sentence, '\\s+')
# sometimes a list() is one level of hierarchy too much
words = unlist(word.list)
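The goal is to match multi-word phrases against the pos and neg dictionaries, which themselves contain multi-word entries. An example follows.
I have the following data.frame: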
sent <- data.frame(words = c("just right size", "love this quality",
"good quality", "very good quality", "i hate this notebook",
"great improvement", "notebook is not good","notebook was"), user = c(1,2,3,4,5,6,7,8))
                 words user
1      just right size    1
2    love this quality    2
3         good quality    3
4    very good quality    4
5 i hate this notebook    5
6    great improvement    6
7 notebook is not good    7
8         notebook was    8
Then I have dictionaries of positive and negative words:
posWord <- c("great","improvement","love","great improvement","very good","good","right","very")
negWords <- c("hate","bad","not good","horrible")
The desired output is:
                 words user SentimentScore
1      just right size    1              1
2    love this quality    2              1
3         good quality    3              1
4    very good quality    4              1
5 i hate this notebook    5             -1
6    great improvement    6              1
7 notebook is not good    7             -1
8         notebook was    8              0
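How should I rewrite the code from GitHub to get the desired output? If I use the GitHub source as is, then row 4, for example, gets 2 in the SentimentScore column instead of 1, because "very" and "good" are each counted as separate positive words.
Does anyone have a suggestion or a similar solution? I would appreciate any help. Thank you very much.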
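One possible way to get the desired column is to match whole dictionary entries instead of single words, trying the longest entries first and removing each matched phrase so that shorter entries cannot count it again. The following is only a sketch under assumptions, not the code from sentiment.R: `score_phrases` is a hypothetical helper name, it uses base R rather than stringr, and it assumes the dictionary entries contain no regex metacharacters.

```r
# Data and dictionaries copied from the question.
sent <- data.frame(
  words = c("just right size", "love this quality", "good quality",
            "very good quality", "i hate this notebook",
            "great improvement", "notebook is not good", "notebook was"),
  user = c(1, 2, 3, 4, 5, 6, 7, 8),
  stringsAsFactors = FALSE
)
posWord  <- c("great", "improvement", "love", "great improvement",
              "very good", "good", "right", "very")
negWords <- c("hate", "bad", "not good", "horrible")

# Hypothetical helper (not part of sentiment.R): scores one sentence by
# matching the longest dictionary entries first.
score_phrases <- function(sentence, pos, neg) {
  dict <- data.frame(
    term   = c(pos, neg),
    weight = c(rep(1, length(pos)), rep(-1, length(neg))),
    stringsAsFactors = FALSE
  )
  # Sort entries by word count, longest first, so "very good" is tried
  # before "very" and "good".
  n_words <- lengths(strsplit(dict$term, "\\s+"))
  dict <- dict[order(-n_words), ]

  text  <- tolower(sentence)
  score <- 0
  for (i in seq_len(nrow(dict))) {
    # Assumption: terms are plain words, so they are safe inside a regex.
    pattern <- paste0("\\b", dict$term[i], "\\b")
    hits <- gregexpr(pattern, text, perl = TRUE)[[1]]
    n <- sum(hits > 0)  # gregexpr returns -1 when nothing matches
    if (n > 0) {
      score <- score + n * dict$weight[i]
      # Blank out the matched phrase so shorter entries cannot reuse it.
      text <- gsub(pattern, " ", text, perl = TRUE)
    }
  }
  score
}

sent$SentimentScore <- vapply(sent$words, score_phrases, numeric(1),
                              pos = posWord, neg = negWords,
                              USE.NAMES = FALSE)
print(sent)
```

On the sample data this should give 1, 1, 1, 1, -1, 1, -1, 0: "very good" and "great improvement" each count once, "not good" is consumed before "good" can match, and a sentence with no dictionary hit keeps the initial score of zero.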
【Comments】:
-
OK, this really is the perfect solution :-) Sorry, but I have updated the task... What if row eight has no match and the SentimentScore result should be zero?
-
If there is no match, the SentimentScore will be zero.
Tags: r