【Posted】:2014-12-21 16:23:52
【Question】:
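I want to create a sentiment analysis program that takes in a Chinese dataset and determines whether it contains more positive, negative, or neutral statements. Following this example, I built a sentiment analysis for English (stanford-corenlp) that does exactly what I want, but I need it to work on Chinese.
The code in question: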
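// Package paths may differ between CoreNLP releases.
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.CoreMap;

Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
// other annotators: gender, lemma, ner, parse, pos, sentiment, ssplit, tokenize
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
// read some text in the text variable
String sentimentText = "Fun day, isn't it?";
// index 0..4 matches the predicted class returned by the sentiment model
String[] ratings = {"Very Negative", "Negative", "Neutral", "Positive", "Very Positive"};
Annotation annotation = pipeline.process(sentimentText);
for (CoreMap sentence : annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
    Tree tree = sentence.get(SentimentCoreAnnotations.AnnotatedTree.class);
    int score = RNNCoreAnnotations.getPredictedClass(tree);
    System.out.println("sentence:'" + sentence + "' has a score of " + (score - 2) + " rating: " + ratings[score]);
    System.out.println(tree);
}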
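The snippet above only prints a score per sentence; to answer the dataset-level question (more positive, negative, or neutral statements?) the scores still have to be tallied. That part does not depend on the language, so here is a minimal sketch of what I have in mind. The class `SentimentTally`, its method names, and the sample scores are all made up for illustration and are not part of CoreNLP; it assumes the five classes 0..4 from the `ratings` array, collapsed into three buckets.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Stanford CoreNLP.
class SentimentTally {
    // Coarse buckets, indexed 0..2.
    static final String[] BUCKETS = {"Negative", "Neutral", "Positive"};

    /** Collapses a 5-way class (0..4) into 0 = negative, 1 = neutral, 2 = positive. */
    static int bucketOf(int predictedClass) {
        if (predictedClass < 0 || predictedClass > 4) {
            throw new IllegalArgumentException("class out of range: " + predictedClass);
        }
        if (predictedClass < 2) return 0;   // Very Negative, Negative
        if (predictedClass == 2) return 1;  // Neutral
        return 2;                           // Positive, Very Positive
    }

    /** Counts how many sentences fall into each bucket. */
    static int[] tally(int[] predictedClasses) {
        int[] counts = new int[BUCKETS.length];
        for (int c : predictedClasses) {
            counts[bucketOf(c)]++;
        }
        return counts;
    }

    /** Returns the bucket with the strictly highest count, or "Tie" if it is shared. */
    static String majority(int[] counts) {
        int best = 0;
        boolean tie = false;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
                tie = false;
            } else if (counts[i] == counts[best]) {
                tie = true;
            }
        }
        return tie ? "Tie" : BUCKETS[best];
    }

    public static void main(String[] args) {
        // Made-up per-sentence classes, standing in for getPredictedClass(tree) results.
        int[] predicted = {3, 4, 2, 1, 3};
        int[] counts = tally(predicted);
        System.out.println(Arrays.toString(counts) + " -> " + majority(counts));
        // prints "[1, 1, 3] -> Positive"
    }
}
```

In the real program the `predicted` array would be filled inside the sentence loop instead of being hard-coded.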
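At the moment I have no idea how to change the code above so that it supports Chinese. I downloaded the Chinese parser and segmenter and looked at the demos, but after several days of trying I have gotten nowhere. I have also read http://nlp.stanford.edu/software/corenlp.shtml, and the English version is really useful. Are there any e-books, tutorials, or examples that could help me understand how Chinese sentiment analysis works in Stanford NLP?
Thanks in advance!
PS: I only started with Java recently, so please forgive me if I have said or done something wrong.
What I have researched: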
How to parse languages other than English with Stanford Parser? in java, not command lines
【Comments】:
Tags: java dataset stanford-nlp sentiment-analysis