【Question Title】: Sentiment Analysis (SentiWordNet) - Judging the context of a sentence
【Posted】: 2013-03-17 17:34:54
【Question】:

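I am trying to determine whether a sentence is positive or negative using the following steps:

1.) Use the Stanford NLP parser to retrieve the parts of speech (verbs, nouns, adjectives, etc.) from the sentence.

2.) Use SentiWordNet to look up the positive and negative values associated with each part of speech.

3.) Sum the positive and negative values obtained to compute the net positive and net negative values for the sentence.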
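
The three steps above can be sketched roughly as follows. This is a minimal Python sketch, not the Stanford parser or the SentiWordNet API: the `TOY_SCORES` table, its values, and the `sentence_scores` helper are all hypothetical, and the POS-tagged input is assumed to have been produced already by a tagger.

```python
# Toy stand-in for SentiWordNet: (word, pos) -> (positive, negative).
# These scores are invented for illustration; they are not real SentiWordNet values.
TOY_SCORES = {
    ("good", "a"): (0.75, 0.0),
    ("movie", "n"): (0.0, 0.0),
    ("boring", "a"): (0.0, 0.5),
}


def sentence_scores(tagged_tokens, scores=TOY_SCORES):
    """Sum the positive and negative scores over (word, pos) pairs.

    Pairs missing from the table contribute nothing (treated as neutral).
    """
    net_pos = 0.0
    net_neg = 0.0
    for word, pos in tagged_tokens:
        p, n = scores.get((word.lower(), pos), (0.0, 0.0))
        net_pos += p
        net_neg += n
    return net_pos, net_neg


def label(net_pos, net_neg):
    """Compare the two totals to get a coarse sentence label."""
    if net_pos > net_neg:
        return "positive"
    if net_neg > net_pos:
        return "negative"
    return "neutral"


tagged = [("good", "a"), ("movie", "n")]
net_pos, net_neg = sentence_scores(tagged)
print(net_pos, net_neg, label(net_pos, net_neg))
```

This sketch assumes one score pair per word and part of speech, which is exactly the simplification the rest of the question is about.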

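The problem, however, is that SentiWordNet returns a list of positive/negative values, one pair for each sense/context of the word. Is it possible to pass the specific sentence along with the part of speech to the SentiWordNet parser, so that it can determine the sense/context automatically and return only a single pair of positive and negative values?

Or is there an alternative solution to this problem?

Thanks.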

【Comments】:

    Tags: nlp stanford-nlp wordnet sentiment-analysis


    【Answer 1】:

    We can pass the POS to the SentiWordNet parser. Download the Pattern Python module:

    from pattern.en import wordnet

    # pos="VB" restricts the lookup to verb senses; [0] takes the first synset.
    print(wordnet.synsets("kill", pos="VB")[0].weight)
    

    wordnet.synsets returns a list of synsets, from which we pick the first item. The output is a (polarity, subjectivity) tuple. Hope this helps.
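
To connect this to the parser output from the question, the Penn Treebank tags a POS tagger emits have to be mapped to the part-of-speech letters used in the SentiWordNet file (a, n, v, r) before taking the first sense. The following is a small sketch of that idea only; `TOY_SENSES`, its scores, and both helper functions are hypothetical and do not use the Pattern API.

```python
# Hypothetical per-sense scores, ordered by sense rank (first sense first).
# Each entry is (positive, negative); the numbers are invented for illustration.
TOY_SENSES = {
    ("kill", "v"): [(0.0, 0.5), (0.125, 0.25)],
    ("good", "a"): [(0.75, 0.0), (0.5, 0.0)],
}


def penn_to_swn(tag):
    """Map a Penn Treebank tag to a SentiWordNet POS letter, or None."""
    if tag.startswith("JJ"):
        return "a"  # adjectives: JJ, JJR, JJS
    if tag.startswith("NN"):
        return "n"  # nouns: NN, NNS, NNP, NNPS
    if tag.startswith("VB"):
        return "v"  # verbs: VB, VBD, VBG, VBN, VBP, VBZ
    if tag.startswith("RB"):
        return "r"  # adverbs: RB, RBR, RBS
    return None     # determiners, prepositions, etc. carry no entry


def first_sense_scores(word, penn_tag, senses=TOY_SENSES):
    """Return the (positive, negative) pair of the first sense, or None."""
    pos = penn_to_swn(penn_tag)
    if pos is None:
        return None
    candidates = senses.get((word.lower(), pos))
    if not candidates:
        return None
    return candidates[0]


print(first_sense_scores("kill", "VBD"))
```

Taking the first sense is a heuristic: it ignores the sentence context entirely and simply assumes the first-listed sense is the intended one.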

    【Comments】:

      【Answer 2】:

      The SentiWordNet demo code may help you.

      //    Copyright 2013 Petter Törnberg
      //
      //    This demo code has been kindly provided by Petter Törnberg <pettert@chalmers.se>
      //    for the SentiWordNet website.
      //
      //    This program is free software: you can redistribute it and/or modify
      //    it under the terms of the GNU General Public License as published by
      //    the Free Software Foundation, either version 3 of the License, or
      //    (at your option) any later version.
      //
      //    This program is distributed in the hope that it will be useful,
      //    but WITHOUT ANY WARRANTY; without even the implied warranty of
      //    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
      //    GNU General Public License for more details.
      //
      //    You should have received a copy of the GNU General Public License
      //    along with this program.  If not, see <http://www.gnu.org/licenses/>.
      
      import java.io.BufferedReader;
      import java.io.FileReader;
      import java.io.IOException;
      import java.util.HashMap;
      import java.util.Map;
      
      public class SentiWordNetDemoCode {
      
          private Map<String, Double> dictionary;
      
          public SentiWordNetDemoCode(String pathToSWN) throws IOException {
              // This is our main dictionary representation
              dictionary = new HashMap<String, Double>();
      
              // From String to list of doubles.
              HashMap<String, HashMap<Integer, Double>> tempDictionary = new HashMap<String, HashMap<Integer, Double>>();
      
              BufferedReader csv = null;
              try {
                  csv = new BufferedReader(new FileReader(pathToSWN));
                  int lineNumber = 0;
      
                  String line;
                  while ((line = csv.readLine()) != null) {
                      lineNumber++;
      
                      // If it's a comment, skip this line.
                      if (!line.trim().startsWith("#")) {
                          // We use tab separation
                          String[] data = line.split("\t");
                          String wordTypeMarker = data[0];
      
                          // Example line:
                          // POS ID PosS NegS SynsetTerm#sensenumber Desc
                          // a 00009618 0.5 0.25 spartan#4 austere#3 ascetical#2
                          // ascetic#2 practicing great self-denial;...etc
      
                          // Is it a valid line? Otherwise, throw an exception.
                          if (data.length != 6) {
                              throw new IllegalArgumentException(
                                      "Incorrect tabulation format in file, line: "
                                              + lineNumber);
                          }
      
                          // Calculate synset score as score = PosS - NegS
                          Double synsetScore = Double.parseDouble(data[2])
                                  - Double.parseDouble(data[3]);
      
                          // Get all Synset terms
                          String[] synTermsSplit = data[4].split(" ");
      
                          // Go through all terms of current synset.
                          for (String synTermSplit : synTermsSplit) {
                              // Get synterm and synterm rank
                              String[] synTermAndRank = synTermSplit.split("#");
                              String synTerm = synTermAndRank[0] + "#"
                                      + wordTypeMarker;
      
                              int synTermRank = Integer.parseInt(synTermAndRank[1]);
                              // What we get here is a map of the type:
                              // term -> {score of synset#1, score of synset#2...}
      
                              // Add map to term if it doesn't have one
                              if (!tempDictionary.containsKey(synTerm)) {
                                  tempDictionary.put(synTerm,
                                          new HashMap<Integer, Double>());
                              }
      
                              // Add synset link to synterm
                              tempDictionary.get(synTerm).put(synTermRank,
                                      synsetScore);
                          }
                      }
                  }
      
                  // Go through all the terms.
                  for (Map.Entry<String, HashMap<Integer, Double>> entry : tempDictionary
                          .entrySet()) {
                      String word = entry.getKey();
                      Map<Integer, Double> synSetScoreMap = entry.getValue();
      
                      // Calculate weighted average. Weigh the synsets according to
                      // their rank.
                      // Score = 1/1*first + 1/2*second + 1/3*third ..... etc.
                      // Sum = 1/1 + 1/2 + 1/3 ...
                      double score = 0.0;
                      double sum = 0.0;
                      for (Map.Entry<Integer, Double> setScore : synSetScoreMap
                              .entrySet()) {
                          score += setScore.getValue() / (double) setScore.getKey();
                          sum += 1.0 / (double) setScore.getKey();
                      }
                      score /= sum;
      
                      dictionary.put(word, score);
                  }
              } catch (Exception e) {
                  e.printStackTrace();
              } finally {
                  if (csv != null) {
                      csv.close();
                  }
              }
          }
      
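          public double extract(String word, String pos) {
              // A missing word#pos key would otherwise fail with a
              // NullPointerException when the null Double is unboxed.
              Double score = dictionary.get(word + "#" + pos);
              if (score == null) {
                  throw new IllegalArgumentException("No entry for " + word + "#" + pos);
              }
              return score;
          }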
      
          public static void main(String [] args) throws IOException {
              if(args.length<1) {
                  System.err.println("Usage: java SentiWordNetDemoCode <pathToSentiWordNetFile>");
                  return;
              }
      
              String pathToSWN = args[0];
              SentiWordNetDemoCode sentiwordnet = new SentiWordNetDemoCode(pathToSWN);
      
              System.out.println("good#a "+sentiwordnet.extract("good", "a"));
              System.out.println("bad#a "+sentiwordnet.extract("bad", "a"));
              System.out.println("blue#a "+sentiwordnet.extract("blue", "a"));
              System.out.println("blue#n "+sentiwordnet.extract("blue", "n"));
          }
      }
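
As a worked illustration of what `extract()` returns: the constructor above computes, for each `word#pos` key, an average of the per-synset scores (PosS - NegS) weighted by 1/rank, so the result lies between -1 and 1 and lower-ranked senses count less. The Python sketch below mirrors that arithmetic only; the `weighted_score` name and the sample sense scores are hypothetical, not values from the SentiWordNet file.

```python
def weighted_score(sense_scores):
    """Rank-weighted average of synset scores.

    sense_scores maps a 1-based sense rank to that synset's PosS - NegS.
    Each sense is weighted by 1/rank, then the total is divided by the
    sum of the weights, as in the Java constructor.
    """
    total = sum(score / rank for rank, score in sense_scores.items())
    norm = sum(1.0 / rank for rank in sense_scores)
    return total / norm


# Hypothetical word with three senses: positive, neutral, mildly negative.
senses = {1: 0.5, 2: 0.0, 3: -0.25}

# total = 0.5/1 + 0.0/2 - 0.25/3 = 5/12, norm = 1 + 1/2 + 1/3 = 11/6,
# so the result is (5/12) / (11/6) = 5/22, roughly 0.227.
print(round(weighted_score(senses), 3))
```

A value above zero means the word's senses lean positive for that part of speech, a value below zero means they lean negative, and the single number deliberately blends all senses instead of picking one from context.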
      

      【Comments】:

      • Could you elaborate on the values returned by the function sentiwordnet.extract() (possibly with an example)?