【Question title】: Percentage Count Verb, Noun using Spacy?
【Posted】: 2019-01-12 02:10:50
【Question description】:

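I want to use spaCy to compute the percentage breakdown of POS tags in a sentence, similar to

Count verbs, nouns, and other parts of speech with python's NLTK

I can currently detect and count the POS tags. How do I get the percentage breakdown?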

from __future__ import unicode_literals
import spacy,en_core_web_sm
from collections import Counter
nlp = en_core_web_sm.load()
print Counter(([token.pos_ for token in nlp('The cat sat on the mat.')]))

Current output:

Counter({u'NOUN': 2, u'DET': 2, u'VERB': 1, u'ADP': 1, u'PUNCT': 1})

Expected output:

Noun: 28.5%
DET: 28.5%
VERB: 14.28%
ADP: 14.28%
PUNCT: 14.28%

How can I also write the output to a pandas DataFrame?

【Question discussion】:

    Tags: pandas nlp spacy


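    【Solution 1】:

    Something along these lines should do what you need:

    # c is the Counter of POS tags built in the question
    sbase = sum(c.values())  # total number of tokens
    
    for el, cnt in c.items():
        print(el, '{0:2.2f}%'.format((100.0 * cnt) / sbase))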
    
    
    NOUN 28.57%
    DET 28.57%
    VERB 14.29%
    ADP 14.29%
    PUNCT 14.29%
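
As a self-contained variant of the loop above, here is a minimal sketch that wraps the calculation in a helper. The name `pos_percentages` is hypothetical, and the counts are copied from the question's output rather than computed by spaCy, so the snippet runs without a language model.

```python
from collections import Counter


def pos_percentages(counts):
    """Return each tag's share of the total count as a percentage."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero for an empty document.
        return {}
    return {tag: 100.0 * n / total for tag, n in counts.items()}


# Assumption: counts copied from the question's output for
# 'The cat sat on the mat.' instead of being produced by spaCy here.
counts = Counter({"NOUN": 2, "DET": 2, "VERB": 1, "ADP": 1, "PUNCT": 1})
percentages = pos_percentages(counts)

for tag, pct in percentages.items():
    print("{}: {:.2f}%".format(tag, pct))
```

With real text, `counts` would be the `Counter` of `token.pos_` values from the question.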
    

    【Discussion】:

      【Solution 2】:
      from __future__ import unicode_literals
      import spacy,en_core_web_sm
      from collections import Counter
      nlp = en_core_web_sm.load()
      c = Counter(([token.pos_ for token in nlp('The cat sat on the mat.')]))
      sbase = sum(c.values())
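      # Under Python 2 (no print_function import), print(a, b) prints a tuple,
      # which is why the output below shows (u'...', u'...') pairs.
      for el, cnt in c.items():
          print(el, '{0:2.2f}%'.format((100.0* cnt)/sbase))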
      

      Output:

      (u'NOUN', u'28.57%')
      (u'VERB', u'14.29%')
      (u'DET', u'28.57%')
      (u'ADP', u'14.29%')
      (u'PUNCT', u'14.29%')
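
The question also asks how to get the result into a pandas DataFrame, which neither answer covers. One possible sketch follows, assuming pandas is installed. The column names `pos`, `count`, and `percent` are illustrative choices, and the counts are hard-coded from the question's output rather than produced by spaCy.

```python
from collections import Counter

import pandas as pd

# Assumption: counts copied from the question's output; with spaCy this
# would be Counter(token.pos_ for token in nlp(text)).
counts = Counter({"NOUN": 2, "DET": 2, "VERB": 1, "ADP": 1, "PUNCT": 1})

# One row per POS tag, with its raw count.
df = pd.DataFrame(list(counts.items()), columns=["pos", "count"])

# Share of each tag in the total, as a percentage rounded to 2 decimals.
df["percent"] = (100.0 * df["count"] / df["count"].sum()).round(2)

print(df)
```

Keeping the raw counts next to the percentages makes it easy to recompute the shares later, for example after filtering out punctuation.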
      

      【Discussion】:
