【Question title】: Grammar rule extraction from parsed result
【Posted】: 2015-10-15 05:59:17
【Question】:

When I run the Stanford parser from nltk, I get the following result.

(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))

But I need it in this form:

S -> VP
VP -> VB NP ADVP
VB -> get
PRP -> me
RB -> now

How can I get this result, perhaps with a recursive function? Is there already a built-in function for this?

【Comments】:

    Tags: python recursion nltk stanford-nlp


    【Answer 1】:

    First, to navigate a tree, see "How to iterate through all nodes of a tree?" and "How to navigate a nltk.tree.Tree?"

    >>> from nltk.tree import Tree
    >>> bracket_parse = "(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))"
    >>> ptree = Tree.fromstring(bracket_parse)
    >>> ptree
    Tree('S', [Tree('VP', [Tree('VB', ['get']), Tree('NP', [Tree('PRP', ['me'])]), Tree('ADVP', [Tree('RB', ['now'])])])])
    >>> for subtree in ptree.subtrees():
    ...     print(subtree)
    ... 
    (S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))
    (VP (VB get) (NP (PRP me)) (ADVP (RB now)))
    (VB get)
    (NP (PRP me))
    (PRP me)
    (ADVP (RB now))
    (RB now)
    

    What you are looking for is Tree.productions(), see https://github.com/nltk/nltk/blob/develop/nltk/tree.py#L341:

    >>> ptree.productions()
    [S -> VP, VP -> VB NP ADVP, VB -> 'get', NP -> PRP, PRP -> 'me', ADVP -> RB, RB -> 'now']
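
The question mentions a recursive function. For comparison only, here is a minimal sketch of what a hand-written traversal could look like. `extract_rules` is a hypothetical helper name, not part of NLTK, and it assumes NLTK 3, where `Tree.label()` is available. It emits the words without quotes, as in the form the question asks for.

```python
from nltk.tree import Tree


def extract_rules(tree):
    """Hypothetical helper: collect 'LHS -> RHS' strings in pre-order."""
    # Children are either subtrees (nonterminals) or plain strings (leaf words).
    rhs = [child.label() if isinstance(child, Tree) else child for child in tree]
    rules = ["{} -> {}".format(tree.label(), " ".join(rhs))]
    # Recurse into the subtrees only; leaf words have no rules of their own.
    for child in tree:
        if isinstance(child, Tree):
            rules.extend(extract_rules(child))
    return rules


ptree = Tree.fromstring("(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))")
rules = extract_rules(ptree)
for rule in rules:
    print(rule)
```

In practice the built-in method is probably the better choice, since it returns structured objects rather than strings.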
    

    Note that Tree.productions() returns a list of Production objects; see https://github.com/nltk/nltk/blob/develop/nltk/tree.py#L22 and https://github.com/nltk/nltk/blob/develop/nltk/grammar.py#L236
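
Because these are objects rather than strings, they can be inspected directly. The sketch below, assuming NLTK 3, uses `lhs()`, `rhs()`, `is_lexical()` and `is_nonlexical()` to separate the rules that introduce words from the purely structural ones. The variable names are only illustrative.

```python
from nltk.tree import Tree

ptree = Tree.fromstring("(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))")
productions = ptree.productions()

# lhs() is a Nonterminal; rhs() is a tuple of Nonterminals and/or plain strings (words).
first = productions[0]
print(type(first).__name__, first.lhs(), first.rhs())

# is_lexical() is True when the right-hand side contains at least one word.
lexical = [p for p in productions if p.is_lexical()]
# is_nonlexical() is True when the right-hand side contains only nonterminals.
structural = [p for p in productions if p.is_nonlexical()]

print(lexical)
print(structural)
```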

    If you want the grammar rules as strings, you can do this:

    >>> for rule in ptree.productions():
    ...     print(rule)
    ... 
    S -> VP
    VP -> VB NP ADVP
    VB -> 'get'
    NP -> PRP
    PRP -> 'me'
    ADVP -> RB
    RB -> 'now'
    

    Or:

    >>> rules = [str(p) for p in ptree.productions()]
    >>> rules
    ['S -> VP', 'VP -> VB NP ADVP', "VB -> 'get'", 'NP -> PRP', "PRP -> 'me'", 'ADVP -> RB', "RB -> 'now'"]
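
The strings above quote the words ('get', 'me', 'now'), whereas the form requested in the question does not. One possible way to drop the quotes is to build each string from `lhs()` and `rhs()` yourself. `plain_rule` is a hypothetical helper name, and the sketch assumes NLTK 3.

```python
from nltk.tree import Tree

ptree = Tree.fromstring("(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))")


def plain_rule(production):
    """Hypothetical helper: render a Production without quoting the words."""
    # str() works for both Nonterminal objects and plain-string words.
    rhs = " ".join(str(symbol) for symbol in production.rhs())
    return "{} -> {}".format(production.lhs(), rhs)


plain_rules = [plain_rule(p) for p in ptree.productions()]
for rule in plain_rules:
    print(rule)
```

Note that the full rule set also contains NP -> PRP and ADVP -> RB, which are missing from the list in the question.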
    

    【Comments】:
