Extracting clauses from Penn Treebank-formatted text

【Question title】: Extracting clause from a Penn Treebank-formatted text
【Posted】: 2012-05-05 02:57:36
【Question description】:

Say I have a sentence:

After he had eaten the cheese, Bill went to the grocery.

In my program, I get the following output:

---PARSE TREE---
(ROOT
  (S
    (SBAR (IN After)
      (S
        (NP (PRP he))
        (VP (VBD had)
          (VP (VBN eaten)
            (NP (DT the) (NN cheese))))))
    (, ,)
    (NP (NNP Bill))
    (VP (VBD went)
      (PP (TO to)
        (NP (DT the) (NN grocery))))
    (. .)))

How can I group the words that are not already inside a clause into an independent clause of their own? Like this:

S Clause {
    SBAR Clause {
         After he had eaten the cheese,
    }

    S Clause {
        Bill went to the grocery.
    }
}

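I'm pretty sure I'm not being clear, but basically I want to extract the independent and dependent clauses of a sentence, along with the subclauses of those clauses.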

【Question comments】:

See this answer: stackoverflow.com/a/10401824/109618

Tags: nlp stanford-nlp

【Solution 1】:

Here is demo code from the NLTK guide (it does not explicitly show how to extract clauses): http://nltk.googlecode.com/svn/trunk/doc/howto/tree.html
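
Since the linked demo does not show the clause extraction itself, here is a minimal sketch of one way it could be done, in plain Python with only the standard library. It is not NLTK or Stanford NLP code. The names `parse`, `clause_parts`, `render`, and `CLAUSE_LABELS` are hypothetical helpers made up for this illustration, and it assumes the input is a single well-formed bracketed tree like the one in the question. The label set is the Penn Treebank clause-level tags.

```python
import re

# Penn Treebank clause-level tags; every other label is treated as a phrase or word tag.
CLAUSE_LABELS = {"S", "SBAR", "SBARQ", "SINV", "SQ"}

PARSE = """
(ROOT
  (S
    (SBAR (IN After)
      (S
        (NP (PRP he))
        (VP (VBD had)
          (VP (VBN eaten)
            (NP (DT the) (NN cheese))))))
    (, ,)
    (NP (NNP Bill))
    (VP (VBD went)
      (PP (TO to)
        (NP (DT the) (NN grocery))))
    (. .)))
"""


def parse(text):
    """Parse one bracketed tree into nested (label, children) tuples; words stay plain strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()


def clause_parts(node):
    """Split a node's content into nested clauses and runs of words outside any sub-clause.

    Returns a list whose items are either a string (adjacent loose words joined
    by spaces) or a (label, parts) tuple for a nested clause.
    """
    _, children = node
    parts = []
    for child in children:
        if isinstance(child, str):
            new = [child]
        elif child[0] in CLAUSE_LABELS:
            new = [(child[0], clause_parts(child))]
        else:
            new = clause_parts(child)  # flatten a non-clause phrase, keeping clauses inside it
        for part in new:
            if isinstance(part, str) and parts and isinstance(parts[-1], str):
                parts[-1] += " " + part  # merge neighbouring loose words into one run
            else:
                parts.append(part)
    return parts


def render(parts, indent=0):
    """Format the nested parts in roughly the 'X Clause { ... }' layout from the question."""
    pad = " " * indent
    lines = []
    for part in parts:
        if isinstance(part, str):
            lines.append(pad + part)
        else:
            label, sub = part
            lines.append(f"{pad}{label} Clause {{")
            lines.extend(render(sub, indent + 4))
            lines.append(pad + "}")
    return lines


result = clause_parts(parse(PARSE))
print("\n".join(render(result)))
```

For the tree in the question, this groups "After" plus the inner S under the SBAR clause and collects everything else under the outer S into a single run of words. Two things differ from the layout asked for: the comma ends up in the main-clause run rather than with the SBAR, and the leftover words are returned as a plain string rather than wrapped in a synthetic "S Clause". Tokens are joined with single spaces, so punctuation is not re-attached to the preceding word. With Stanford's own Java `Tree` objects the same recursive walk over child nodes should apply, but that is not shown here.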

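【Comments】:

I don't see how this answers the question. NLTK is a Python tool. The question is tagged as being about Stanford NLP.
David James: NLTK is a Python tool for working with data formatted in the structures that Stanford NLP outputs. There are "corpus reader" classes that can handle formats such as Penn Treebank. nltk.googlecode.com/svn/trunk/doc/howto/corpus.html
The link in this answer is now password-protected.
What are the username and password?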