【Posted】: 2016-12-17 22:39:58
【Question】:
I have a question about the Sentence Splitter module in GATE. My text looks like this:
Social history. He drank a lot in his young age. He did
not attend a school. He was depressed of his condition.
We are certain the sentences should be split like this:
Sentence 1: Social history.
Sentence 2: He drank a lot in his young age.
Sentence 3: He did not attend a school.
Sentence 4: He was depressed of his condition.
However, the ANNIE Sentence Splitter treats text on separate lines as belonging to separate sentences, so the result is:
Sentence 1: Social history.
Sentence 2: He drank a lot in his young age.
Sentence 3: He did
Sentence 4: not attend a school.
Sentence 5: He was depressed of his condition.
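That happens because the sentence is broken across multiple lines. Is there a way to tell the sentence splitter that a sentence may span more than one line? Or is there a better way to identify sentences in this kind of text?
Thank you :)
【Comments】: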
-
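You are probably passing single lines to the sentence splitter. You should read the complete file first and pass the full text to the sentence splitter.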
-
Actually, I am using GATE Developer, so I think I am already passing all the sentences at once @RAVI
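One possible workaround, independent of GATE itself, is to normalize hard-wrapped lines before the text reaches any sentence splitter. The following is a minimal sketch in plain Java. It assumes that a single line break is only a wrapping artifact and that a blank line marks a real paragraph boundary. The names `LineJoinSketch`, `joinWrappedLines`, and `naiveSplit` are hypothetical and not part of the GATE API, and the naive punctuation-based splitter only stands in for a real splitter to show the effect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of GATE.
class LineJoinSketch {

    // Replace every single line break with a space, keeping blank lines
    // (two or more consecutive line breaks) as paragraph boundaries.
    // Assumption: single line breaks are wrapping artifacts only.
    static String joinWrappedLines(String text) {
        String unixNewlines = text.replace("\r\n", "\n");
        return unixNewlines.replaceAll("(?<!\n)\n(?!\n)", " ");
    }

    // Naive stand-in for a sentence splitter: a sentence is a run of
    // characters ending in '.', '!' or '?'. Text after the last
    // terminator is dropped. It does not handle abbreviations.
    static List<String> naiveSplit(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = Pattern.compile("[^.!?]+[.!?]+").matcher(text);
        while (m.find()) {
            String s = m.group().trim();
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String raw = "Social history. He drank a lot in his young age. He did\n"
                   + "not attend a school. He was depressed of his condition.";
        List<String> sentences = naiveSplit(joinWrappedLines(raw));
        for (int i = 0; i < sentences.size(); i++) {
            System.out.println("Sentence " + (i + 1) + ": " + sentences.get(i));
        }
    }
}
```

With the sample text from the question, the wrapped fragment "He did" / "not attend a school." is joined back into one sentence before splitting. Whether such preprocessing is appropriate depends on the documents: if some line breaks are meaningful (for example, headings without a final period), joining them would merge content that should stay separate.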
Tags: nlp gate java-annotations