Customize NER in StanfordNLP Server: Answers

[Question title]: Customize NER in StanfordNLP Server
[Posted]: 2026-01-30 00:15:01
[Question description]:

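Hello, I am trying to add extra entities on top of the current default rules. This works with StanfordCoreNLP on a .txt file, but when I use StanfordCoreNLPServer from Python, the custom rules fail to override the defaults.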

I am using the NLTK wrapper for CoreNLP in Python, and the input text is a column in a dataframe. The default rules work well, but I cannot add custom rules.

The Java command that works with StanfordCoreNLP:

java -Xmx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner -ner.additional.regexner.mapping extra.txt -file example.txt -outputFormat text

But it fails when I run the following StanfordCoreNLPServer command:

java -Xmx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -annotators tokenize,ssplit,pos,lemma,ner -ner.additional.regexner.mapping extra.txt -status_port 9000 -port 9000 -timeout 90000 &

My guess is that the CoreNLP server cannot customize entities yet, but I am not sure. I would appreciate any help!

[Question discussion]:

Tags: stanford-nlp

[Solution 1]:

You need to use the -serverProperties option and point it to a file that contains all the properties you want the pipeline to use.

You cannot pass pipeline properties such as ner.additional.regexner.mapping directly on the server's command line.

For example, create a file named server.props.

Put your properties in that file:

annotators = tokenize,ssplit,pos,lemma,ner
ner.additional.regexner.mapping = extra.txt
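
As a minimal sketch of the two files involved, the following Python writes server.props and a small extra.txt side by side. The two rules, their labels (SOFTWARE, PROGRAMMING_LANGUAGE), and the helper name write_server_files are hypothetical and only there for illustration; the sketch assumes the basic mapping layout of one rule per line, with a token pattern and an NER label separated by a tab. A relative path such as extra.txt is presumably resolved against the directory the server is started from, so keeping both files there is the simplest setup.

```python
from pathlib import Path

# Hypothetical example rules: (token pattern, NER label).
EXAMPLE_RULES = [
    ("Stanford CoreNLP", "SOFTWARE"),
    ("Python", "PROGRAMMING_LANGUAGE"),
]


def write_server_files(directory, rules=EXAMPLE_RULES):
    """Write extra.txt and server.props into `directory`; return both paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    # One rule per line: pattern<TAB>label.
    mapping_path = directory / "extra.txt"
    mapping_path.write_text(
        "".join(f"{pattern}\t{label}\n" for pattern, label in rules),
        encoding="utf-8",
    )

    # The same two properties shown in the answer.
    props_path = directory / "server.props"
    props_path.write_text(
        "annotators = tokenize,ssplit,pos,lemma,ner\n"
        "ner.additional.regexner.mapping = extra.txt\n",
        encoding="utf-8",
    )
    return mapping_path, props_path
```

Calling write_server_files(".") in the CoreNLP directory would leave both files next to the jars, ready for the server to load.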

Then run this command:

java -Xmx4g edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000 -serverProperties server.props
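
Once the server is running, one way to check that the custom rules are picked up is to send it some text over HTTP and inspect the ner field of each token. The sketch below uses only the Python standard library rather than the NLTK wrapper from the question. It assumes the server listens on localhost:9000 and that a request which sets only outputFormat falls back to the annotators and mapping from server.props; the helper names are illustrative.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_url(base_url="http://localhost:9000", properties=None):
    """Build the request URL with per-request properties encoded as JSON."""
    props = {"outputFormat": "json"}
    props.update(properties or {})
    return f"{base_url}/?properties={quote(json.dumps(props))}"


def extract_entities(annotation):
    """Return (word, ner) pairs for every token whose tag is not 'O'."""
    return [
        (token["word"], token["ner"])
        for sentence in annotation.get("sentences", [])
        for token in sentence.get("tokens", [])
        if token.get("ner", "O") != "O"
    ]


def annotate(text, base_url="http://localhost:9000", timeout=30):
    """POST raw text to the server and return the parsed JSON annotation."""
    request = Request(build_url(base_url), data=text.encode("utf-8"))
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For a dataframe column, something like df["text"].apply(lambda t: extract_entities(annotate(t))) would then return the tagged tokens per row (df and the column name are placeholders). If the mapping was loaded, the custom labels should appear alongside the default ones.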

[Discussion]: