Solr: Is re-indexing required for stop words?

【Question title】: Solr: Is re-indexing a must for stop words?
【Posted】: 2026-01-07 09:25:02
【Question description】:

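If we add stop words to the stopwords.txt file without re-indexing the documents, will Solr 4.10.3 eliminate those stop words from the query phrase? Or is re-indexing the documents a must?

I ask because I added stop words (without re-indexing the documents) and Solr still gives me results without eliminating the stop words.

I restarted Solr after adding the list to the stopwords.txt file.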

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <similarity class="solr.DFRSimilarityFactory">
      <str name="basicModel">I(F)</str>
      <str name="afterEffect">B</str>
      <str name="normalization">H2</str>
    </similarity>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <!-- in this example, we will only use synonyms at query time
                 <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

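【Comments on the question】:

The field content is stored, which is why you get the whole data back. When you use a stop word list, Solr does not remove the stop words from your data; it simply does not index them.
Well, I think the query analyzer removes those words from our query string, so it will not search for stop words. That is why the stop word filter is there, but I am not sure.

Tags: solr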

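【Solution 1】:

Consider the query q=Iron man of India

Suppose you use stop words in the query analyzer and the word "of" is in the stop word list. Solr will split the query into tokens as follows: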

Iron, man, of, India

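Since you are using the stop word filter, it discards the word "of" and searches for documents that contain the remaining tokens (Iron, man, India). The score of each resulting document depends on various factors, such as how many of the tokens are present in the document and how often they occur (the tf-idf score).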
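
To make the query-time behavior concrete, here is a minimal Python sketch of the analysis chain from the question (tokenizer, stop filter with ignoreCase="true", then lowercase filter). It is not Solr code: the `analyze` function and the regex tokenizer are simplified stand-ins for illustration, and the stop word set is an assumed content of stopwords.txt.

```python
import re

# Assumed content of stopwords.txt, for illustration only.
STOP_WORDS = {"of"}

def analyze(text, stop_words=STOP_WORDS):
    """Simplified stand-in for the query analyzer chain in the question."""
    # Rough approximation of StandardTokenizerFactory: split on non-word characters.
    tokens = re.findall(r"\w+", text)
    # StopFilterFactory with ignoreCase="true": drop stop words regardless of case.
    kept = [t for t in tokens if t.lower() not in stop_words]
    # LowerCaseFilterFactory: lowercase whatever survived.
    return [t.lower() for t in kept]

query_terms = analyze("Iron man of India")
print(query_terms)  # ['iron', 'man', 'india']
```

Only the surviving terms are looked up in the index, so "of" plays no part in matching or scoring once it is in the query-time stop word list.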

The same applies when stop words are used at index time. Solr will index the tokens (Iron, man, India) and will not index (of).
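
The difference between the two analyzers can be sketched with a toy inverted index in Python. This is an illustration under simplifying assumptions, not Solr's implementation: `analyze` and `build_index` are hypothetical helpers, and the two sample documents are made up. It shows that a stop word added later is dropped from queries as soon as the new list is loaded, while documents indexed earlier keep the token until they are re-indexed.

```python
import re

def analyze(text, stop_words):
    """Simplified stand-in for tokenizer -> stop filter -> lowercase filter."""
    tokens = re.findall(r"\w+", text)
    return [t.lower() for t in tokens if t.lower() not in stop_words]

def build_index(docs, stop_words):
    """Build a toy inverted index: term -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text, stop_words):
            index.setdefault(term, set()).add(doc_id)
    return index

# Made-up sample documents.
docs = {1: "Iron man of India", 2: "Man of steel"}

# Index built before "of" was added to the stop word list.
old_index = build_index(docs, stop_words=set())
# Index rebuilt after "of" was added, i.e. after re-indexing.
new_index = build_index(docs, stop_words={"of"})

# The query analyzer uses the updated list immediately, without re-indexing.
query_terms = analyze("Iron man of India", {"of"})

print(query_terms)                          # ['iron', 'man', 'india']
print(sorted(old_index.get("of", set())))   # [1, 2]  stale postings remain
print("of" in new_index)                    # False   gone after re-indexing
```

In this sketch the stale postings for "of" are never consulted by the filtered query, so query-time removal works without re-indexing; re-indexing only matters for cleaning the token out of the index itself.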

【Comments】: