【Question title】: Add stopwords to a standard Azure Search analyzer?
【Posted】: 2018-10-30 11:50:41
【Question】:

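I'm using the en.microsoft analyzer in an Azure Search index. It works well for the most part, but I need to add some domain-specific stopwords. Is there a way to add stopwords to an existing analyzer? Or do I have to implement a custom analyzer that inherits its behavior from the standard analyzer and overrides only the stopwords, leaving everything else unchanged?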

【Comments】:

    Tags: azure-cognitive-search


    【Answer 1】:

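    You can't inherit from an existing analyzer, but you can create a pair of custom analyzers (one for indexing, one for search) that are functionally equivalent to en.microsoft and use your own stopword list. Here is how that looks in the index definition payload for the REST API: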

    {
      ...
      "analyzers": [
        {
          "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
          "name": "my_search_analyzer",
          "tokenizer": "my_english_search_tokenizer",
          "tokenFilters": [ "my_asciifolding_search", "lowercase", "my_stopword_filter" ]
        },
        {
          "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
          "name": "my_index_analyzer",
          "tokenizer": "my_english_index_tokenizer",
          "tokenFilters": [ "my_asciifolding_index", "lowercase", "my_stopword_filter" ]
        }
      ],
      "tokenizers": [
        {
          "name": "my_english_search_tokenizer",
          "@odata.type": "#Microsoft.Azure.Search.MicrosoftLanguageStemmingTokenizer",
          "isSearchTokenizer": true,
          "language": "english"
        },
        {
          "name": "my_english_index_tokenizer",
          "@odata.type": "#Microsoft.Azure.Search.MicrosoftLanguageStemmingTokenizer",
          "isSearchTokenizer": false,
          "language": "english"
        }
      ],
      "tokenFilters": [
        {
          "name": "my_asciifolding_search",
          "@odata.type": "#Microsoft.Azure.Search.AsciiFoldingTokenFilter",
          "preserveOriginal": false
        },
        {
          "name": "my_asciifolding_index",
          "@odata.type": "#Microsoft.Azure.Search.AsciiFoldingTokenFilter",
          "preserveOriginal": true
        },
        {
          "name": "my_stopword_filter",
          "@odata.type": "#Microsoft.Azure.Search.StopwordsTokenFilter",
          "stopwords": [ "put", "your", "custom", "stopwords", "here" ]
        }
      ]
    }
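
For a stopword list that changes over time, it may be easier to generate this fragment than to edit it by hand. The following is a minimal sketch in Python that rebuilds the same analyzers, tokenizers, and token filters from a list of words. It also shows one way to attach the pair to a field through the `indexAnalyzer` and `searchAnalyzer` field properties. The helper names (`build_analysis_section`, `make_field`), the index name `my-index`, the field names, and the sample stopwords are all illustrative assumptions, not part of the original answer. The stopwords are lowercased because the `lowercase` filter runs before the stopword filter in the chain above.

```python
import json

STOPWORD_FILTER = "my_stopword_filter"


def build_analysis_section(stopwords):
    """Build the analyzers/tokenizers/tokenFilters part of an index definition."""
    # Tokens reach the stopword filter already lowercased, so normalize the
    # list the same way and drop duplicates while keeping the original order.
    words = list(dict.fromkeys(w.lower() for w in stopwords))

    def analyzer(name, tokenizer, folding_filter):
        return {
            "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
            "name": name,
            "tokenizer": tokenizer,
            "tokenFilters": [folding_filter, "lowercase", STOPWORD_FILTER],
        }

    def tokenizer(name, is_search):
        return {
            "name": name,
            "@odata.type": "#Microsoft.Azure.Search.MicrosoftLanguageStemmingTokenizer",
            "isSearchTokenizer": is_search,
            "language": "english",
        }

    def folding(name, preserve_original):
        return {
            "name": name,
            "@odata.type": "#Microsoft.Azure.Search.AsciiFoldingTokenFilter",
            "preserveOriginal": preserve_original,
        }

    return {
        "analyzers": [
            analyzer("my_search_analyzer", "my_english_search_tokenizer", "my_asciifolding_search"),
            analyzer("my_index_analyzer", "my_english_index_tokenizer", "my_asciifolding_index"),
        ],
        "tokenizers": [
            tokenizer("my_english_search_tokenizer", True),
            tokenizer("my_english_index_tokenizer", False),
        ],
        "tokenFilters": [
            folding("my_asciifolding_search", False),
            folding("my_asciifolding_index", True),
            {
                "name": STOPWORD_FILTER,
                "@odata.type": "#Microsoft.Azure.Search.StopwordsTokenFilter",
                "stopwords": words,
            },
        ],
    }


def make_field(name):
    """Hypothetical searchable string field that uses the analyzer pair."""
    return {
        "name": name,
        "type": "Edm.String",
        "searchable": True,
        "indexAnalyzer": "my_index_analyzer",
        "searchAnalyzer": "my_search_analyzer",
    }


# Assumed index layout: a key field plus one searchable text field.
index_definition = {
    "name": "my-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        make_field("description"),
    ],
    **build_analysis_section(["Widget", "gizmo", "widget"]),
}

print(json.dumps(index_definition, indent=2))
```

The printed JSON is meant to be used as the body of an index creation request. Analyzers on an existing field generally cannot be changed in place, so expect to create a new index (or a new field) and re-index the documents.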
    

    【Comments】:
