【Question title】: Elasticsearch wildcard query with spaces
【Posted】: 2020-04-16 18:53:31
【Question description】:

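I am trying to run a wildcard query that contains spaces. It easily matches words on a per-term basis, but not against the field as a whole.

I have read in the documentation that I need to set the field to not_analyzed, but with that setting the query returns nothing.

This is the mapping where matching works on a per-term basis: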

{
  "denshop" : {
    "mappings" : {
      "products" : {
        "properties" : {
          "code" : {
            "type" : "string"
          },
          "id" : {
            "type" : "long"
          },
          "name" : {
            "type" : "string"
          },
          "price" : {
            "type" : "long"
          },
          "url" : {
            "type" : "string"
          }
        }
      }
    }
  }
}

This is the mapping with which the exact same query returns nothing:

{
  "denshop" : {
    "mappings" : {
      "products" : {
        "properties" : {
          "code" : {
            "type" : "string"
          },
          "id" : {
            "type" : "long"
          },
          "name" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "price" : {
            "type" : "long"
          },
          "url" : {
            "type" : "string"
          }
        }
      }
    }
  }
}

Here is the query:

curl -XPOST 'http://127.0.0.1:9200/denshop/products/_search?pretty' -d '{"query":{"wildcard":{"name":"*test*"}}}'

Response with the not_analyzed property:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

Response without not_analyzed:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [ {
    ...

Edit: added the requested information

Here is the list of documents:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "denshop",
      "_type" : "products",
      "_id" : "3L1",
      "_score" : 1.0,
      "_source" : {
        "id" : 3,
        "name" : "Testovací produkt 2",
        "code" : "",
        "price" : 500,
        "url" : "http://www.denshop.lh/damske-obleceni/testovaci-produkt-2/"
      }
    }, {
      "_index" : "denshop",
      "_type" : "products",
      "_id" : "4L1",
      "_score" : 1.0,
      "_source" : {
        "id" : 4,
        "name" : "Testovací produkt 3",
        "code" : "",
        "price" : 666,
        "url" : "http://www.denshop.lh/damske-obleceni/testovaci-produkt-3/"
      }
    }, {
      "_index" : "denshop",
      "_type" : "products",
      "_id" : "2L1",
      "_score" : 1.0,
      "_source" : {
        "id" : 2,
        "name" : "Testovací produkt",
        "code" : "",
        "price" : 500,
        "url" : "http://www.denshop.lh/damske-obleceni/testovaci-produkt/"
      }
    }, {
      "_index" : "denshop",
      "_type" : "products",
      "_id" : "5L1",
      "_score" : 1.0,
      "_source" : {
        "id" : 5,
        "name" : "Testovací produkt 4",
        "code" : "",
        "price" : 666,
        "url" : "http://www.denshop.lh/damske-obleceni/testovaci-produkt-4/"
      }
    }, {
      "_index" : "denshop",
      "_type" : "products",
      "_id" : "6L1",
      "_score" : 1.0,
      "_source" : {
        "id" : 6,
        "name" : "Testovací produkt 5",
        "code" : "",
        "price" : 666,
        "url" : "http://www.denshop.lh/tricka-tilka-tuniky/testovaci-produkt-5/"
      }
    } ]
  }
}

Without not_analyzed, this query returns results:

curl -XPOST 'http://127.0.0.1:9200/denshop/products/_search?pretty' -d '{"query":{"wildcard":{"name":"*testovací*"}}}'

But this one does not (note the space before the asterisk):

curl -XPOST 'http://127.0.0.1:9200/denshop/products/_search?pretty' -d '{"query":{"wildcard":{"name":"*testovací *"}}}'

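When I add not_analyzed to the mapping, it returns no hits no matter what I put in the wildcard query.

【Comments】:

  • Which documents do not match but should? Please give an example.
  • Updated the question with the requested data.
  • Your documents contain uppercase letters, and with not_analyzed they are indexed exactly as they are. When you search for testovaci (that is, in lowercase), it will of course not match Testovaci with an uppercase T.
  • Thanks! Is it possible to make matching on that field case-insensitive? Or can I only have one of these two features?

Tags: elasticsearch elasticsearch-query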


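【Solution 1】:

Add a custom analyzer that lowercases the text. Then, for your search query, lowercase the text in your client application before passing it to the query.

To keep the original analysis chain as well, I added a sub-field to your name field that uses the custom analyzer.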

PUT /denshop
{
  "settings": {
    "analysis": {
      "analyzer": {
        "keyword_lowercase": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "products": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "lowercase": {
              "type": "string",
              "analyzer": "keyword_lowercase"
            }
          }
        }
      }
    }
  }
}
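
To make the difference between the three setups easier to see, here is a minimal Python sketch that imitates which terms each setup would index and how a wildcard pattern is compared against them. It is only a simulation, not Elasticsearch itself: the helper names are hypothetical, the default analyzer is approximated by splitting on non-word characters and lowercasing, and Python's `fnmatchcase` stands in for wildcard matching (it also treats `[...]` specially, which Elasticsearch wildcards do not).

```python
import re
from fnmatch import fnmatchcase


def standard_terms(text):
    # Rough stand-in for the default analyzer on an analyzed string field:
    # split into word tokens and lowercase each one.
    return [token.lower() for token in re.findall(r"\w+", text)]


def not_analyzed_terms(text):
    # not_analyzed: the whole value is indexed as one term, case preserved.
    return [text]


def keyword_lowercase_terms(text):
    # keyword tokenizer + lowercase filter: one term, lowercased.
    return [text.lower()]


def wildcard_matches(pattern, terms):
    # The pattern is compared, case-sensitively, against each indexed term.
    return any(fnmatchcase(term, pattern) for term in terms)


name = "Testovací produkt 2"

print(wildcard_matches("*testovací*", standard_terms(name)))            # True
print(wildcard_matches("*testovací *", standard_terms(name)))           # False: no term contains a space
print(wildcard_matches("*testovací *", not_analyzed_terms(name)))       # False: the term starts with "T"
print(wildcard_matches("*testovací *", keyword_lowercase_terms(name)))  # True
```

Under these assumptions, only the lowercased single-term variant matches a lowercase pattern that contains a space, which is what the `name.lowercase` sub-field is meant to provide.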

The query then targets the sub-field:

GET /denshop/products/_search
{
  "query": {
    "wildcard": {
      "name.lowercase": "*testovací *"
    }
  }
}
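
The client-side lowercasing mentioned at the start of this answer could look roughly like the sketch below. It only builds the request body; the function name `build_wildcard_query` is hypothetical, and the field name `name.lowercase` is taken from the mapping above. Note that any `*` or `?` in the user's input would still be interpreted as wildcards.

```python
import json


def build_wildcard_query(user_input, field="name.lowercase"):
    # Lowercase on the client so the pattern matches the lowercased
    # terms indexed in the sub-field, then wrap it in wildcards.
    pattern = "*" + user_input.lower() + "*"
    return {"query": {"wildcard": {field: pattern}}}


body = build_wildcard_query("Testovací ")
print(json.dumps(body, ensure_ascii=False))
# {"query": {"wildcard": {"name.lowercase": "*testovací *"}}}
```

The resulting JSON can be sent as the body of the search request shown above.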
