【Question title】: Kibana: searching for a specific phrase returns no results, while another search returns the phrase
【Posted】: 2015-11-28 22:09:11
【Question】:

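This looks like a simple use case, but for some reason I can't figure out how to do it, or find a clear example on Google.

Say I have a message stored in logstash:

"message: 2015-11-28 22:02:19,232:common:INFO:ENV: Production User:None:Username:None:LOG: publishing to bus"

In Kibana (version 4), if I search for the phrase "publishing to bus", I get a set of results. But if I search for "None:LOG: publishing to bus", I get "No results found".

Yet the phrase clearly exists, and the document containing it is returned by the first search.

So my question is basically: what is going on? What is the correct way to search for a possibly long phrase, and why does the second example fail?

Edit: the stored JSON.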

{
  "_index": "logz-ngdxrkmolklnvngumaitximbohqwbocg-151206_v1",
  "_type": "django_logger",
  "_id": "AVF2DPxZZst_8_8_m-se",
  "_score": null,
  "_source": {
    "log": " publishing to bus {'user_id': 8866, 'event_id': 'aibRBPcLxcAzsEVRtFZVU5', 'timestamp': 1449384441, 'quotes': {}, 'rates': {u'EURUSD': Decimal('1.061025'), u'GBPUSD': Decimal('1.494125'), u'EURGBP': Decimal('0.710150')}, 'event': 'AccountInstrumentsUpdated', 'minute': 1449384420}",
    "logger": "common",
    "log_level": "INFO",
    "message": "2015-12-06 06:47:21,298:common:INFO:ENV: Production User:None:Username:None:LOG: publishing to bus {'user_id': 8866, 'event_id': 'aibRBPcLxcAzsEVRtFZVU5', 'timestamp': 1449384441, 'quotes': {}, 'rates': {u'EURUSD': Decimal('1.061025'), u'GBPUSD': Decimal('1.494125'), u'EURGBP': Decimal('0.710150')}, 'event': 'AccountInstrumentsUpdated', 'minute': 1449384420}",
    "type": "django_logger",
    "tags": [
      "celery"
    ],
    "path": "//path/to/logs/out.log",
    "environment": "Staging",
    "@timestamp": "2015-12-06T06:47:21.298+00:00",
    "user_id": "None",
    "host": "path.to.host",
    "timestamp": "2015-12-06 06:47:21,298",
    "username": "None"
  },
  "fields": {
    "@timestamp": [
      1449384441298
    ]
  },
  "highlight": {
    "message": [
      "2015-12-06 06:47:21,298:common:INFO:ENV: Staging User:None:Username:None:LOG: @kibana-highlighted-field@publishing@/kibana-highlighted-field@ @kibana-highlighted-field@to@/kibana-highlighted-field@ @kibana-highlighted-field@bus@/kibana-highlighted-field@ {'user_id': **, 'event_id': 'aibRBPcLxcAzsEVRtFZVU5', 'timestamp': 1449384441, 'quotes': {}, 'rates': {u'EURUSD': Decimal('1.061025'), u'GBPUSD': Decimal('1.494125'), u'EURGBP': Decimal('0.710150')}, 'event': 'AccountInstrumentsUpdated', 'minute': 1449384420}"
    ]
  },
  "sort": [
    1449384441298
  ]
}

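【Comments】:

  • Have you tried: "none:log: publishing to bus"?
  • Tried it just now. It returns no results.
  • I believe : is a special character; try escaping it: \:
  • @xjedam Nope, tried that. It didn't work.
  • What does the mapping for this field look like?

Tags: logstash kibana kibana-4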


【Answer 1】:

According to the Elasticsearch documentation, the standard analyzer is used by default. The standard analyzer tokenizes the message field as follows:

"2015-12-06 06:47:21,298:common:INFO:ENV: Production User:None:Username:None:LOG: publishing to bus {'user_id': 8866, 'event_id': 'aibRBPcLxcAzsEVRtFZVU5', 'timestamp': 1449384441, 'quotes': {}, 'rates': {u'EURUSD': Decimal('1.061025'), u'GBPUSD': Decimal('1.494125'), u'EURGBP': Decimal('0.710150')}, 'event': 'AccountInstrumentsUpdated', 'minute': 1449384420}"

{
  "tokens": [
    {
      "token": "2015",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<NUM>",
      "position": 0
    },
    {
      "token": "12",
      "start_offset": 5,
      "end_offset": 7,
      "type": "<NUM>",
      "position": 1
    },
    {
      "token": "06",
      "start_offset": 8,
      "end_offset": 10,
      "type": "<NUM>",
      "position": 2
    },
    {
      "token": "06",
      "start_offset": 11,
      "end_offset": 13,
      "type": "<NUM>",
      "position": 3
    },
    {
      "token": "47",
      "start_offset": 14,
      "end_offset": 16,
      "type": "<NUM>",
      "position": 4
    },
    {
      "token": "21,298",
      "start_offset": 17,
      "end_offset": 23,
      "type": "<NUM>",
      "position": 5
    },
    {
      "token": "common:info:env",
      "start_offset": 24,
      "end_offset": 39,
      "type": "<ALPHANUM>",
      "position": 6
    },
    {
      "token": "production",
      "start_offset": 41,
      "end_offset": 51,
      "type": "<ALPHANUM>",
      "position": 7
    },
    {
      "token": "user:none:username:none:log",
      "start_offset": 52,
      "end_offset": 79,
      "type": "<ALPHANUM>",
      "position": 8
    },
    {
      "token": "publishing",
      "start_offset": 81,
      "end_offset": 91,
      "type": "<ALPHANUM>",
      "position": 9
    },
    {
      "token": "to",
      "start_offset": 92,
      "end_offset": 94,
      "type": "<ALPHANUM>",
      "position": 10
    },
    {
      "token": "bus",
      "start_offset": 95,
      "end_offset": 98,
      "type": "<ALPHANUM>",
      "position": 11
    },
    {
      "token": "user_id",
      "start_offset": 100,
      "end_offset": 107,
      "type": "<ALPHANUM>",
      "position": 12
    },
    {
      "token": "8866",
      "start_offset": 109,
      "end_offset": 113,
      "type": "<NUM>",
      "position": 13
    },
    {
      "token": "event_id",
      "start_offset": 115,
      "end_offset": 123,
      "type": "<ALPHANUM>",
      "position": 14
    },
    {
      "token": "aibrbpclxcazsevrtfzvu5",
      "start_offset": 125,
      "end_offset": 147,
      "type": "<ALPHANUM>",
      "position": 15
    },
    {
      "token": "timestamp",
      "start_offset": 149,
      "end_offset": 158,
      "type": "<ALPHANUM>",
      "position": 16
    },
    {
      "token": "1449384441",
      "start_offset": 160,
      "end_offset": 170,
      "type": "<NUM>",
      "position": 17
    },
    {
      "token": "quotes",
      "start_offset": 172,
      "end_offset": 178,
      "type": "<ALPHANUM>",
      "position": 18
    },
    {
      "token": "rates",
      "start_offset": 184,
      "end_offset": 189,
      "type": "<ALPHANUM>",
      "position": 19
    },
    {
      "token": "ueurusd",
      "start_offset": 192,
      "end_offset": 199,
      "type": "<ALPHANUM>",
      "position": 20
    },
    {
      "token": "decimal",
      "start_offset": 201,
      "end_offset": 208,
      "type": "<ALPHANUM>",
      "position": 21
    },
    {
      "token": "1.061025",
      "start_offset": 209,
      "end_offset": 217,
      "type": "<NUM>",
      "position": 22
    },
    {
      "token": "ugbpusd",
      "start_offset": 220,
      "end_offset": 227,
      "type": "<ALPHANUM>",
      "position": 23
    },
    {
      "token": "decimal",
      "start_offset": 229,
      "end_offset": 236,
      "type": "<ALPHANUM>",
      "position": 24
    },
    {
      "token": "1.494125",
      "start_offset": 237,
      "end_offset": 245,
      "type": "<NUM>",
      "position": 25
    },
    {
      "token": "ueurgbp",
      "start_offset": 248,
      "end_offset": 255,
      "type": "<ALPHANUM>",
      "position": 26
    },
    {
      "token": "decimal",
      "start_offset": 257,
      "end_offset": 264,
      "type": "<ALPHANUM>",
      "position": 27
    },
    {
      "token": "0.710150",
      "start_offset": 265,
      "end_offset": 273,
      "type": "<NUM>",
      "position": 28
    },
    {
      "token": "event",
      "start_offset": 277,
      "end_offset": 282,
      "type": "<ALPHANUM>",
      "position": 29
    },
    {
      "token": "accountinstrumentsupdated",
      "start_offset": 284,
      "end_offset": 309,
      "type": "<ALPHANUM>",
      "position": 30
    },
    {
      "token": "minute",
      "start_offset": 311,
      "end_offset": 317,
      "type": "<ALPHANUM>",
      "position": 31
    },
    {
      "token": "1449384420",
      "start_offset": 319,
      "end_offset": 329,
      "type": "<NUM>",
      "position": 32
    }
  ]
}

The phrase "Production User:None:Username:None:LOG: publishing to bus" corresponds to these tokens:

 {
      "token": "production",
      "start_offset": 41,
      "end_offset": 51,
      "type": "<ALPHANUM>",
      "position": 7
    },
    {
      "token": "user:none:username:none:log",
      "start_offset": 52,
      "end_offset": 79,
      "type": "<ALPHANUM>",
      "position": 8
    },
    {
      "token": "publishing",
      "start_offset": 81,
      "end_offset": 91,
      "type": "<ALPHANUM>",
      "position": 9
    },
    {
      "token": "to",
      "start_offset": 92,
      "end_offset": 94,
      "type": "<ALPHANUM>",
      "position": 10
    },
    {
      "token": "bus",
      "start_offset": 95,
      "end_offset": 98,
      "type": "<ALPHANUM>",
      "position": 11
    }

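So if you search for "publishing to bus", Elasticsearch matches the last three tokens above and returns the document.

If you search for "None:LOG: publishing to bus", the "None:LOG:" part is not an exact match for any token (the indexed token is the whole string user:none:username:none:log), so the document is not returned.

You can try "User:None:Username:None:LOG: publishing to bus" to get results.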
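
The matching behaviour described above can be sketched in a few lines of Python. This is only a rough model, not the real analyzer: rough_tokens is a hypothetical helper that splits on whitespace, trims leading and trailing punctuation and lowercases, which happens to reproduce the tokens listed above for this part of the message. phrase_matches (also hypothetical) then checks whether the query tokens appear as consecutive document tokens, which is the essence of a phrase search.

```python
import re


def rough_tokens(text):
    """Rough stand-in for the standard analyzer (an assumption, not the real thing):
    split on whitespace, trim punctuation at both ends, lowercase."""
    tokens = []
    for chunk in text.split():
        word = re.sub(r"^\W+|\W+$", "", chunk).lower()
        if word:
            tokens.append(word)
    return tokens


def phrase_matches(document, phrase):
    """True if the phrase's tokens occur as consecutive tokens in the document."""
    doc = rough_tokens(document)
    query = rough_tokens(phrase)
    n = len(query)
    return any(doc[i:i + n] == query for i in range(len(doc) - n + 1))


message = "common:INFO:ENV: Production User:None:Username:None:LOG: publishing to bus"

print(rough_tokens(message))
print(phrase_matches(message, "publishing to bus"))
print(phrase_matches(message, "None:LOG: publishing to bus"))
print(phrase_matches(message, "User:None:Username:None:LOG: publishing to bus"))
```

The second query fails because it produces the token none:log, which never appears in the document; only the full token user:none:username:none:log does.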

【Comments】:

  • A workaround is to replace the : character with _, or to declare the field as not analyzed in the index mapping.
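
As an illustration of the "not analyzed" option, here is a minimal sketch of what such a mapping might look like, built as a Python dict and printed as JSON. It assumes the Elasticsearch 1.x/2.x mapping syntax of that era ("type": "string" with "index": "not_analyzed"); the type name django_logger comes from the stored document above, while the sub-field name raw is purely a hypothetical choice.

```python
import json

# Assumption: Elasticsearch 1.x/2.x-style mapping, where an exact-value string
# field is declared with "index": "not_analyzed".
# "raw" is a hypothetical sub-field name; the analyzed "message" field is kept
# so that ordinary word searches continue to work.
mapping = {
    "django_logger": {
        "properties": {
            "message": {
                "type": "string",
                "fields": {
                    "raw": {"type": "string", "index": "not_analyzed"}
                },
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```

Note that a not-analyzed field is indexed as one single term, so it only matches the complete value; finding a fragment inside it would need a wildcard or regexp query rather than a phrase search.
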

【Answer 2】:

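Kibana has problems with some special characters such as :, | and -. When Kibana encounters one of these characters, it stores the text as separate parts rather than as a single field value. That is why it is easy to find publishing to bus, or None, or LOG. The solution is to tell Kibana that the field should not be analyzed.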

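【Comments】:

  • This isn't really an answer. All of these characters are present in a single Kibana field. I added the raw JSON stored in Kibana to the question to prove it.
  • You see it as one simple field, but Kibana does not actually handle the field that way. Try passing the field as a not-analyzed field.
  • What is stored and what is indexed are two different things. As asked before, can you post the mapping?
  • Try replacing : with _ and you'll see that the search works.
  • @Deckard27 So if I replace every ':' with '_' (before inserting into Kibana, using mutate, or on my side before logging), then everything will work as expected?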
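
On that last question, a hedged observation: in the token list from answer 1, user_id and event_id each survive as a single token, which suggests the standard analyzer does not split on underscores either, so swapping : for _ may just produce a different long token. Replacing the colon with a space before logging is a safer variant of the same idea. A minimal sketch, where split_colon_fields is a hypothetical helper name:

```python
import re


def split_colon_fields(line):
    """Replace every colon that sits directly between two non-space characters
    with a space, so each field becomes its own whitespace-separated word."""
    return re.sub(r"(?<=\S):(?=\S)", " ", line)


original = "User:None:Username:None:LOG: publishing to bus"
print(split_colon_fields(original))
# User None Username None LOG: publishing to bus
```

With the fields separated by spaces, a search such as "None LOG publishing to bus" would have a sequence of individual words to match against.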