【Question Title】: Not able to drop event where grok filter does not match (logstash, elasticsearch)
【Posted】: 2016-03-30 00:28:28
【Question Description】:

I am trying to parse Tomcat logs and send the output to Elasticsearch. By and large this works, but when I look at the indexed data in Elasticsearch, it contains many documents whose tags field is _grokparsefailure, which leads to a lot of duplicated data. To avoid this, I try to drop the event if its tags contain _grokparsefailure; that conditional is written in the logstash.conf file right below the grok filter. Yet the Elasticsearch output still contains indexed documents tagged with _grokparsefailure. When grok fails, I don't want that event to reach Elasticsearch at all, since it causes duplicate data there.

The logstash.conf file is:

input {
  file {
    path => "/opt/elasticSearch/logstash-1.4.2/input.log"
    codec => multiline {
      pattern => "^\["
      negate => true
      what => previous
    }
    start_position => "end"
  }
}

filter {
  grok {
    match => [
      "message", "^\[%{GREEDYDATA}\] %{GREEDYDATA} Searching hotels for country %{GREEDYDATA:country}, city %{GREEDYDATA:city}, checkin %{GREEDYDATA:checkin}, checkout %{GREEDYDATA:checkout}, roomstay %{GREEDYDATA:roomstay}, No. of hotels returned is %{NUMBER:hotelcount} ."
    ]
  }

  if "_grokparsefailure" in [tags] {
    drop { }
  }
}

output {
  file {
    path => "/opt/elasticSearch/logstash-1.4.2/output.log"
  }

  elasticsearch {
    cluster => "elasticsearchdev"
  }
}

Elasticsearch response from http://172.16.37.97:9200/logstash-2015.12.23/_search?pretty=true:

The output below contains three documents, the first of which has _grokparsefailure in its _source -> tags field.

I don't want it to appear in this output, so it probably needs to be filtered out in Logstash so that it never reaches Elasticsearch.

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2015.12.23",
      "_type" : "logs",
      "_id" : "J6CoEhKaSE68llz5nEbQSQ",
      "_score" : 1.0,
      "_source":{"message":"[2015-12-23 12:08:40,124] ERROR http-80-5_@{AF3AF784EC08D112D5D6FC92C78B5161,127.0.0.1,1450852688060} com.mmt.hotels.web.controllers.search.HotelsSearchController - Searching hotels for country IN, city DEL, checkin 28-03-2016, checkout 29-03-2016, roomstay 1e0e, No. of hotels returned is 6677 .","@version":"1","@timestamp":"2015-12-23T14:17:03.436Z","host":"ggn-37-97","path":"/opt/elasticSearch/logstash-1.4.2/input.log","tags":["_grokparsefailure"]}
    }, {
      "_index" : "logstash-2015.12.23",
      "_type" : "logs",
      "_id" : "2XMc6nmnQJ-Bi8vxigyG8Q",
      "_score" : 1.0,
      "_source":{"@timestamp":"2015-12-23T14:17:02.894Z","message":"[2015-12-23 12:08:40,124] ERROR http-80-5_@{AF3AF784EC08D112D5D6FC92C78B5161,127.0.0.1,1450852688060} com.mmt.hotels.web.controllers.search.HotelsSearchController - Searching hotels for country IN, city DEL, checkin 28-03-2016, checkout 29-03-2016, roomstay 1e0e, No. of hotels returned is 6677 .","@version":"1","host":"ggn-37-97","path":"/opt/elasticSearch/logstash-1.4.2/input.log","country":"IN","city":"DEL","checkin":"28-03-2016","checkout":"29-03-2016","roomstay":"1e0e","hotelcount":"6677"}
    }, {
      "_index" : "logstash-2015.12.23",
      "_type" : "logs",
      "_id" : "fKLqw1LJR1q9YDG2yudRDw",
      "_score" : 1.0,
      "_source":{"@timestamp":"2015-12-23T14:16:12.684Z","message":"[2015-12-23 12:08:40,124] ERROR http-80-5_@{AF3AF784EC08D112D5D6FC92C78B5161,127.0.0.1,1450852688060} com.mmt.hotels.web.controllers.search.HotelsSearchController - Searching hotels for country IN, city DEL, checkin 28-03-2016, checkout 29-03-2016, roomstay 1e0e, No. of hotels returned is 6677 .","@version":"1","host":"ggn-37-97","path":"/opt/elasticSearch/logstash-1.4.2/input.log","country":"IN","city":"DEL","checkin":"28-03-2016","checkout":"29-03-2016","roomstay":"1e0e","hotelcount":"6677"}
    } ]
  }
}


【Question Discussion】:

    Tags: elasticsearch logstash logstash-grok


    【Solution 1】:

    You can try testing for _grokparsefailure in the output section, like this:

    output {
      if "_grokparsefailure" not in [tags] {
        file {
          path => "/opt/elasticSearch/logstash-1.4.2/output.log"
        }
    
        elasticsearch {
          cluster => "elasticsearchdev"
        }
      }
    }
    

    【Discussion】:

    • Tried that, but it doesn't work. The output still has _grokparsefailure in the tags.
    • Did you wipe the index before trying? That is, are you sure you are looking at new documents rather than old ones?
    • Yes, I did. I deleted the index with this command: curl -XDELETE 'localhost:9200/logstash-2015.12.24'
    • OK, can you make sure you are not looking at documents from past indices, i.e. logstash-2015.12.23 or logstash-2015.12.22?
    • The documents are only from the current day. I modified my input.log with some dummy text to make sure it picks up the latest changes; that dummy text still shows up in the output.
    【Solution 2】:

    Sometimes you may have multiple grok filters, some of which fail on certain events while the rest succeed; in that case, dropping events based on _grokparsefailure does not solve the problem.

    Example:

    input {
      # some input
    }

    filter {
      grok { ... }   # extract an IP into my_ip1
      grok { ... }   # extract an IP into my_ip2
      grok { ... }   # extract an IP into my_ip3
    }

    output {
      if "_grokparsefailure" not in [tags] {   # <-- this will not write to output if any single grok fails
        # some output
      }
    }
    

    My solution here is to filter based on some of the extracted fields instead. Is there a better way? Example:

    if "10." in [ip1] or "10." in [ip2] or "10." in [ip3] {
      drop { }
    }
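
    A possible alternative (my addition, not from the original answers; the patterns and tag names are illustrative) is to give each grok filter its own failure tag via grok's `tag_on_failure` option. Then you can tell exactly which grok failed and decide per-tag whether to keep the event:

    filter {
      grok {
        match => ["message", "%{IP:my_ip1}"]    # illustrative pattern
        tag_on_failure => ["grok1_failure"]     # custom tag instead of _grokparsefailure
      }
      grok {
        match => ["message", "%{IP:my_ip2}"]
        tag_on_failure => ["grok2_failure"]
      }
    }

    output {
      # write the event unless every grok failed
      if "grok1_failure" not in [tags] or "grok2_failure" not in [tags] {
        elasticsearch {
          cluster => "elasticsearchdev"
        }
      }
    }

    With per-grok tags, a single failing grok no longer suppresses an event that other groks parsed successfully.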
    

    【Discussion】:
