[Question title]: Active Directory logs parsing using grok is slow
[Posted]: 2023-03-09 07:10:02
[Question]:

I'm new to this. I'm trying to parse Microsoft Active Directory logs with a grok parser, using the Java grok library.

The logs look like this:

<13> 10.200.3.7  10.20.211.15 07/04/2017 15:34:00 PM SERVER01 07/04/2017 15:34:00 PM  LogName=Security  SourceName=Microsoft Windows security auditing.  EventCode=4624  EventType=0  Type=Information  ComputerName=SERVER01.network.local  TaskCategory=Logon  OpCode=Info  RecordNumber=1809490942  Keywords=Audit Success  Message=An account was successfully logged on.      Subject:     Security ID:        S-1-0-0     Account Name:       User-330    Account Domain:     -       Logon ID:       0x0      Logon Type:            3      New Logon:       Security ID:        S-1-5-18    Account Name:       SERVER01$       Account Domain:     DOMAIN      Logon ID:       0x12393ab39     Logon GUID:     \{C893D0A2-6498-BBE3-560D-0A1088FA4D9E\}      Process Information:      Process ID:     0x0     Process Name:       -      Network Information:     Workstation Name:       Source Network Address: 1.68.4.213      Source Port:        57261      Detailed Authentication Information:     Logon Process:      Kerberos    Authentication Package: Kerberos    Transited Services: -       Package Name (NTLM only):   -       Key Length:     0      This event is generated when a logon session is created. It is generated on the computer that was accessed.      The subject fields indicate the account on the local system which requested the logon. This is most commonly a service such as the Server service, or a local process such as Winlogon.exe or Services.exe.      The logon type field indicates the kind of logon that occurred. The most common types are 2 (interactive) and 3 (network).      The New Logon fields indicate the account for whom the new logon was created, i.e. the account that was logged on.      The network fields indicate where a remote logon request originated. Workstation name is not always available and may be left blank in some cases.      The authentication information fields provide detailed information about this specific logon request.      
- Logon GUID is a unique identifier that can be used to correlate this event with a KDC event.      - Transited services indicate which intermediate services have participated in this logon request.      - Package name\
<13> 10.200.3.7  10.20.211.15 07/04/2017 15:34:00 PM SERVER01 07/04/2017 15:34:00 PM  LogName=Security  SourceName=Microsoft Windows security auditing.  EventCode=4624  EventType=0  Type=Information  ComputerName=SERVER01.network.local  TaskCategory=Logon  OpCode=Info  RecordNumber=1809490942  Keywords=Audit Success  Message=An account was successfully logged on.      Subject:     Security ID:        S-1-0-0     Account Name:       User-331    Account Domain:     -       Logon ID:       0x0      Logon Type:            3      New Logon:       Security ID:        S-1-5-18    Account Name:       SERVER01$       Account Domain:     DOMAIN      Logon ID:       0x12393ab39     Logon GUID:     \{C893D0A2-6498-BBE3-560D-0A1088FA4D9E\}      Process Information:      Process ID:     0x0     Process Name:       -      Network Information:     Workstation Name:       Source Network Address: 1.68.4.214      Source Port:        57261      Detailed Authentication Information:     Logon Process:      Kerberos    Authentication Package: Kerberos    Transited Services: -       Package Name (NTLM only):   -       Key Length:     0      This event is generated when a logon session is created. It is generated on the computer that was accessed.      The subject fields indicate the account on the local system which requested the logon. This is most commonly a service such as the Server service, or a local process such as Winlogon.exe or Services.exe.      The logon type field indicates the kind of logon that occurred. The most common types are 2 (interactive) and 3 (network).      The New Logon fields indicate the account for whom the new logon was created, i.e. the account that was logged on.      The network fields indicate where a remote logon request originated. Workstation name is not always available and may be left blank in some cases.      The authentication information fields provide detailed information about this specific logon request.      
- Logon GUID is a unique identifier that can be used to correlate this event with a KDC event.      - Transited services indicate which intermediate services have participated in this logon request.      - Package name\

My grok pattern is:

\<%{USER:hField1}\> %{IPV4:hIp1}  %{IPV4:hIp2} %{DATESTAMP_12HOUR:hTime1;date;dd/MM/yyyy hh:mm:ss a} %{USER:hField2} %{DATESTAMP_12HOUR:hTime2;date;dd/MM/yyyy hh:mm:ss a}  LogName=%{USER:logname}%{SPACE}SourceName=%{GREEDYDATA:sourceName}%{SPACE}EventCode=%{GREEDYDATA:eventCode}%{SPACE}EventType=%{GREEDYDATA:eventType}%{SPACE}Type=%{GREEDYDATA:typeField}%{SPACE} ComputerName=%{GREEDYDATA:computerName}%{SPACE}TaskCategory=%{GREEDYDATA:taskCategory}%{SPACE}OpCode=%{GREEDYDATA:opCode}%{SPACE}RecordNumber=%{GREEDYDATA:recordNumber}%{SPACE}Keywords=%{GREEDYDATA:keywords}%{SPACE}Message=%{NON_DOT_DELIMITER:message}.%{SPACE}%{GREEDYDATA:jsonData}

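The problem is that it is very slow compared with my custom Java parser. My custom Java parser takes 2.5 seconds to parse 50K records, whereas parsing the same data with the grok pattern takes 60 seconds.

Is there something wrong with my parser?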

[Comments]:

  • Are you using the same regex pattern for both?
  • The sample log I shared contains two records, and I apply grok to each record separately.

Tags: java parsing active-directory grok


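[Answer 1]:

As with any regular expression, you get a speed-up when the regex engine doesn't have to guess. So you may see a decent gain by wrapping your grok pattern in anchors: ^ (start of line) and $ (end of line).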
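
To illustrate the anchoring idea, here is a minimal sketch using plain java.util.regex rather than the grok library (grok patterns are compiled down to regular expressions, so the same principle should carry over). The class name, the simplified pattern, and the shortened log line are assumptions made for this example and are not taken from the question. With the ^ anchor, a line that does not start with the expected prefix can be rejected without trying the whole pattern from every offset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchoredPatternSketch {
    // Simplified stand-in (an assumption) for the head of the grok pattern:
    // "<pri> ip  ip ... EventCode=NNNN"
    static final String BODY =
        "<(\\d+)> (\\d{1,3}(?:\\.\\d{1,3}){3})\\s+(\\d{1,3}(?:\\.\\d{1,3}){3}) .*?EventCode=(\\d+)";

    // Unanchored: find() may attempt the full pattern at every start offset.
    static final Pattern UNANCHORED = Pattern.compile(BODY);

    // Anchored: the match must begin at the start of the line and run to its end.
    static final Pattern ANCHORED = Pattern.compile("^" + BODY + ".*$");

    // Returns the captured EventCode, or null when the line does not match.
    static String eventCode(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? m.group(4) : null;
    }

    public static void main(String[] args) {
        // Shortened version of the sample record from the question.
        String line = "<13> 10.200.3.7  10.20.211.15 07/04/2017 15:34:00 PM SERVER01 "
                    + "LogName=Security  EventCode=4624  EventType=0";
        System.out.println(eventCode(UNANCHORED, line));
        System.out.println(eventCode(ANCHORED, line));
        System.out.println(eventCode(ANCHORED, "not a syslog line"));
    }
}
```

Both patterns extract the same value from a well-formed record; the difference shows up mainly on lines that fail to match, so it is worth timing both variants against your own data before drawing conclusions.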

[Comments]:

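[Answer 2]:

Does your custom Java parser use Java regular expressions? Take a look at the issue opened in your grok library's GitHub project here. It seems some changes have been made.

[Comments]: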
