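【Question Title】: Remove Duplicate Yara Rules with PowerShell Regular Expressions
【Posted】: 2014-11-10 17:58:02
【Question Description】:

Yara rules detect malware by applying regular expressions to files to find specific patterns in binaries. I keep all of my Yara rules in a single text file, and when I get new rules I just paste them at the end of that file. I am trying to write a PowerShell 2.0 script that parses my Yara rules and identifies and removes any duplicate entries.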

Here is the format of a Yara rule:

rule [name] { [content] }

Here is an example rule:

rule CrowdStrike_CSIT_14004_02 : loader backdoor bouncer  {
meta:
    description = "Deep Panda Compiled ASP.NET <http://ASP.NET> Webshell"
    last_modified = "2014-04-25"
    version = "1.0"
    report = "CSIT-14004"
    in_the_wild = true
    copyright = "CrowdStrike, Inc"
    actor = "DEEP PANDA"
   strings:
    $cookie = "zWiz\x00" wide
    $cp = "es-DN" wide
    $enum_fs1 = "File system: {0}" wide
    $enum_fs2 = "Available: {0} bytes" wide
    $enum_fs3 = "Total space: {0} bytes" wide
    $enum_fs4 = "Total size: {0} bytes" wide
   condition:
    ($cookie and $cp) or all of ($enum*)
}

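What I ultimately want is to remove all duplicates based on the rule name. However, if two rules share a name but have different content, I want to be able to choose which one to remove.

To do this, I plan to build an associative array where the rule name is the key and the rule content is the value. I want to parse all rules with a regular expression and add them to the associative array. If a rule (key) is already in the array, I want to either skip it (if the content is identical) or display both rules and choose which one to keep (if the content differs).

Once all rules have been processed, the associative array is written back to the file, with all duplicates eliminated.

Update: this works now. Here is the script: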

# Display proper usage and exit if no file is given
if ($args.Length -ne 1) { 
    Write-Host "`nUsage: .\yara-dedupe.ps1 [full-path-to-yara-rules]"
    exit    
}

# Display info and warning
Write-Host "`nNOTE: Use full path of rule file`n" 
$y = Read-Host "This script is EXPERIMENTAL and will modify $args. Backing up this file is recommended. If you still want to continue, enter (y)"
# Exit if y is not entered
if ($y -ne 'y') { exit}

# File path is passed on from the command line (first and only argument)
$FilePath = $args[0]

# This reads in the entire file as one string for multi-line matching
$File = [io.file]::ReadAllText($FilePath)
# Regular expression to separate the rule from the name
$Pattern = "(?smi)rule(.*?)\{(.*?condition:.*?)\}\r"
# All matching rules parsed according to the regular expression
$ParsedRules = $File | Select-String $Pattern -AllMatches

# A hash table (associative array) to store all rules
$Rules = @{}

# Add each non-duplicated rule to the hash table
$ParsedRules.Matches | Foreach { 

    # Extract rule name
    $Rule = $_.Groups[1].Value.Trim()
    #Extract rule content
    $Content = $_.Groups[2].Value.Trim()

    # Check if rule is already in the hash table
    if ($Rules.ContainsKey($Rule)) {

        Write-Host "Rule Exists: $Rule"

        # If it is, check if the content is identical and skip duplicate if it is
        if ($Rules.$Rule -eq $Content) { Write-Host "Skipping duplicate..." }

        # If it is not, then choose which one to accept
        else {
            # Display current rule content
            Write-Host "`nCurrent Rule Content[1]:"
            $Rules.$Rule
            # Display new rule content
            Write-Host "`nNew Rule Content[2]: $Content`n`n"
            # Ask user which rule content to keep
            $Choice = Read-Host 'Enter 1 to keep existing rule content, 2 to overwrite rule content with new rule content'
            # If choice was 1, continue to next rule
            if ($Choice -eq "1") { Write-Host "`nKeeping original content`n" }
            # Otherwise overwrite the existing rule content with the new rule content
            else { 
                $Rules.Set_Item($Rule,$Content)
                Write-Host "`nRule updated!`n"
            }
         }

    # Add the rule if it is not in the hash table
    } else { 
        $Rules.$Rule = $Content
        Write-Host "Rule Added: $Rule"
    } 
}

# Erase current file
Clear-Content $FilePath

# Output the hash table to file
$Rules.GetEnumerator() | ForEach-Object { Add-Content $FilePath "rule $($_.Key) {`n $($_.Value) `n}" }
Write-Host "De-duplication complete. New rules located at $FilePath"
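
One side effect of the script above is that a hash table created with @{} does not guarantee enumeration order, so the rules may be written back in a different order than they appeared in the file. If the original order matters, a minimal sketch of an alternative is shown below. It assumes the .NET OrderedDictionary class, which is usable from PowerShell 2.0; the $Parsed sample data and rule names are hypothetical stand-ins for the regex matches.

```powershell
# OrderedDictionary keeps keys in the order they were first inserted
$Rules = New-Object System.Collections.Specialized.OrderedDictionary

# Hypothetical sample data standing in for the parsed rules
$Parsed = @(
    @{ Name = 'Rule_B'; Content = 'condition: true' },
    @{ Name = 'Rule_A'; Content = 'condition: false' },
    @{ Name = 'Rule_B'; Content = 'condition: true' }   # exact duplicate
)

foreach ($p in $Parsed) {
    # OrderedDictionary exposes Contains(), not ContainsKey()
    if (-not $Rules.Contains($p.Name)) {
        $Rules.Add($p.Name, $p.Content)
    }
}

# Keys are enumerated in insertion order
$Names = @($Rules.Keys)
$Names
```

The prompt for conflicting content is left out here for brevity; the same Contains() check is where that branch would go.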

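【Comments】:

  • So rule .. { .. } is the literal text you are searching for?
  • I would check the PowerShell documentation for how to index the $matches variable. Something like a regular index construct, or $Matches{1}.

Tags: regex parsing powershell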


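【Solution 1】:

You can do it this way, using MULTILINE and DOTALL modes.
It is tricky because {} can appear inside a rule, but if you stick to
a single set of delimiter constraints, it should work fine.

Capture group 1 is the name and group 2 is the rule body; surrounding whitespace is trimmed as well.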

 #  (?sm)^rule\s+([^{}]+?)\s*\{\s*(.+?)\s*\}$

 (?sm)                    # Dot-All and Multi-line modifiers, if your engine supports inline flags;
                          # otherwise, set them in the flags option of the regex object.

 ^                        # Open Delimiter = BOL + 'rule' + name + '{'
 rule \s+ 
 ( [^{}]+? )              # (1), Rule name
 \s* 
 \{                       # '{'
 \s* 
 ( .+? )                  # (2), the rule, ungreedy
 \s* 
 \}                       # '}'
 $                        # Close Delimiter = '}' + EOL                     

Output:

 **  Grp 1 -  ( pos 5 , len 51 ) 
CrowdStrike_CSIT_14004_02 : loader backdoor bouncer  
 **  Grp 2 -  ( pos 61 , len 552 ) 
meta:
    description = "Deep Panda Compiled ASP.NET <http://ASP.NET> Webshell"
    last_modified = "2014-04-25"
    version = "1.0"
    report = "CSIT-14004"
    in_the_wild = true
    copyright = "CrowdStrike, Inc"
    actor = "DEEP PANDA"
   strings:
    $cookie = "zWiz\x00" wide
    $cp = "es-DN" wide
    $enum_fs1 = "File system: {0}" wide
    $enum_fs2 = "Available: {0} bytes" wide
    $enum_fs3 = "Total space: {0} bytes" wide
    $enum_fs4 = "Total size: {0} bytes" wide
   condition:
    ($cookie and $cp) or all of ($enum*)  
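
As a rough illustration of how the pattern above might be applied from PowerShell, the sketch below runs it over two made-up rules (Sample_A and Sample_B are hypothetical). One adaptation is an assumption on my part: in .NET, $ in multiline mode matches only before \n, so an optional \r is added before it to cope with Windows line endings.

```powershell
# Sample input: the first rule has braces inside a string literal
$Text = @'
rule Sample_A : tag1 {
strings:
    $a = "File system: {0}" wide
condition:
    $a
}
rule Sample_B {
condition:
    true
}
'@

# (?sm): '.' also matches newlines, and ^/$ match at line boundaries.
# \r? is an added assumption so the pattern also matches CRLF text.
$Pattern = '(?sm)^rule\s+([^{}]+?)\s*\{\s*(.+?)\s*\}\r?$'

$Found = [regex]::Matches($Text, $Pattern)
foreach ($m in $Found) {
    $Name = $m.Groups[1].Value   # rule name, including any tags
    $Body = $m.Groups[2].Value   # rule body without the outer braces
    Write-Host "Name: $Name"
    Write-Host "Body: $Body"
}
```

The closing brace is recognized only when it ends a line, which is why the {0} inside the string does not terminate the first rule early.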
