【Question title】: Match a column and delete the duplicates in shell
【Posted】: 2022-01-18 03:29:41
【Question】:

Input file

Failed,2021-12-14 05:47 EST,On-Demand Backup,abc,/clients/FORD_1130PM_EST_Windows2008,Windows File System
Completed,2021-12-14 05:47 EST,On-Demand Backup,def,/clients/FORD_1130PM_EST_Windows2008,Windows File System
Failed,2021-12-13 19:33 EST,Scheduled Backup,def,/clients/FORD_730PM_EST_Windows2008,Windows File System  
Failed,2021-12-14 00:09 EST,Scheduled Backup,abc,/clients/FORD_1130PM_EST_Windows2008,Windows File System
Failed,2021-12-14 00:09 EST,Scheduled Backup,ghi,/clients/FORD_1130PM_EST_Windows2008,Windows File System

Expected output

Failed,2021-12-14 00:09 EST,Scheduled Backup,ghi,/clients/FORD_1130PM_EST_Windows2008,Windows File System

I only want the clients that never completed successfully and never had an on-demand backup run for them.

The code I tried

awk -F ',' '
   $1~/Failed/  { fail[$4]=$0 }
  $1~/Completed/ {delete fail[$4]}
 $3 ~ /Demand/ {delete fail[$4]}
END {for (i in fail) print fail[i]}     
 ' test

【Discussion】:

    Tags: arrays bash shell awk multiple-columns


    【Solution 1】:

    Here is a Ruby version that handles multiple entries per client (if any) and CSV quirks such as embedded commas:

    ruby -r csv -e '
    BEGIN{hsh = Hash.new {|hash,key| hash[key] = []}
          data = Hash.new {|hash,key| hash[key] = []}
    }
    CSV.parse($<.read).each{ |r|    hsh[r[3]] << r[0]; hsh[r[3]] << r[2]
                                    data[r[3]] << r.to_csv
                            }
    END{hsh.each{|k,v| s=v.join("\t")
        puts data[k].join() if !s[/Completed|Demand/] }
    }' file
    

    Prints:

    Failed,2021-12-14 00:09 EST,Scheduled Backup,ghi,/clients/FORD_1130PM_EST_Windows2008,Windows File System
    


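      【Solution 2】:

      With your shown samples, please try the following awk program. It makes a single pass over Input_file and prints only the clients that have a Failed entry and no On-Demand entry. Note that it does not test for Completed on its own; it gives the expected output here because the only Completed line in the sample is also an On-Demand one.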

      awk '
      BEGIN         { FS=OFS=","  }
      $1=="Failed"  { arr1[$4]=$0 }
      $3~/On-Demand/{ arr2[$4]    }
      END{
        for(key in arr1){
          if(!(key in arr2)){
            print arr1[key]
          }
        }
      }
      ' Input_file
      

      Explanation: a detailed, line-by-line explanation of the above.

      awk '                           ##Starting awk program from here.
      BEGIN         { FS=OFS=","  }   ##Starting BEGIN section and setting FS and OFS to , here.
      $1=="Failed"  { arr1[$4]=$0 }   ##Checking if 1st field is Failed then create arr1 with 4th field as an index and value of whole line.
      $3~/On-Demand/{ arr2[$4]    }   ##Checking if 3rd field is On-Demand then create arr2 array with index of 4th field.
      END{                            ##Starting END block of this program from here.
        for(key in arr1){             ##Traversing through arr1 here.
          if(!(key in arr2)){         ##Checking condition if key is NOT present in arr2 then do following.
            print arr1[key]           ##Printing arr1 value with index of key here.
          }
        }
      }
      ' Input_file                    ##Mentioning Input_file here.
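
The program above only excludes clients with an On-Demand entry. A minimal sketch of a single-pass variant that also excludes clients with any Completed entry is shown below. It is not part of the original answer: the temporary file and the array names `fail` and `skip` are assumptions made for illustration, and the data is the sample input from the question.

```shell
#!/bin/sh
# Hypothetical temporary file holding the sample input from the question.
input=$(mktemp)
cat > "$input" <<'EOF'
Failed,2021-12-14 05:47 EST,On-Demand Backup,abc,/clients/FORD_1130PM_EST_Windows2008,Windows File System
Completed,2021-12-14 05:47 EST,On-Demand Backup,def,/clients/FORD_1130PM_EST_Windows2008,Windows File System
Failed,2021-12-13 19:33 EST,Scheduled Backup,def,/clients/FORD_730PM_EST_Windows2008,Windows File System
Failed,2021-12-14 00:09 EST,Scheduled Backup,abc,/clients/FORD_1130PM_EST_Windows2008,Windows File System
Failed,2021-12-14 00:09 EST,Scheduled Backup,ghi,/clients/FORD_1130PM_EST_Windows2008,Windows File System
EOF

result=$(awk -F, '
  $1 == "Failed"                        { fail[$4] = $0 }  # keep the last Failed line per client
  $1 == "Completed" || $3 ~ /On-Demand/ { skip[$4] }       # referencing the element creates it
  END {
    for (key in fail)            # every client that failed at least once
      if (!(key in skip))        # ...and was never Completed or On-Demand
        print fail[key]
  }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because the exclusions are collected in a separate array and only applied in the END block, the result does not depend on the order of the lines in the file.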
      


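        【Solution 3】:

        You can use this awk command, which reads the file twice: the first pass (NR==FNR) records the Failed line for each client, and the second pass deletes every client that has a Completed or On-Demand entry: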

        awk -F, 'NR==FNR {if ($1~/Failed/) fail[$4] = $0; next}
        $1 ~ /Completed/ || $3 ~ /Demand/ {delete fail[$4]}
        END {for (i in fail) print fail[i]}' file file
        
        Failed,2021-12-14 00:09 EST,Scheduled Backup,ghi,/clients/FORD_1130PM_EST_Windows2008,Windows File System
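
To see why the second pass matters, the sketch below counts how many clients each approach reports on the sample input from the question. It is an illustration only: the temporary file and the variable names `one_pass` and `two_pass` are assumptions, and the END blocks are changed to print a count instead of the lines.

```shell
#!/bin/sh
# Hypothetical temporary file holding the sample input from the question.
input=$(mktemp)
cat > "$input" <<'EOF'
Failed,2021-12-14 05:47 EST,On-Demand Backup,abc,/clients/FORD_1130PM_EST_Windows2008,Windows File System
Completed,2021-12-14 05:47 EST,On-Demand Backup,def,/clients/FORD_1130PM_EST_Windows2008,Windows File System
Failed,2021-12-13 19:33 EST,Scheduled Backup,def,/clients/FORD_730PM_EST_Windows2008,Windows File System
Failed,2021-12-14 00:09 EST,Scheduled Backup,abc,/clients/FORD_1130PM_EST_Windows2008,Windows File System
Failed,2021-12-14 00:09 EST,Scheduled Backup,ghi,/clients/FORD_1130PM_EST_Windows2008,Windows File System
EOF

# One pass, as in the question: a later Failed line re-adds a client
# that an earlier Completed or On-Demand line already deleted.
one_pass=$(awk -F, '
  $1 ~ /Failed/    { fail[$4] = $0 }
  $1 ~ /Completed/ { delete fail[$4] }
  $3 ~ /Demand/    { delete fail[$4] }
  END { n = 0; for (i in fail) n++; print n }
' "$input")

# Two passes: collect every Failed line first, then apply the deletions.
two_pass=$(awk -F, '
  NR == FNR { if ($1 ~ /Failed/) fail[$4] = $0; next }
  $1 ~ /Completed/ || $3 ~ /Demand/ { delete fail[$4] }
  END { n = 0; for (i in fail) n++; print n }
' "$input" "$input")

echo "one pass: $one_pass client(s) reported"
echo "two passes: $two_pass client(s) reported"
rm -f "$input"
```

On this data the one-pass version reports three clients (abc, def and ghi), because the Scheduled Backup failures for abc and def appear after the lines that deleted them, while the two-pass version reports only ghi.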
        
