【Question Title】: How to group by column values into row and column headers and then sum the columns dynamically [duplicate]
【Posted】: 2016-12-26 13:04:31
【Question Description】:

Below are my input and output .txt files.

I want to group the data by StatusDate and Method, and then compute the sums per StatusDate and Method.

Input.txt

No,Date,MethodStatus,Key,StatusDate,Hit,CallType,Method,LastMethodType
112,12/15/16,Suceess,Geo,12/15/16,1,Static,GET,12/15/16
113,12/18/16,Suceess,Geo,12/18/16,1,Static,GET,12/18/16
114,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
115,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
116,12/19/16,Suceess,Geo,12/19/16,1,Static,PUT,12/19/16
117,12/19/16,Suceess,Geo,12/19/16,1,Static,PUT,12/19/16
118,12/19/16,Waiting,Geo,12/19/16,1,Static,GET,12/19/16
119,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
120,12/17/16,Suceess,Geo,12/17/16,1,Static,GET,12/17/16
121,12/17/16,Suceess,Geo,12/17/16,1,Static,GET,12/17/16
130,12/16/16,Suceess,Geo,12/16/16,1,Static,GET,12/16/16

Out.txt

StatusDate,12/15/16,12/16/16,12/17/16,12/17/16,12/18/16,12/19/16,12/19/16,12/19/16,12/19/16,12/19/16,12/19/16,Grand Total
GET,1,1,1,1,1,1,1,1,1,,,9
PUT,,,,,,,,,,1,1,2
Grand Total,1,1,1,1,1,1,1,1,1,1,1,11

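I am using awk: I split the data with awk -F, '{if($8=="GET") print }' and then compute the sums. Because the file is large, this is slow.

Is it possible to do everything in one step, so that fewer passes over the file are needed?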

【Question Comments】:

    Tags: java linux shell unix awk


    【Solution 1】:

    You can use a GNU awk script like this:

    script.awk

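    # GNU awk only: visit array indices in ascending string order, so the
    # date columns come out sorted (MM/DD/YY strings sort correctly within one year)
    BEGIN { PROCINFO["sorted_in"] = "@ind_str_asc" }
    
    # count one record for theDate in the per-method array mem and in Totals
    # (this counts rows; use the Hit field, $6, instead of 1 if Hit can differ from 1)
    function remember( theDate, mem ) {
        mem[   theDate] += 1
        # Totals stores the column sum for each date seen (i.e. the columns)
        Totals[theDate] += 1
    }
    
    # header is 1 for the first output line (the dates) and 0 for data lines
    # OFS is used, so it is possible to use a command-line option like
    # -v OFS='\t' or -v OFS=','
    # k and sum are extra parameters, which makes them local to the function
    function printMem( mem, name, header,    k, sum ) {
        printf("%s%s", name, OFS)
        sum = 0
        for( k in Totals ) {
            if( header )
                printf("%s%s", k, OFS)
            else {
                # a date missing from mem prints as an empty cell
                printf("%s%s", mem[k], OFS)
                sum += mem[k]
            }
        }
        if( !header )
            printf("%s", sum)
        else
            printf("Grand Total")
        print ""
    }
    
    # different methods are stored in different arrays
    # field 5 is StatusDate (identical to Date, field 2, in the sample input)
    $8 == "GET" { remember( $5, get ) }
    $8 == "PUT" { remember( $5, put ) }
    
    END { # print the stored values
          # the first line: header with the dates
          printMem( Totals , "StatusDate", 1)
          printMem( get    , "GET", 0)
          printMem( put    , "PUT", 0)
          # the summary line
          printMem( Totals , "Grand Total", 0)
        }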
    

    Run the script like this (note the -f option, which tells awk to read the program from script.awk): awk -F, -v OFS=',' -f script.awk Input.txt
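
The script above names GET and PUT explicitly and counts rows. If the set of methods is not known in advance, the same single pass can collect them dynamically. The following is only a sketch under some assumptions: the input has one header line, StatusDate is field 5, Hit is field 6 and Method is field 8, and the function names pivot_by_method and isort are made up for this example. It sums the Hit field rather than counting rows, and it sorts with a small insertion sort so that it does not depend on GNU awk.

```shell
#!/bin/sh
# Hypothetical helper: pivot Hit sums with one row per Method and
# one column per distinct StatusDate, reading the file only once.
pivot_by_method() {
    awk -F, -v OFS=, '
    # insertion sort of a[1..n] using string comparison; i, j, t are locals
    function isort(a, n,    i, j, t) {
        for (i = 2; i <= n; i++) {
            t = a[i]
            for (j = i - 1; j >= 1 && (a[j] "") > (t ""); j--)
                a[j + 1] = a[j]
            a[j + 1] = t
        }
    }
    NR == 1 || NF < 8 { next }            # skip the header and short lines
    {
        date = $5; hit = $6; method = $8
        if (!(date in colTotal))   dates[++nd] = date       # new column
        if (!(method in rowTotal)) methods[++nm] = method   # new row
        cell[method, date] += hit
        rowTotal[method]   += hit
        colTotal[date]     += hit
        grand              += hit
    }
    END {
        isort(dates, nd); isort(methods, nm)
        line = "StatusDate"
        for (i = 1; i <= nd; i++) line = line OFS dates[i]
        print line OFS "Grand Total"
        for (m = 1; m <= nm; m++) {
            line = methods[m]
            for (i = 1; i <= nd; i++) {
                key = methods[m] SUBSEP dates[i]
                # leave the cell empty when this method has no hits on this date
                line = line OFS ((key in cell) ? cell[key] : "")
            }
            print line OFS rowTotal[methods[m]]
        }
        line = "Grand Total"
        for (i = 1; i <= nd; i++) line = line OFS colTotal[dates[i]]
        print line OFS grand
    }' "$1"
}
```

Called as pivot_by_method Input.txt on the sample data, this should print one column per distinct date (12/15/16 through 12/19/16), a GET row of GET,1,1,2,1,4,9, a PUT row of PUT,,,,,2,2 and a final line Grand Total,1,1,2,1,6,11. Unlike the Out.txt in the question, repeated dates are merged into a single column. The string sort orders MM/DD/YY dates correctly only within one year.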

    【Comments】:
