【Question title】: Add Extra Strings Based on count of fields - Sed/Awk
【Posted】: 2021-08-16 14:21:39
【Question】:

I have data in the following format in a text file.

 null,"ABC:MNO"
"hjgy","ABC:PQR"
"mn","qwe","ABC:WER"
"mn","qwe","mno","ABC:WER"

All rows should have 3 fields like row 3. I want the data in the following format.

"","","","ABC:MNO"
"hjgy","","","ABC:PQR"
"mn","qwe","","ABC:WER"
"mn","qwe","mno","ABC:WER" 

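If the row starts with null, then null should be replaced with "","","",

If there are only 2 fields, then "","", should be added after the first string.

If there are 3 fields, then "", should be added after the second string.

If there are 4 fields, then do nothing.

I can handle the first case with sed 's/null/\"\",\"\",\"\"/' test.txt

But I don't know how to handle the next 2 scenarios.

Regards.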

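【Comments】:

  • Can the field content contain a , inside the double quotes? For example: "abc,xyz"?
  • No.. ideally it shouldn't. But I'm not sure about corner cases.
  • No, there won't be.
  • When you say All rows should have 3 fields like row 3 - don't you mean All rows should have 4 fields like row 4?

Tags: shell unix awk sed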


【Solution 1】:

perl:

$ perl -pe 's/^null,/"","","",/; s/.*,\K/q("",) x (3 - tr|,||)/e' ip.txt
"","","","ABC:MNO"
"hjgy","","","ABC:PQR"
"mn","qwe","","ABC:WER"
"mn","qwe","mno","ABC:WER"
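  • s/^null,/"","","",/ takes care of the null field first
  • .*,\K matches up to the last , in the line
    • \K helps to avoid having to put the matched portion back
    • 3 - tr|,|| tells you how many fields are missing (the return value of tr is the number of occurrences of ,)
    • q("",) here q() is used to represent a single-quoted string, so " doesn't need escaping
    • x is the string repetition operator
    • the e flag allows you to use Perl code in the replacement section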
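
The comma-counting step can be checked in isolation. Below is a minimal sketch in shell, assuming awk is available; it uses the return value of awk's gsub in place of Perl's tr, and the function name count_missing is hypothetical (not part of the answer).

```shell
# count_missing (hypothetical helper): print how many "", fields a line
# lacks, assuming the target is 3 separators, i.e. 4 fields.
count_missing() {
  # gsub replaces every comma with itself and returns the number of
  # replacements, which is the comma count.
  printf '%s\n' "$1" | awk '{ n = gsub(/,/, "&"); print 3 - n }'
}

count_missing '"hjgy","ABC:PQR"'       # prints 2
count_missing '"mn","qwe","ABC:WER"'   # prints 1
```

That number is what the x operator receives as its repeat count in the Perl one-liner.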

If lines starting with null, always have two fields, then you can also use:

perl -pe 's/.*,\K/q("",) x (3 - tr|,||)/e; s/^null,/"",/'

Similar logic with awk:

awk -v q='"",' 'BEGIN{FS=OFS=","} {sub(/^null,/, q q q);
                c=4-NF; while (c--) $NF = q $NF} 1'
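
For experimenting, the same idea can be wrapped in a shell function that reads from standard input. This is only a sketch under a few assumptions: the function name pad_fields and the variable n (the target field count, defaulting to 4) are hypothetical, a leading null, is turned into a single empty field before padding, and the loop test is written as c-- > 0 so that a line with more than n fields is left alone instead of looping forever.

```shell
# pad_fields (hypothetical helper): pad comma-separated lines read from
# stdin to n fields (default 4) by adding "", before the last field.
pad_fields() {
  awk -v n="${1:-4}" -v q='"",' 'BEGIN { FS = OFS = "," }
  {
    sub(/^null,/, q)              # leading null becomes one empty field
    c = n - NF                    # missing fields; <= 0 means nothing to add
    while (c-- > 0) $NF = q $NF   # prepend "", to the last field
  }
  1'
}

printf '%s\n' 'null,"ABC:MNO"' '"hjgy","ABC:PQR"' '"mn","qwe","ABC:WER"' | pad_fields
```

Calling pad_fields 5 would pad to five fields instead.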

【Comments】:

    【Solution 2】:

    With your shown samples only, please try the following.

    awk '
    BEGIN{
      FS=OFS=","
    }
    {
      sub(/^null/,"\"\",\"\",\"\"")
    }
    NF==2{
      $1=$1",\"\",\"\""
    }
    NF==3{
      $2=$2",\"\""
    }
    1' Input_file
    

    Alternatively, with "" stored in a variable, you could also try the following:

    awk -v s1="\"\"" '
    BEGIN{
      FS=OFS=","
    }
    {
      sub(/^null/,s1 "," s1","s1)
    }
    NF==2{
      $1=$1"," s1 "," s1
    }
    NF==3{
      $2=$2"," s1
    }
    1'  Input_file
    

    Explanation: adding a detailed explanation of the above.

    awk '                  ##Starting awk program from here.
    BEGIN{                 ##Starting BEGIN section of this program from here.
      FS=OFS=","           ##Setting FS and OFS to comma here.
    }
    {
      sub(/^null/,"\"\",\"\",\"\"")  ##Substituting null at the start of the line with "","","" in current line.
    }
    NF==2{                 ##If number of fields are 2 then do following.
      $1=$1",\"\",\"\""    ##Adding ,"","" after 1st field value here.
    }
    NF==3{                 ##If number of fields are 3 here then do following.
      $2=$2",\"\""         ##Adding ,"" after 2nd field value here.
    }
    1                      ##Printing current line here.
    ' Input_file           ##Mentioning Input_file name here.
    

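    【Comments】:

    • Null is not getting replaced .. null,"","","ABC:MNO"
    • @user2854333, in your shown samples there is a space before null; let me change it now.
    • @user2854333, please try it now; I have removed the space before null. Please check once and let me know how it goes, thanks.
    • Sorry, it may be a typo. There is no space before null .. removed the ^ and it works fine... you are a genius
    • @user2854333, not a problem, I have taken care of it; please check now, we should be good here.
    【Solution 3】:

    A solution using awk: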

    awk -F "," 'BEGIN{ OFS=FS }
        { gsub(/^ /,"",$1)
        if($1=="null") print "\x22\x22","\x22\x22","\x22\x22", $2
        else if(NF==2) print $1,"\x22\x22","\x22\x22",$2
        else if(NF==3) print $1,$2,"\x22\x22",$3
        else print $0 }' input
    

    【Comments】:

      【Solution 4】:

      This might work for you (GNU sed):

      sed 's/^\s*null,/"",/;:a;ta;s/,/&/3;t;s/.*,/&"",/;ta' file
      

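      If the line begins with null, replace that field with an empty field, i.e. "",

      Reset the substitution success flag by using ta to go back to :a (this is only the case when the first field was null and has been substituted).

      If a third field separator exists, all done.

      Otherwise, insert an empty field after the last field separator (that is, before the last field) and repeat.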
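
The loop described above can be traced with plain POSIX shell parameter expansion. This is a sketch only: the function name pad_line is hypothetical, and unlike the sed command it assumes there is no whitespace before null.

```shell
# pad_line (hypothetical helper): shell rendering of the sed loop.
pad_line() {
  line=$1
  # Step 1: a leading null, becomes a single empty field.
  case $line in
    null,*) line='"",'${line#null,} ;;
  esac
  while :; do
    case $line in
      *,*,*,*) break ;;                         # third separator exists: done
      *,*) line=${line%,*}',"",'${line##*,} ;;  # add "", after the last comma
      *) break ;;                               # no comma: nothing to pad
    esac
  done
  printf '%s\n' "$line"
}

pad_line 'null,"ABC:MNO"'
pad_line '"hjgy","ABC:PQR"'
```

Each pass through the loop adds exactly one empty field, so a two-field line goes around twice before the three-separator pattern matches.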

      【Comments】:
