【Title】: awk: Preserve original field separators when using multiple separators
【Posted】: 2017-12-02 03:57:49
【Question】:

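I am trying to renumber the line_id field in myfile1.txt, where every line contains multiple separators. The end goal is to turn this data into a list of Python dictionaries, one dictionary per line, so the ":" and "," separators really matter to me.

Here is a snippet from myfile.txt: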

"line_id":57,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Entered the room"
"line_id":58,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Left the room"
"line_id":59,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Quit the group"
"line_id":60,"name":"Test File","seq_number":1,"user":"user2","text_entry":"Late to the party"
"line_id":61,"name":"Test File","seq_number":1,"user":"user2","text_entry":"Not responding"

The following awk statement works well, except that I lose all the separators: they are replaced by spaces.

awk -F [:,] '$2=$2-56' myfile1.json >> myfile2.txt

The result is:

"line_id" 1 "name" "Test File" "seq_number":26 "user" "user1" "text_entry" "Entered the room"
"line_id" 2 "name" "Test File" "seq_number":26 "user" "user1" "text_entry" "Left the room"
"line_id" 3 "name" "Test File" "seq_number":26 "user" "user1" "text_entry" "Quit the group"
"line_id" 4 "name" "Test File" "seq_number":1 "user" "user2" "text_entry" "Late to the party"
"line_id" 5 "name" "Test File" "seq_number":1 "user" "user2" "text_entry" "Not responding"

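Now I am left with the problem of getting the : and , back in the right places. I explored sed but could not find a simple way to do the subtraction on the second field.

I have gone through this link, but it did not help much with my requirement. Please advise.

【Comments】:

    Tags: python bash awk sed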


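    【Solution 1】:
    1. Use the comma as both the input and the output field separator.
    2. Use awk's split function to split the first field on the colon.
    3. Rebuild $1 after subtracting 56 from the second element of the split array.

    Code: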

    awk 'BEGIN{FS=OFS=","} {split($1, a, /:/); $1 = a[1] ":" a[2] - 56} 1' file
    
    "line_id":1,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Entered the room"
    "line_id":2,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Left the room"
    "line_id":3,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Quit the group"
    "line_id":4,"name":"Test File","seq_number":1,"user":"user2","text_entry":"Late to the party"
    "line_id":5,"name":"Test File","seq_number":1,"user":"user2","text_entry":"Not responding"
    
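A minimal sketch of why the original command lost its separators, assuming a POSIX awk: assigning to any field makes awk rebuild the record by joining the fields with OFS, which defaults to a single space. The shortened sample line and the variable names lost and kept are illustrative only and do not come from the original post.

```shell
# Shortened sample record (illustrative, not the full line from the question).
line='"line_id":57,"name":"Test File","seq_number":26'

# Splitting on both ":" and "," and then assigning to $2 rebuilds $0
# with the default OFS (one space), so every separator is replaced.
lost=$(printf '%s\n' "$line" | awk -F '[:,]' '{ $2 = $2 - 56; print }')

# Splitting only on "," with OFS="," restores the commas on rebuild;
# the ":" inside $1 is put back by hand after split().
kept=$(printf '%s\n' "$line" | awk 'BEGIN { FS = OFS = "," } { split($1, a, /:/); $1 = a[1] ":" (a[2] - 56) } 1')

printf '%s\n%s\n' "$lost" "$kept"
```

Because FS and OFS are both ",", a comma that occurs inside a quoted value is split and rejoined unchanged. Note also that the original '$2=$2-56' uses the assignment itself as the pattern, so a line whose result is 0 would not be printed.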

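    【Discussion】:

    • Thanks a lot, this worked well for me. To check my understanding: we used a variable a, and the "1" at the end denotes the first field?
    • The 1 at the end just prints the complete record. It is a pattern that is always true, and a pattern with no action runs awk's default action, which is to print the record.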
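The role of the trailing 1 can be checked with a small sketch. The two-line input and the variable names below are made up for illustration; the behaviour shown is standard awk, where a pattern without an action prints the record when the pattern is true.

```shell
# "1" is a pattern that is always true; with no action, awk prints the record.
with_one=$(printf 'a,b\nc,d\n' | awk '1')

# The explicit equivalent: an action with no pattern.
with_print=$(printf 'a,b\nc,d\n' | awk '{ print }')

# "0" is always false, so nothing is printed.
with_zero=$(printf 'a,b\nc,d\n' | awk '0')

printf '%s\n' "$with_one"
```

The first two commands produce identical output, which is why the answer can end with 1 instead of writing { print } explicitly.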