【Title】: awk spool specific values to CSV file (load to oracle) without quotes
【Posted】: 2017-09-06 10:44:47
【Question】:

I am trying to extract specific values from a log file like the one below:

Table "OWNER123"."MYTABLE":

  3785568 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Bind array size not used in direct path.
Column array  rows :    5000
Stream buffer bytes:  256000
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:       3785568
Total logical records rejected:         0
Total logical records discarded:        0
Total stream buffers loaded by SQL*Loader main thread:      878
Total stream buffers loaded by SQL*Loader load thread:      796

Run began on Fri Sep 01 04:00:26 2017
Run ended on Fri Sep 01 04:04:45 2017

Elapsed time was:     00:04:19.24
CPU time was:         00:00:08.56

What I want is to spool the output to a CSV file in the following format (without quotes):

MYTABLE,3785568,Sep 01 04:00:26 2017, Sep 01 04:04:45 2017

How can this be extracted with a single awk command?

Any help would be greatly appreciated :)

Thanks in advance!!

【Comments】:

  • If you are looking for a way to parse that log file with AWK, you would do better to post on unix.stackexchange.com.

Tags: bash oracle csv unix awk


【Solution 1】:

One-liner

awk -v OFS=, '/^Table/{gsub(/.*\.|[":]/,""); table=$0;next}/Rows successfully loaded/{rows = $1;next}/Run began on/{ sub(/Run began on /,""); start = $0 ;next }/Run ended on/{sub(/Run ended on /,"");print table, rows, start, $0}' logfile

If you want to try this on a Solaris/SunOS system, change awk at the start of the command to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.

Script

[akshay@localhost tmp]$ cat parse.awk
/^Table/{
    gsub(/.*\.|[":]/,"");
    table=$0
    next
}
/Rows successfully loaded/{
        rows = $1
        next
}
/Run began on/{ 
        sub(/Run began on /,""); 
        start = $0 
        next 
}
/Run ended on/{
        sub(/Run ended on /,"");    
        print table, rows, start, $0
        exit
}

Execution and output

[akshay@localhost tmp]$ awk -v OFS=, -f parse.awk logfile 
MYTABLE,3785568,Fri Sep 01 04:00:26 2017,Fri Sep 01 04:04:45 2017

Input

[akshay@localhost tmp]$ cat logfile 
Table "OWNER123"."MYTABLE":

  3785568 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Bind array size not used in direct path.
Column array  rows :    5000
Stream buffer bytes:  256000
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:       3785568
Total logical records rejected:         0
Total logical records discarded:        0
Total stream buffers loaded by SQL*Loader main thread:      878
Total stream buffers loaded by SQL*Loader load thread:      796

Run began on Fri Sep 01 04:00:26 2017
Run ended on Fri Sep 01 04:04:45 2017

Elapsed time was:     00:04:19.24
CPU time was:         00:00:08.56
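Not part of the original answer: the one-liner above keeps the day abbreviation ("Fri"), while the question asked for dates without it. Below is a sketch of a variant that additionally strips the three-letter day name and spools the result into a CSV file, assuming the same logfile layout; the file names logfile and extract.csv are placeholders.

```shell
# Recreate a minimal sample of the SQL*Loader log shown above
# (the file name "logfile" is a placeholder).
cat > logfile <<'EOF'
Table "OWNER123"."MYTABLE":

  3785568 Rows successfully loaded.

Run began on Fri Sep 01 04:00:26 2017
Run ended on Fri Sep 01 04:04:45 2017
EOF

# Variant of the accepted one-liner: also removes the three-letter day
# name so the CSV matches the exact format requested in the question,
# and redirects (spools) the result into a CSV file.
awk -v OFS=, '
/^Table/                   { gsub(/.*\.|[":]/, ""); table = $0; next }
/Rows successfully loaded/ { rows = $1; next }
/Run began on/ { sub(/Run began on [A-Z][a-z][a-z] /, ""); start = $0; next }
/Run ended on/ { sub(/Run ended on [A-Z][a-z][a-z] /, "")
                 print table, rows, start, $0; exit }
' logfile > extract.csv

cat extract.csv
# MYTABLE,3785568,Sep 01 04:00:26 2017,Sep 01 04:04:45 2017
```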

【Comments】:

  • Hi Akshay! Thanks, nice one. But can this be achieved in a single-line awk command??
  • @tln_jupiter : Added a one-liner as requested
  • @tln_jupiter Use one of the xpg awks, not nawk. nawk is older and further from POSIX compliance than the xpg awks. It doesn't even make sense to bring up nawk in the context of the awks available on Solaris, since you have the xpg ones available and they are better.
  • @EdMorton Thanks for the useful info. Personally I don't have much experience with sun/solaris; during a past migration of the servers where we used it, I mostly preferred working in gawk.
  • You definitely want to get gawk if at all possible, but if you can't, then use an xpg awk. Forget nawk exists, and treat /bin/awk as if it were rabid-werewolf-excrement lava...
【Solution 2】:

Here you go:

awk 'BEGIN{ ORS = ","}/^Table/{ gsub(/"||:/,"",$2); split($2, a, "."); print a[2] }/Rows/{ if (++n==1){ print $1 } }/began/ || /ended/{ print $5, $6, $7, $8 }' yourfile.txt | sed 's/,$//'

Output:

MYTABLE,3785568,Sep 01 04:00:26 2017,Sep 01 04:04:45 2017

I think you now have some ideas about how to modify these awk commands.
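To make the mechanism explicit (not part of the original answer): ORS="," makes every print end with a comma instead of a newline, and the trailing sed trims the last comma. A sketch of the same approach with a tightened character class (/["":]/ instead of /"||:/, which contains an empty alternative) and a regex constant /\./ as the split separator; the logfile name is a placeholder.

```shell
# Minimal sample of the SQL*Loader log (the file name is a placeholder).
cat > logfile <<'EOF'
Table "OWNER123"."MYTABLE":
  3785568 Rows successfully loaded.
  0 Rows not loaded due to data errors.
Run began on Fri Sep 01 04:00:26 2017
Run ended on Fri Sep 01 04:04:45 2017
EOF

# ORS="," appends a comma after every print; the final sed removes the
# trailing comma. Only the first /Rows/ line (the "successfully loaded"
# count) is printed, guarded by the ++n == 1 check.
out=$(awk 'BEGIN { ORS = "," }
/^Table/ { gsub(/["":]/, "", $2); split($2, a, /\./); print a[2] }
/Rows/   { if (++n == 1) print $1 }
/began/ || /ended/ { print $5, $6, $7, $8 }
' logfile | sed 's/,$//')

echo "$out"
# MYTABLE,3785568,Sep 01 04:00:26 2017,Sep 01 04:04:45 2017
```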

【Comments】:

  • Thanks a lot. It isn't working as expected, though. For example I execute: awk 'BEGIN{ ORS = ","}/^Table/{ gsub(/"||:/,"",$2); split($2, a, "."); print a[2] }/Rows/{ if (++n==1){ print $1 } }/began/ || /ended/{ print $5, $6, $7, $8 }' myfile.log | sed 's/,$//' > extract_INVOICE.csv and the output in the .csv is: MYTABLE,,MYTABLE:,3785568,Sep 01 05:30:21 2017,Sep 01 05:45:26 2017 Can you help?
  • Hmm, with your given input it works for me. Strange that it is printing Mytable,,Mytable: — is the input you are using there any different?
  • Hmm, very strange. Could you check the reply below and spot the difference? :)
  • The command from the reply worked for me. But I guess you already got your answer :)