【Posted】: 2011-08-18 00:13:06
【Question】:
"a004-1b","North","at006754"
"a004-1c","south","atytgh0"
"a004-1d","east","atrthh"
"a010-1a","midwest","atyu"
"a010-1b","south","rfg67"
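I want to print the first and second columns without any extra characters: that is, strip all the double quotes and drop the third column. Thanks in advance.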
【Question discussion】:
awk -F'^"|","|"$' '{print $2,$3}' ./infile.csv
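The script above even handles fields that contain embedded double quotes or commas. The only drawback (if you can call it that) is that the first field starts at $2, because the opening quote acts as a separator and leaves $1 empty.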
$ awk -F'^"|","|"$' '{print $2,$3}' ./infile.csv
a004-1b North
a004-1c south
a010-1a midwest
a010-1b south
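To illustrate the claim about embedded commas, here is a minimal sketch. The helper name `first_two_columns` and the sample row are made up for illustration, and it assumes an awk (such as gawk) that treats `^` in the field separator as the start of the record.

```shell
# Hypothetical wrapper around the command from this answer.
first_two_columns() {
    # Fields are delimited by the opening quote, the "," between fields,
    # and the closing quote, so $1 is always empty and the data starts at $2.
    awk -F'^"|","|"$' '{print $2, $3}' "$@"
}

# Made-up row whose second field contains a comma.
printf '%s\n' '"a004-1b","North, upper","at006754"' | first_two_columns
```

With such an awk this should print `a004-1b North, upper`: the comma inside the second field is not surrounded by quotes, so it is not treated as a separator.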
【Discussion】:
Both the awk and the sed usage are covered here =)
You need GNU Awk 4 for this to work:
$ gawk -vFPAT='[^",]+' '{print $1,$2}'
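I love this new "field pattern" feature. It is my new hammer, and everything is a nail. Read about it at http://www.gnu.org/software/gawk/manual/html_node/Splitting-By-Content.html
(Written this way, it does not account for embedded commas or quotes, because the question implies that this is not required.)
【Discussion】:
If you are using awk for this, why did you put the Perl tag on it?
In Perl: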
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
# Make Data::Dumper pretty
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent = 1;
# Set maximum depth for Data::Dumper, zero means unlimited
local $Data::Dumper::Maxdepth = 0;
use Text::CSV;
my $csv = Text::CSV->new();
while( my $row = $csv->getline( \*DATA )){
print 'row: ', Dumper $row;
}
__DATA__
"a004-1b","North","at006754"
"a004-1c","south","atytgh0"
"a004-1d","east","atrthh"
"a010-1a","midwest","atyu"
"a010-1b","south","rfg67"
【Discussion】:
awk -F'\"|\,' '{print $2,$5}' sample
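【Discussion】:
awk -F'"|,' is enough. But like some of the other answers, it does not work for fields with embedded quotes or commas. Use awk -F'^"|","|"$' instead to handle all the corner cases.
Does not handle embedded double quotes: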
sed -e 's/^"\([^"]*\)","\([^"]*\)".*/\1 \2/'
Handles them:
sed -n -e 's/^"//;s/"$//;s/","/ /;s/","/\n/;P'
The command above works even for input with only 1 or 2 fields.
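A quick check of the 1- and 2-field claim, as a sketch. The function name `first_two_sed` and the sample rows are invented here, and the `\n` in the replacement is assumed to be supported by the sed in use (GNU sed supports it).

```shell
# Hypothetical wrapper around the second sed command from this answer.
first_two_sed() {
    # Strip the outer quotes, turn the first "," into a space, turn the
    # second "," (if any) into a newline, then print only the first line
    # of the pattern space.
    sed -n -e 's/^"//;s/"$//;s/","/ /;s/","/\n/;P' "$@"
}

# Made-up rows with one, two, and three fields.
printf '%s\n' '"a004-1b"' '"a004-1b","North"' '"a004-1b","North","at006754"' | first_two_sed
```

Under that assumption the three rows come out as `a004-1b`, `a004-1b North`, and `a004-1b North`.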
【Discussion】:
If you want it in "pure" awk or sed, this does not qualify, but otherwise it works:
awk -F, '{print $1 " " $2}' | tr -d '"'
【Discussion】:
cut -d "," -f 1,2 --output-delimiter=" "| tr -d '"' does the same thing, so awk is not needed here.