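【Posted】: 2019-05-17 21:27:51
【Question】:
I am trying to clean up some CSV files that contain unescaped characters.
I have no Perl experience, but by piecing together a few lines from the Text::CSV_XS examples I managed to get a working script. The one thing it does not handle is unescaped newlines.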
https://gist.github.com/samvdb/761d12cb6e0275105a689ce25765496d
#!/usr/bin/perl
# This script can be used as a base to parse unreliable CSV streams
# Modify to your own needs
#
# (m)'08 [23 Apr 2008] Copyright H.M.Brand 2008-2018
use strict;
use warnings;
sub usage {
    my $err = shift and select STDERR;
    print <<"EOH";
usage: $0 [-o file] [-s S] [file]
-o F --out=F output to file F (default STDOUT)
-s S --sep=S set input separator to S (default ; , TAB or |)
EOH
    exit $err;
} # usage
use Getopt::Long qw(:config bundling);
GetOptions (
    "help|?"  => sub { usage (0); },
    "s|sep=s" => \my $in_sep,
    "o|out=s" => \my $opt_o,
) or usage (1);
use Text::CSV_XS qw( csv );
my $io = shift || \*DATA;
my $eol = "\n";
binmode STDOUT, ":encoding(utf-8)";
my @hdr;
my @opt_i = (
    in                  => $io,
    binary              => 1,
    blank_is_undef      => 1,
    allow_loose_quotes  => 1,
    allow_loose_escapes => 1,
    sep                 => ";",
    encoding            => "utf16le",
);
my @opt_o = (
    out          => \*STDOUT,
    eol          => $eol,
    sep          => ",",
    quo          => '"',
    always_quote => 1,
);
push @opt_i,
    bom          => 1,
    sep_set      => [ $in_sep ],
    keep_headers => \@hdr;
push @opt_o,
    headers => \@hdr;
csv (in => csv (@opt_i), @opt_o);
__END__
a;b;c;d;e;f
"test"and also newline\nhere or something";2;3;4;5;6
"this happens also! "\n here or something";2;3;4;5;6
2;3;4;5;6;7
3;4;5;6;7;8
4;5;6;7;8;9
Sample input:
a;b;c;d;e;f
"test"and also newline\nhere or something";2;3;4;5;6
"this happens also! "\n here or something";2;3;4;5;6
2;3;4;5;6;7
3;4;5;6;7;8
4;5;6;7;8;9
Expected result for the first two data lines:
"test""and also newline<br/>here or something";2;3;4;5;6
"this happens also! ""<br/> here or something";2;3;4;5;6
Can someone help me fix this Perl script so that \n is replaced with <br/>?
Thanks.
【Comments】:
-
Could you please edit your post and add the (relevant) code here? Linking to off-site code does not help much in getting good answers.
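-
For reference, here is a minimal sketch of the replacement step on its own. It assumes the offending line breaks show up inside a field either as real CR/LF characters or as the literal two-character sequence `\n` (as in the sample data). The helper name `newlines_to_br` is hypothetical and not part of Text::CSV_XS. The sketch does not deal with the unescaped quotes, only with turning line breaks inside an already extracted field into `<br/>`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from Text::CSV_XS): replace line breaks inside
# a single field value with <br/>. Handles CRLF, lone CR, lone LF, and the
# literal two-character sequence backslash + n.
sub newlines_to_br {
    my ($field) = @_;
    return $field unless defined $field;   # leave undef (blank) fields alone
    $field =~ s{\r\n|\r|\n|\\n}{<br/>}g;
    return $field;
}

# Field content taken from the first sample line in the question
print newlines_to_br(qq{test"and also newline\\nhere or something}), "\n";
# prints: test"and also newline<br/>here or something
```

The helper could then be applied to every defined field of each parsed row before the row is written out, for example from an `on_in` callback passed to `csv ()`. I have not tested that with the script above, and whether the parser hands over the broken field in one piece still depends on how it copes with the loose quotes.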