【Question Title】:Python or Perl: parse a text report file to CSV [closed]
【Posted】:2013-10-28 00:18:16
【Question】:

I have some data that I need to parse into a tab-delimited text file. The data looks like this:

>beer/name: Sausa Weizen beer/beerId: 47986 beer/brewerId: 10325 beer/ABV: 5.00 beer/style: Hefeweizen review/appearance: 2.5
> review/aroma: 2 review/palate: 1.5 review/taste: 1.5 review/overall:
> 1.5 review/time: 1234817823 review/profileName: stcules review/text: A lot of foam. But a lot.    In the smell some banana, and then lactic and
> tart. Not a good start.   Quite dark orange in color, with a lively
> carbonation (now visible, under the foam).    Again tending to lactic
> sourness. Same for the taste. With some yeast and banana.     
> 
> beer/name: Red Moon ...repeats millions of times...

I need it to look like this:

Sausa Weizen {tab} 47986 {tab} 10325 {tab} ...

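Does anyone have some sample Perl code I could use to get started? I am new to Perl. I have modified a few other examples I found online, but could not get them to work for my case.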

I have tried regular expressions in Vim, as well as the following Perl:

#!/usr/bin/perl
#parse_file_kv.pl
use strict;
use warnings;
my $save_input_record_separator = $/; #Save original value before changing it
undef $/; # enable slurp mode
open(my $file ,"ratebeer.txt");
$/ = $save_input_record_separator; #Restore original value to this global variable
my %h = $file =~ m/\w+/g;#Read keys and values from file into hash %h
for (keys %h){
    print "KeyWord $_ has value $h{$_}.\n";
}
print "\n";
my @kws2find = qw(beer/name);
foreach ( @kws2find ){
    find_value($_);
}
sub find_value{
    my $kw = shift @_;
    if (exists $h{$kw}){
        print "Value of $kw is $h{$kw}\n";
    }else{
        print "Keyword $kw is not found in hash\n";
    }
}

【Comments】:

  • Can you show what you have tried?

Tags: python perl parsing


【Solution 1】:

There are many ways to do this in Perl, but here is the simplest:

use strict;
use warnings;

# A sample input line. In reality you would read it from a file.
my $foo = "beer/name: Sausa Weizen beer/beerId: 47986 ...\n";
chomp $foo;    # strip the trailing newline

# Replace every "foo/bar:" key, together with the whitespace around it, with a tab.
# A-Za-z is used as the character class here; there are more elegant ways of
# specifying whole character classes (\w, for example).
$foo =~ s/\s*[A-Za-z]+\/[A-Za-z]+:\s*/\t/g;

# The very first key leaves a leading tab behind; drop it.
$foo =~ s/^\t//;

# Print it out.
print "$foo\n";
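
Since the question also mentions Python, here is a rough sketch of the same idea there, extended to records that span several lines. It rests on assumptions taken from the sample data rather than on anything confirmed about the real file: records are separated by a blank line, every key starts with beer/ or review/, and every record carries the same fields in the same order. The names KEY_RE, parse_record, iter_records and to_tsv are made up for this illustration.

```python
import csv
import io
import re

# Assumption: every key looks like "beer/xxx:" or "review/xxx:".
KEY_RE = re.compile(r"\b((?:beer|review)/\w+):")


def parse_record(text):
    """Split one record into a {key: value} dict, in order of appearance."""
    # With one capturing group, re.split returns
    # [prefix, key1, value1, key2, value2, ...].
    parts = KEY_RE.split(text)
    keys = parts[1::2]
    # Collapse runs of whitespace (including line breaks) inside each value.
    values = (" ".join(v.split()) for v in parts[2::2])
    return dict(zip(keys, values))


def iter_records(lines):
    """Yield one parsed record at a time; a blank line ends a record."""
    chunk = []
    for line in lines:
        if line.strip():
            chunk.append(line)
        elif chunk:
            yield parse_record(" ".join(chunk))
            chunk = []
    if chunk:  # last record when the input does not end with a blank line
        yield parse_record(" ".join(chunk))


def to_tsv(lines, out):
    """Write one tab-separated row per record to the file-like object out."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for record in iter_records(lines):
        writer.writerow(record.values())


# Small demonstration on an in-memory sample shaped like the question's data.
SAMPLE = (
    "beer/name: Sausa Weizen beer/beerId: 47986 beer/ABV: 5.00\n"
    "review/overall:\n"
    "1.5 review/text: A lot of foam.    But a lot.\n"
    "\n"
    "beer/name: Red Moon beer/beerId: 1\n"
)
buffer = io.StringIO()
to_tsv(SAMPLE.splitlines(), buffer)
print(buffer.getvalue(), end="")
```

For the real file, something like to_tsv(open("ratebeer.txt"), open("ratebeer.tsv", "w", newline="")) would stream it record by record instead of loading millions of reviews into memory. If some records lack a field, the columns will shift; in that case it would be safer to write the values for a fixed list of keys using record.get(key, "").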

【Comments】:
