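【Question title】: Read specific column in Perl
【Posted】: 2017-09-23 12:06:20
【Question】:

I am new to Perl. I have the text file below, and from it I want only one Time column, followed by the Value columns. How can I create a text file with that output in Perl?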

Time  Value    Time    Value    Time    Value
1   0.353366497 1   0.822193251 1   0.780866396
2   0.168834182 2   0.865650713 2   0.42429447
3   0.323540698 3   0.865984245 3   0.856875894
4   0.721728497 4   0.634773162 4   0.563059042
5   0.545131335 5   0.029808531 5   0.645993399
6   0.143720835 6   0.949973296 6   0.14425803
7   0.414601876 7   0.53421424  7   0.826148814
8   0.194818367 8   0.942334356 8   0.837107013
9   0.291448263 9   0.242588271 9   0.939609775
10  0.500159997 10  0.428897293 10  0.41946448 

I tried the code below:

use strict;
use warnings;
use IO::File;
my $result;
my @files = (q[1.txt],q[2.txt],q[3.txt]);
my @fhs = ();
foreach my $file (@files) { 
    my $fh = new IO::File $file, O_RDONLY;
    push @fhs, $fh if defined $fh;
} 

while(1) { 
    my @lines = map { $_->getline } @fhs;
    last if grep { not defined $_ } @lines[0..(@fhs-1)];
    my @result=join(qq[\t], map { s/[\r?\n]+/ /g; $_ } @lines ) . qq[\r\n];
    open (MYFILE, '>>Result.txt');
    print (MYFILE "@result");
    close (MYFILE);
}

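【Comments】:

  • I don't see any code. You should update your question. In general you should provide the input data, the expected output, and at least the code you tried.
  • I was able to merge the three text files into one. Now I have a text file with 6 columns, and I want to show only the first four of them. How do I do that?
  • Again, that is hard to say without seeing your code and knowing what you have. In any case, you might want to look at split().
  • I tried the code below: use strict; use warnings; use IO::File; my$result; my @files = (q[1.txt],q[2.txt],q[3.txt]); my @fhs = (); foreach my $file (@files) { my $fh = new IO::File $file, O_RDONLY; push @fhs, $fh if defined $fh; } while(1) { my @lines = map { $_->getline } @fhs; last if grep { not defined $_ } @lines[0..(@fhs-1)]; my@result=join(qq[\t], map { s/[\r?\n]+/ /g; $_ } @lines) . qq[\r\n]; open (MYFILE, '>>Result.txt'); print (MYFILE "@result"); close (MYFILE); }
  • @James: (@fhs-1) can be written as $#fhs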
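
As a small illustration of the last comment: `$#array` is the index of the last element, which equals the element count minus one, so both slices below select the same elements. The array contents are placeholders standing in for the asker's file handles.

```perl
use strict;
use warnings;

# Placeholder values standing in for the @fhs array of file handles
my @fhs = ('fh1', 'fh2', 'fh3');

# In numeric context @fhs is the element count; $#fhs is the last index
my $last_by_count = @fhs - 1;
my $last_by_index = $#fhs;

# Both slices cover every element of the array
my @slice_a = @fhs[ 0 .. (@fhs - 1) ];
my @slice_b = @fhs[ 0 .. $#fhs ];

print "last index: $last_by_count and $last_by_index\n";
print "slices match\n" if "@slice_a" eq "@slice_b";
```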

Tags: perl


【Solution 1】:

I would go with split:

use warnings;
use strict;

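open (my $f, '<', 'your-file.dat') or die "Cannot open your-file.dat: $!";

while (my $line = <$f>) {
  # split ' ' splits on any run of whitespace and ignores leading whitespace
  my @elems = split ' ', $line;
  # keep the first Time column (0) and the three Value columns (1, 3, 5)
  print join "\t", @elems[0,1,3,5];
  print "\n";
}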

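【Discussion】:

  • If you want the output formatted into tidy columns, so that the headers line up with the data below them, replace the two print lines in the while loop with printf("%4s %-15s %-15s %-15s\n", @elems[0,1,3,5]);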
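
Putting the formatting suggestion together with the split approach might look like the sketch below. To stay runnable without an input file, it reads the first rows of the question's sample from a string; `format_row` is a hypothetical helper name, not part of the original answer.

```perl
use strict;
use warnings;

# First rows of the sample from the question, held in a string so the
# sketch runs without any input file
my $sample = <<'END';
Time  Value    Time    Value    Time    Value
1   0.353366497 1   0.822193251 1   0.780866396
2   0.168834182 2   0.865650713 2   0.42429447
END

# Hypothetical helper: keep the first Time column and the three Value
# columns, padded to fixed widths so headers and data line up
sub format_row {
    my ($line) = @_;
    my @elems = split ' ', $line;
    return sprintf( "%4s %-15s %-15s %-15s", @elems[ 0, 1, 3, 5 ] );
}

# Open the string as an in-memory file and process it line by line
open( my $in, '<', \$sample ) or die "Cannot read sample: $!";
while ( my $line = <$in> ) {
    print format_row($line), "\n";
}
close $in;
```

For a real file, replace the in-memory open with an open on the file name, as in the answer above.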
【Solution 2】:

This is a one-liner; there is no need to write a script:

$ perl -lanE '$,="\t"; say @F[0,1,3,5]' 1.txt 2.txt 3.txt

If you like, you can shorten it to:

$ perl -lanE '$,="\t"; say @F[0,1,3,5]' [123].txt
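
For readers unfamiliar with the switches, here is a rough script equivalent of the one-liner, as a sketch: `-n` wraps the code in a loop over the input lines, `-l` strips the line ending on input and adds one on output, `-a` splits each line on whitespace into `@F`, and `$,` is the separator printed between list elements. The helper name `select_columns` and the in-memory sample are illustrative assumptions; the one-liner itself reads the files named on the command line.

```perl
use strict;
use warnings;

# Hypothetical helper doing what the one-liner does for a single line
sub select_columns {
    my ($line) = @_;
    chomp $line;                          # -l: strip the line ending on input
    my @F = split ' ', $line;             # -a: autosplit on whitespace into @F
    return join "\t", @F[ 0, 1, 3, 5 ];   # $, = "\t": tab between printed fields
}

# Header and first data row of the question's sample, read from a string
# here; with -n the loop would run over the files given on the command line
my $sample = "Time  Value    Time    Value    Time    Value\n"
           . "1   0.353366497 1   0.822193251 1   0.780866396\n";

open( my $in, '<', \$sample ) or die "Cannot read sample: $!";
while ( my $line = <$in> ) {
    print select_columns($line), "\n";    # say (with -l) ends each record with a newline
}
close $in;
```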

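【Discussion】:

  • Thanks @William Pursell
【Solution 3】:

Right now you are just joining the lines of the files together. If that does not give you the output you want, you need to drop some columns.

Since your output looks as if the input files are tab-delimited, I split each input line on tabs. And since you only want the second column of each file, I take only the column at offset 1 of the split.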

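my $line_num = 0;
while(1) {
    my @lines = map { $_->getline } @fhs;
    last if grep { not defined $_ } @lines[0..$#fhs];
    $line_num++;
    s/\r?\n\z// for @lines;    # strip line endings so the last field is clean
    my @rows     = map { [ split /\t/ ] } @lines;
    my $time_val = $rows[0][0];
    # compare as strings so the "Time" header row does not trigger numeric warnings
    die "Time values are not all equal on line #$line_num!"
        if grep { $time_val ne $_->[0] } @rows;
    # qq (not q) so that \t is interpolated as a real tab
    my $result = join( qq[\t], $time_val, map { $_->[1] } @rows );
    open (my $out, '>>', 'Result.txt') or die "Cannot open Result.txt: $!";
    print {$out} "$result\n";
    close ($out);
}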

Of course, there is no reason to hand-code the splitting of delimited columns:

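use Text::CSV;
...
my $csv  = Text::CSV->new( { sep_char => "\t" } );

while(1) {
    my @rows = map { $csv->getline( $_ ) } @fhs;
    last if grep { not defined $_ } @rows[0..$#fhs];
    $line_num++;    # assumes $line_num was declared earlier, as in the previous snippet
    my ( $time_val, @time_vals ) = map { $_->[0] } @rows;
    my @values = map { $_->[1] } @rows;
    # compare as strings so the "Time" header row does not trigger numeric warnings
    die "Time values are not all equal on line #$line_num!"
        if grep { $time_val ne $_ } @time_vals;
    # qq (not q) so that \t is interpolated as a real tab
    my $result = join( qq[\t], $time_val, @values );
    ...
}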

【Discussion】:

【Solution 4】:
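    use strict;
    use warnings;

    # Read the whole input file and echo it
    open( my $in, '<', 'a.txt' ) or die "Cannot open a.txt: $!";
    my @temp = <$in>;
    close $in;

    print "=========== A File content =========== \n";
    print @temp, "\n";

    # Flatten every whitespace-separated token into one list
    my ( @entries, @final );
    foreach my $line (@temp) {
        push @entries, split( ' ', $line );
    }

    # Tokens 0..5 are the header; after that the tokens alternate
    # Time, Value, so every second token from index 7 on is a value
    for ( my $i = 7; $i <= $#entries; $i += 2 ) {
        push @final, $entries[$i];
    }

    # Write the values as a single column, with a running counter as Time
    open( my $out, '>', 'b.txt' ) or die "Cannot open b.txt: $!";
    print {$out} "Time \t Value\n";
    for my $i ( 0 .. $#final ) {
        my $j = $i + 1;
        print {$out} "$j \t $final[$i]\n";
    }
    close $out;

    # Echo the result file
    print "============ B file content ===============\n";
    open( my $check, '<', 'b.txt' ) or die "Cannot open b.txt: $!";
    print <$check>;
    close $check;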
    
    
    Output:
    =========== A File content ===========  
    Time  Value    Time    Value    Time    Value    
    1   0.353366497 1   0.822193251 1   0.780866396   
    2   0.168834182 2   0.865650713 2   0.42429447  
    3   0.323540698 3   0.865984245 3   0.856875894  
    4   0.721728497 4   0.634773162 4   0.563059042  
    5   0.545131335 5   0.029808531 5   0.645993399  
    6   0.143720835 6   0.949973296 6   0.14425803  
    7   0.414601876 7   0.53421424  7   0.826148814  
    8   0.194818367 8   0.942334356 8   0.837107013  
    9   0.291448263 9   0.242588271 9   0.939609775  
    10  0.500159997 10  0.428897293 10  0.41946448  
    
    ============ B file content ===============  
    Time     Value  
    1        0.353366497  
    2        0.822193251  
    3        0.780866396  
    4        0.168834182  
    5        0.865650713  
    6        0.42429447  
    7        0.323540698  
    8        0.865984245  
    9        0.856875894  
    10       0.721728497  
    11       0.634773162  
    12       0.563059042  
    13       0.545131335  
    14       0.029808531  
    15       0.645993399  
    16       0.143720835  
    17       0.949973296  
    18       0.14425803  
    19       0.414601876   
    20       0.53421424  
    21       0.826148814  
    22       0.194818367  
    23       0.942334356  
    24       0.837107013  
    25       0.291448263  
    26       0.242588271  
    27       0.939609775  
    28       0.500159997  
    29       0.428897293  
    30       0.41946448  
    

【Discussion】:
