Perl, extract specific columns (answers)

【Question title】: Perl, extract specific columns
【Posted】: 2016-04-30 07:37:47
【Question】:

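Please help.

I have two files (file1 and file2). I want to extract from file2 the columns whose IDs are listed in file1. The files are large, with thousands of columns and thousands of rows.

file1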

Id123B
Id124A
Id125A

file2

Code  sex  id123B  id127  id125A

Desired output file:

code sex id123B  id125A

Below is the code I tried, but it failed.

#!/usr/bin/perl
use strict;
use warnings;

open my $IN, "file2" or die $!;

my $header = <$IN>;

my %sampleID = map { /(.*?)\t/; $1 => 1 } <$IN>;

close($IN);

open $IN, "file1" or die $!;
$header = <$IN>;
my @samples = split /\t/, $header;
my @cols = grep { exists $sampleID{$samples[$_]} } 0..$#samples;


while(<$IN>){
    chomp;
    my @line = (split /\t/)[@cols]; 

    print join( "\t", @line ), "\n";
}

【Question comments】:

Tags: perl shell

【Solution 1】:

Use a hash to map column names to column indices.

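#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

# First argument: file listing the wanted column names, one per line.
open my $COLUMNS, '<', shift or die $!;
chomp( my @columns = <$COLUMNS> );

# Second argument: tab-separated data file whose first line is the header.
open my $DATA, '<', shift or die $!;
chomp( my @header = split /\t/, <$DATA> );  # chomp so the last name has no newline
my %column_index;
@column_index{ @header } = 0 .. $#header;   # column name => column index

# Keep only the requested names that actually occur in the header.
@columns = grep exists $column_index{$_}, @columns;

say join "\t", @columns;                    # print the selected header names
while (<$DATA>) {
    chomp( my @cells = split /\t/ );
    say join "\t", @cells[ @column_index{ @columns } ];
}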

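Run it as script.pl file1 file2. Note that file1 must contain the exact column names as they appear in the header of file2 (the match is case-sensitive), i.e. I got better results with the following file1: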

Code
sex
id123B
id124A
id125A
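
If the names in file1 differ from the header only in letter case (Id123B versus id123B in the question) or carry invisible whitespace such as a trailing carriage return, an exact lookup will silently match nothing. The following is a minimal sketch of a more tolerant match, assuming tab-separated data and assuming that case-insensitive matching is acceptable for your IDs. The helper names normalize and extract_columns are hypothetical, and the single data row in the demo is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Normalize a column name: trim surrounding whitespace (this also removes a
# trailing "\r" left by Windows line endings) and lowercase it.
sub normalize {
    my ($name) = @_;
    $name =~ s/^\s+|\s+$//g;
    return lc $name;
}

# Hypothetical helper: copy from $in to $out only the tab-separated columns
# whose header name, after normalization, appears in @$wanted.
# The first line read from $in is treated as the header and is printed too.
# Returns the number of columns kept.
sub extract_columns {
    my ($wanted, $in, $out) = @_;
    my %want = map { normalize($_) => 1 } grep { /\S/ } @$wanted;
    my $cols;
    while (my $line = <$in>) {
        $line =~ s/\r?\n\z//;
        my @cells = split /\t/, $line, -1;
        # On the first line only, record which column indices to keep.
        $cols //= [ grep { $want{ normalize($cells[$_]) } } 0 .. $#cells ];
        print {$out} join("\t", @cells[@$cols]), "\n";
    }
    return $cols ? scalar @$cols : 0;
}

# Demo with in-memory data modelled on the question's sample.
# The names are capitalized differently from the header on purpose.
my @wanted = ("Code\n", "sex\n", "Id123B\n", "Id124A\n", "Id125A\n");
my $data   = "Code\tsex\tid123B\tid127\tid125A\n"
           . "c1\tF\t0\t1\t2\n";      # made-up data row
open my $in, '<', \$data or die $!;
my $kept = extract_columns(\@wanted, $in, \*STDOUT);
warn "no matching columns; check the names in file1\n" unless $kept;
```

To use it on real files, read file1 into @wanted and pass a filehandle opened on file2 instead of the in-memory string. Columns come out in the order they appear in file2, not in the order listed in file1.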

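【Comments】:

Thank you very much for your help. I just ran it and it printed no output. Maybe it is the data format?
@El.h It works for me. Check the column names.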