How to read a variable number of lines from two txt files in Perl and write them to another txt file: answers

【Question title】: how to read variable lines from two txt files in perl and write them in to another txt file
【Posted】: 2013-08-11 20:22:19
【Question description】:

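I want to automate the Perl code below so that it reads a variable number of lines from two txt files and writes those lines to another txt file.

I have two txt files, 1st_file.txt and 2nd_file.txt, and I want to read a variable number of lines from them. How can I do that? The Perl code given below does the job, but whenever I change the txt files I also have to change the Perl code, which is not very efficient.

So can anyone guide me on how to write efficient Perl code for the problem below, so that if I change the data in my txt files I still get the desired result without changing the Perl code?

What do I mean by a change? Suppose I delete lines 5 to 7 from 1st_file.txt (207, A_207_P2_M2A, T_207_P2_M2A) and also delete line 4 (P2_M2A) from 2nd_file.txt. After deleting those lines, I currently also have to change my Perl code to get the desired result. What I want is Perl code that needs no modification when I make corresponding changes to the two txt files.

Perl code: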

use warnings;
use strict;

open (FILE1, 'g:\perl_tests\1st_file.txt');
open (FILE2, 'g:\perl_tests\2nd_file.txt');
open (FILE3, '> g:\perl_tests\3rd_file.txt');

my @speech1 = <FILE1>;
my @speech2 = <FILE2>;

print FILE3 @speech2[0..1];
print FILE3 @speech1[1..2];
print FILE3 @speech1[5..6];
print FILE3 @speech2[4..6];
print FILE3 @speech1[9..10];
print FILE3 @speech2[8..10];
print FILE3 @speech1[13..14];
print FILE3 @speech2[12..14];
print FILE3 @speech1[17..18];
print FILE3 @speech2[16..18];
print FILE3 @speech1[21..22];
print FILE3 @speech1[25..26];
print FILE3 @speech1[29..30];

1st_file.txt

153
A_153_P1_M2A_Some text is written here
T_153_P1_M2A_Some text is written here

207
A_207_P2_M2A_Some text is written here
T_207_P2_M2A_Some text is written here

48
A_48_P1_T1B_Some text is written here
T_48_P1_T1B_Some text is written here

57
A_57_P1_T2A_Some text is written here
T_57_P1_T2A_Some text is written here

167
A_167_P1_W1C_Some text is written here
T_167_P1_W1C_Some text is written here

26
A_26_P1_W2B_Some text is written here
T_26_P1_W2B_Some text is written here

183
A_183_P2_W2B_Some text is written here
T_183_P2_W2B_Some text is written here

69
A_69_P3_W2B_Some text is written here
T_69_P3_W2B_Some text is written here

2nd_file.txt

M2A
Top_M2A
P1_M2A
P2_M2A

T1B
Top_T1B
P1_T1B

T2A
Top_T2A
P1_T2A

W1C
Top_W1C
P1_W1C

W2B
Top_W2B
P1_W2B
P2_W2B
P3_W2B

3rd_file.txt (output generated by the Perl code; it should look like the following)

M2A
Top_M2A
A_153_P1_M2A_Some text is written here
T_153_P1_M2A_Some text is written here
A_207_P2_M2A_Some text is written here
T_207_P2_M2A_Some text is written here

T1B
Top_T1B
A_48_P1_T1B_Some text is written here
T_48_P1_T1B_Some text is written here

T2A
Top_T2A
A_57_P1_T2A_Some text is written here
T_57_P1_T2A_Some text is written here

W1C
Top_W1C
A_167_P1_W1C_Some text is written here
T_167_P1_W1C_Some text is written here

W2B
Top_W2B
A_26_P1_W2B_Some text is written here
T_26_P1_W2B_Some text is written here
A_183_P2_W2B_Some text is written here
T_183_P2_W2B_Some text is written here
A_69_P3_W2B_Some text is written here
T_69_P3_W2B_Some text is written here

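Can anyone guide me on how to solve this?

【Comments】:

By what criterion are the lines to print, and the point at which to print them, selected? Without such a rule you cannot do what you want.
You need to check the return value of open. If one of your open calls fails for some reason, you will never know. Do this: open (FILE1, 'g:\perl_tests\1st_file.txt') or die $!, since $! is the variable that holds the error message.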
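
The advice in the comment above can be sketched as a pair of small helpers. This is only an illustration: read_lines and write_lines are hypothetical names that do not appear in the original post, and the sketch assumes the three-argument form of open with lexical filehandles rather than the bareword handles used in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): read a whole file
# into a list of lines, dying with the OS error if it cannot be opened.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path' for reading: $!";
    my @lines = <$fh>;
    close($fh) or die "Cannot close '$path': $!";
    return @lines;
}

# Hypothetical helper: write a list of lines to a file with the same checks.
sub write_lines {
    my ($path, @lines) = @_;
    open(my $fh, '>', $path) or die "Cannot open '$path' for writing: $!";
    print {$fh} @lines;
    close($fh) or die "Cannot close '$path': $!";
    return;
}
```

With these in place, the question's `my @speech1 = <FILE1>;` could become `my @speech1 = read_lines('g:\perl_tests\1st_file.txt');`, and a missing or unreadable file would stop the script with a message instead of silently producing an empty array.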

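Tags: perl

【Solution 1】:

You seem to be grouping the lines by the last part after an underscore. It is a bit unclear in what order the lines should be printed (for example, if P1_M2A came after P2_M2A in the second file), but the following code gives the expected output for the data you provided.

It first reads 1st_file into a hash, remembering for each id (the last part after _) the paragraph without its first line. It then goes over the second file and prints the remembered lines after printing the "headers". It only tests the id of the third line of each paragraph; the remaining lines are ignored. As mentioned above, you have not specified how to get the id. If the last part matters, you will have to tweak the code a bit.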

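#!/usr/bin/perl
use warnings;
use strict;

# Group the A_/T_ lines of the first file by id. The id is whatever
# follows the third underscore, e.g. 'M2A' in A_153_P1_M2A.
open my $F1,  '<', '1st_file.txt' or die $!;
my %hash1;
while (<$F1>) {
    if (my ($id) = /^[AT]_[0-9]+_.+?_(.*)/) {
        $hash1{$id} .= $_;
    }
}
close $F1;

open my $F2,  '<', '2nd_file.txt' or die $!;
open my $OUT, '>', '3rd_file.txt' or die $!;
while (<$F2>) {
    if (!/_/ or /^Top_/) {
        # Chapter lines, Top_ lines and blank lines are copied unchanged.
        print $OUT $_;
    } else {
        # First Px_ line of a chapter: print every line stored for that
        # id, then forget it so later Px_ lines print nothing.
        if (my ($id) = /_(.*)/) {
            print $OUT $hash1{$id} if exists $hash1{$id};
            delete $hash1{$id};
        }
    }
}
close $OUT or die $!;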

Update, to reflect your more detailed specification:

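#!/usr/bin/perl
use warnings;
use strict;

# Group the A_/T_ lines of the first file by the full key that follows
# the number, e.g. 'P1_M2A' in A_153_P1_M2A.
open my $F1,  '<', '1st_file.txt' or die $!;
my %hash1;
while (<$F1>) {
    if (my ($id) = /^[AT]_[0-9]+_(.*)$/) {
        $hash1{$id} .= $_;
    }
}
close $F1;

open my $F2,  '<', '2nd_file.txt' or die $!;
open my $OUT, '>', '3rd_file.txt' or die $!;
while (<$F2>) {
    if (!/_/ or /^Top_/) {
        # Chapter lines, Top_ lines and blank lines are copied unchanged.
        print $OUT $_;
    } else {
        # Any other line is a key: print the lines stored for it instead.
        chomp;
        print $OUT $hash1{$_} if exists $hash1{$_};
    }
}
close $OUT or die $!;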

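【Comments】:

Dear choroba, first of all, the code you provided above does not meet my requirement. It just gives 2nd_file.txt as output. Second, P1_M2A always comes before P2_M2A. Let me explain a bit more.
First I want to print the first two lines of 2nd_file.txt, i.e. "M2A" and "Top_M2A". Think of them as the chapter number and the chapter name. After that, the next heading in 2nd_file.txt is "P1_M2A". At this point I do not want to print P1_M2A in the output; instead I want to print "A_153_P1_M2A" and "T_153_P1_M2A" from 1st_file.txt. Then I want to go back to 2nd_file.txt, where the next line is "P2_M2A", so at this point I go back to 1st_file.txt and print "A_207_P2_M2A" and "T_207_P2_M2A" respectively.
After that there is a blank line in 2nd_file.txt, followed by another chapter number and chapter name, i.e. "T1B" and "Top_T1B". After printing these from 2nd_file.txt I need to read the next line of 2nd_file.txt, "P1_T1B", and then go to 1st_file.txt and print "A_48_P1_T1B" and "T_48_P1_T1B" respectively. Please look at the output file for a better understanding.
@user1761275: I get exactly the expected output.
Thank you very much for your help. It works fine, but not on my actual files. Let me explain why. In "1st_file.txt" I actually have some text written in place of line 2 "A_153_P1_M2A" and line 3 "T_153_P1_M2A". So in "1st_file.txt" the line after 153 is really text, such as the chapter author, and the line after that is something like the chapter title. So when I run the code on the original files it does not work. It works only with the files shown above. Can you guide me on how to solve this?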
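
One possible way to tolerate extra text after the key is sketched below. It is not part of the posted answer. It assumes each relevant line of 1st_file.txt still starts the way the sample in the question does ("A_153_P1_M2A_Some text is written here"), i.e. an A_ or T_ prefix, a number, and then a two-part key such as P1_M2A; if the real files do not carry that prefix, the pattern would have to change. The name merge_sections and the in-memory demo data are hypothetical.

```perl
use strict;
use warnings;

# Sketch only: merge_sections() is a hypothetical name, and the key pattern
# assumes lines shaped like "A_153_P1_M2A_optional trailing text".
sub merge_sections {
    my ($first, $second) = @_;    # array refs of lines, newlines included

    # Group the A_/T_ lines of the first file by their "P1_M2A"-style key,
    # ignoring whatever follows the key.
    my %lines_for;
    for my $line (@$first) {
        if (my ($key) = $line =~ /^[AT]_[0-9]+_([^_\s]+_[^_\s]+)/) {
            $lines_for{$key} .= $line;
        }
    }

    # Walk the second file: chapter lines, Top_ lines and blank lines pass
    # through; any other line is replaced by the lines stored for that key.
    my @out;
    for my $line (@$second) {
        (my $key = $line) =~ s/\s+\z//;    # strip newline and trailing blanks
        if ($key !~ /_/ or $key =~ /^Top_/) {
            push @out, $line;
        } elsif (exists $lines_for{$key}) {
            push @out, $lines_for{$key};
        }
    }
    return join '', @out;
}

# Small in-memory demo with made-up data in the question's format.
my @first  = ("153\n", "A_153_P1_M2A_Some text\n", "T_153_P1_M2A_Some text\n");
my @second = ("M2A\n", "Top_M2A\n", "P1_M2A\n");
print merge_sections(\@first, \@second);
```

Because the function works on arrays of lines, the file handling stays separate: read both files into arrays, pass references to them, and print the returned string to 3rd_file.txt. A key in 2nd_file.txt with no matching lines in 1st_file.txt is simply skipped.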