【Question title】: Comparing the 1st column of N files, if match found print the 1st file and 2nd column of remaining files
【Posted】: 2019-12-13 21:07:47
【Question description】:

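I have a number of files, each with two columns, and I want to compare them on the first column. If a value is found in the first column of every file, print the whole line from the 1st file followed by the 2nd column of each remaining file.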

Example input

File 1

 apple    tree
 great    see
 see      apple
 tree     bee
 make     change

File 2

great    2
see      3
tree     4
make     5

File 3

apple    10  
great    9
see      8
tree     7

Expected output

 great    see     2     9
 see      apple   3     8
 tree     bee     4     7

So far I can only handle two files, using:

 awk  'FNR==NR {a[$1]=$0; next}; $1 in a {print a[$1]}' file1 file2

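【Comments】:

Tags: python perl awk

【Solution 1】:

Could you please try the following? It also preserves the order of the first field: keys are printed in the order in which they first appear in the input.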

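awk '
# Record each key the first time it appears, to preserve input order.
!c[$1]++{
  d[++count]=$1
}
{
  a[$1]++                          # number of lines carrying this key
  b[$1]=(b[$1]?b[$1] OFS:"")$NF    # append the last field to the row for this key
}
END{
  for(i=1;i<=count;i++){
    if(a[d[i]]==3){                # key was seen in all 3 files
       print d[i],b[d[i]]
    }
  }
}
'  file1 file2 file3  | column -t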

【Discussion】:

【Solution 2】:

If the first column contains unique values within each file, this should do the trick:

$ awk '{a[$1]=a[$1]"\t"$2} ++n[$1]==3{print $1 a[$1]}' file1 file2 file3
great   see     2       9
see     apple   3       8
tree    bee     4       7
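
The one-liner above relies on keys being unique within each file, because a key repeated inside one file would push the counter to 3 too early. A possible variant, offered only as a sketch, counts each key at most once per file and keeps the first occurrence. The function name `join_unique`, the `nfile` counter, and the `seen` array are hypothetical, and `ARGC - 1` again assumes that every argument is a file name.

```shell
# Hypothetical helper: join_unique FILE...
# Like the one-liner, but tolerant of keys repeated within a single file.
join_unique() {
  awk '
  FNR == 1 { nfile++ }                 # a new input file has started
  ($1, nfile) in seen { next }         # skip repeats of a key within the same file
  {
    seen[$1, nfile] = 1
    row[$1] = row[$1] "\t" $2          # append the second column
    if (++n[$1] == ARGC - 1)           # key has now appeared in every file
      print $1 row[$1]
  }
  ' "$@"
}
```

As in the original one-liner, rows are printed as soon as the last file confirms the key, so the output follows the key order of the last file.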

【Discussion】:

【Solution 3】:

Using join:

$ join <(sort file1) <(sort file2) | join - <(sort file3)
great see 2 9
see apple 3 8
tree bee 4 7
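
The pipeline above chains one `join` per extra file, so for N files the same step can be repeated in a loop. The following is a rough sketch rather than a definitive implementation: the function name `join_n` and the temporary-file handling are assumptions, and `LC_ALL=C` is set so that `sort` and `join` agree on collation order.

```shell
# Hypothetical helper: join_n FILE...
# Inner-joins all files on their first field, one file at a time.
join_n() {
  acc=$(mktemp)
  LC_ALL=C sort -k1b,1 "$1" > "$acc"      # start with the first file, sorted on the key
  shift
  for f in "$@"; do
    next=$(mktemp)
    # join the running result with the next file; join output stays sorted on the key
    LC_ALL=C sort -k1b,1 "$f" | LC_ALL=C join "$acc" - > "$next"
    mv "$next" "$acc"
  done
  cat "$acc"
  rm -f "$acc"
}
```

Calling `join_n file1 file2 file3` should give the same rows as the two-stage pipeline, sorted by key rather than in the original input order.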

【Discussion】: