[Posted]: 2015-10-01 16:54:18
[Question]:
Dear Stack Overflow community,
I have a two-column file that looks like this:
Ccrux.00013.c0_g1_i1 .
Ccrux.00013.c0_g2_i1 .
Ccrux.00014.c0_g1_i1 .
Ccrux.00014.c0_g2_i1 .
Ccrux.00015.c0_g1_i1 .
Ccrux.00015.c0_g1_i1 GO:0005789^cellular_component^endoplasmic reticulum membrane`GO:0016021^cellular_component^integral component of membrane`GO:0005509^molecular_function^calcium ion binding`GO:0005506^molecular_function^iron ion binding`GO:0031418^molecular_function^L-ascorbic acid binding`GO:0016706^molecular_function^oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors`GO:0045646^biological_process^regulation of erythrocyte differentiation
Ccrux.00015.c0_g2_i1 GO:0005789^cellular_component^endoplasmic reticulum membrane`GO:0016021^cellular_component^integral component of membrane`GO:0005509^molecular_function^calcium ion binding`GO:0005506^molecular_function^iron ion binding`GO:0031418^molecular_function^L-ascorbic acid binding`GO:0016706^molecular_function^oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors`GO:0045646^biological_process^regulation of erythrocyte differentiation
Ccrux.00016.c0_g1_i1 .
Ccrux.00016.c0_g2_i1 .
Ccrux.00017.c0_g1_i1 .
Ccrux.00018.c0_g1_i1 .
Ccrux.00019.c0_g1_i1 .
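I need a new two-column file that:
- does not contain the rows whose column 2 value is "."
- contains only GO:XXXXXXX entries as the column 2 value (i.e. all descriptive text is removed from column 2 and only the GO numbers are kept, separated by commas)
The new file should look like this: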
Ccrux.00015.c0_g1_i1 GO:0005789,GO:0016021,GO:0005509,GO:0005506,GO:0031418,GO:0016706,GO:0045646
Ccrux.00015.c0_g2_i1 GO:0005789,GO:0016021,GO:0005509,GO:0005506,GO:0031418,GO:0016706,GO:0045646
Ccrux.00029.c0_g1_i1 GO:0035869,GO:0005737,GO:0005615,GO:0016020,GO:0021956,GO:0060271,GO:0021904,GO:0001701,GO:0001841,GO:0008589,GO:0021523,GO:0021537
I have been trying with perl:
perl -ne '/(GO:\d+)/ && print "$1"' input.file > output.file
But that only prints all of my GO numbers in a single column. I really don't know how to go about this. Any suggestions would be welcome.
Thanks in advance, everyone.
[Comments]:
Tags: perl selection text-extraction