【Question title】: filter and create table
【Posted】: 2020-10-15 16:29:14
【Question description】:

I am trying to create a table from a text file, but I can't get the result I want.

This is the text:

Reading sequence file /scratch/mauve_pro/populations/africa/rebuild_xmfa/block_fasta/africa_final_remove_new.100.fasta

NSS:                 1.60e-02  (1000 permutations)
Max Chi^2:           6.20e-02  (1000 permutations)
PHI (Permutation):   5.28e-01  (1000 permutations)
PHI (Normal):        4.73e-01

Reading sequence file /scratch/mauve_pro/populations/africa/rebuild_xmfa/block_fasta/africa_final_remove_new.101.fasta

NSS:                 8.52e-01  (1000 permutations)
Max Chi^2:           2.20e-02  (1000 permutations)
PHI (Permutation):   3.78e-01  (1000 permutations)
PHI (Normal):        4.53e-01

This is the code:

cat africa_final_conca_phi.txt | sed 's/NSS/NSS NA/g' | grep -E "^NSS|^Max|^PHI" | awk '{{print $1 "\t" $2 "\t" $3 "\t{genome_stem}\t{single_gene_stem}"}}' > phiresult.tab

Another command I tried:

cat africa_final_conca_phi.txt | sed 's/NSS/NSS NA/g' | grep -E "^NSS|^Max|^PHI" | gene=$(printf '%s' "$africa_final_conca_phi.txt" | grep -Eo 'africa_final_remove_new.[0-9][0-9][0-9].fasta') | awk '{{print $1 "\t" $2 "\t" $3 "\t gene \t{single_gene_stem}"}}' > phiresult.tab

Result:

NSS NA: 1.60e-02    {genome_stem} 
Max Chi^2:  6.20e-02    {genome_stem}   
PHI (Permutation):  5.28e-01    {genome_stem}   
PHI (Normal):   4.73e-01    {genome_stem}   
NSS NA: 8.52e-01    {genome_stem}
Max Chi^2:  2.20e-02    {genome_stem}   
PHI (Permutation):  3.78e-01    {genome_stem}
PHI (Normal):   4.53e-01    {genome_stem}
NSS NA: 0.00e+00    {genome_stem}

What I want:

NSS NA: 1.60e-02    africa_final_remove_new.100.fasta 
Max Chi^2:  6.20e-02    africa_final_remove_new.100.fasta
PHI (Permutation):  5.28e-01    africa_final_remove_new.100.fasta   
PHI (Normal):   4.73e-01    africa_final_remove_new.100.fasta
NSS NA: 8.52e-01    africa_final_remove_new.101.fasta
Max Chi^2:  2.20e-02    africa_final_remove_new.101.fasta
PHI (Permutation):  3.78e-01    africa_final_remove_new.101.fasta
PHI (Normal):   4.53e-01    africa_final_remove_new.101.fasta

【Comments】:

    Tags: regex linux awk sed grep


    【Solution 1】:

    This one-liner may help:

    awk '{sub(/[(][^)]*[)]$/,"")}/^NSS/{$1="NSS NA:"}{$(NF+1)=FILENAME}7' *.fasta
    

    Let's test it:

    kent$  head *.fasta  
    ==> 100.fasta <==
    NSS:                 1.60e-02  (1000 permutations)
    Max Chi^2:           6.20e-02  (1000 permutations)
    PHI (Permutation):   5.28e-01  (1000 permutations)
    PHI (Normal):        4.73e-01
    
    ==> 101.fasta <==
    NSS:                 8.52e-01  (1000 permutations)
    Max Chi^2:           2.20e-02  (1000 permutations)
    PHI (Permutation):   3.78e-01  (1000 permutations)
    PHI (Normal):        4.53e-01
    
    kent$  awk  '{sub(/[(][^)]*[)]$/,"")}/^NSS/{$1="NSS NA:"}{$(NF+1)=FILENAME}7' *.fasta
    NSS NA: 1.60e-02 100.fasta
    Max Chi^2: 6.20e-02 100.fasta
    PHI (Permutation): 5.28e-01 100.fasta
    PHI (Normal): 4.73e-01 100.fasta
    NSS NA: 8.52e-01 101.fasta
    Max Chi^2: 2.20e-02 101.fasta
    PHI (Permutation): 3.78e-01 101.fasta
    PHI (Normal): 4.53e-01 101.fasta
    
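For readers who find the one-liner dense, here is an annotated expansion of the same awk program. It is a sketch rather than the answerer's exact text: the label is written as `NSS NA:` to match the output the question asks for, and the temporary directory and the two sample files are made up to mirror the test files shown in the answer.

```shell
#!/bin/sh
# Sketch only: the temporary directory and sample files are hypothetical,
# created here so the example can run on its own.
workdir=$(mktemp -d)

cat > "$workdir/100.fasta" <<'EOF'
NSS:                 1.60e-02  (1000 permutations)
Max Chi^2:           6.20e-02  (1000 permutations)
PHI (Permutation):   5.28e-01  (1000 permutations)
PHI (Normal):        4.73e-01
EOF

cat > "$workdir/101.fasta" <<'EOF'
NSS:                 8.52e-01  (1000 permutations)
Max Chi^2:           2.20e-02  (1000 permutations)
PHI (Permutation):   3.78e-01  (1000 permutations)
PHI (Normal):        4.53e-01
EOF

# Run from inside the directory so FILENAME is the bare file name.
(
    cd "$workdir" && awk '
        # Remove a "(...)" group that ends the line, e.g. "(1000 permutations)".
        # The "$" anchor leaves the "(Permutation)" inside the label untouched.
        { sub(/[(][^)]*[)]$/, "") }

        # Relabel the NSS rows.
        /^NSS/ { $1 = "NSS NA:" }

        # Append the current input file name as a new last field.
        # Assigning to a field makes awk rebuild the line with single spaces.
        { $(NF + 1) = FILENAME }

        # Print every line; the bare "7" in the one-liner does the same,
        # because any non-zero pattern without an action prints the record.
        { print }
    ' 100.fasta 101.fasta
) > "$workdir/result.txt"

result=$(cat "$workdir/result.txt")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Note that the trailing-group pattern only matches when `)` is the very last character, so input lines with trailing whitespace would keep their `(1000 permutations)` note.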

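The one-liner relies on `FILENAME`, so it assumes each result sits in its own file. The question, however, shows a single concatenated file (`africa_final_conca_phi.txt`) in which every block starts with a line naming the `.fasta` file. A possible adaptation is sketched below. It assumes that any line whose last field ends in `.fasta` is such a header, and that a tab-separated layout of label, value, and file name is acceptable. The sample input is a shortened stand-in, not the real file.

```shell
#!/bin/sh
# Sketch only: "$input" is a hypothetical stand-in for africa_final_conca_phi.txt.
input=$(mktemp)
out=$(mktemp)

cat > "$input" <<'EOF'
Reading sequence file /scratch/mauve_pro/populations/africa/rebuild_xmfa/block_fasta/africa_final_remove_new.100.fasta

NSS:                 1.60e-02  (1000 permutations)
Max Chi^2:           6.20e-02  (1000 permutations)
PHI (Permutation):   5.28e-01  (1000 permutations)
PHI (Normal):        4.73e-01

Reading sequence file /scratch/mauve_pro/populations/africa/rebuild_xmfa/block_fasta/africa_final_remove_new.101.fasta

NSS:                 8.52e-01  (1000 permutations)
Max Chi^2:           2.20e-02  (1000 permutations)
PHI (Permutation):   3.78e-01  (1000 permutations)
PHI (Normal):        4.53e-01
EOF

awk '
    # Header line: the last field is a path ending in ".fasta" (assumption).
    # Keep only the base name and skip to the next line.
    $NF ~ /\.fasta$/ {
        gene = $NF
        sub(/.*\//, "", gene)
        next
    }

    # Statistic lines start with NSS, Max or PHI.
    /^(NSS|Max|PHI)/ {
        # Drop a trailing "(1000 permutations)" note, if present.
        sub(/[ \t]*[(][^)]*[)][ \t]*$/, "")
        value = $NF
        # The label is everything before the value.
        label = $0
        sub(/[ \t]+[^ \t]+$/, "", label)
        if (label == "NSS:") label = "NSS NA:"
        print label "\t" value "\t" gene
    }
' "$input" > "$out"

table=$(cat "$out")
printf '%s\n' "$table"
rm -f "$input" "$out"
```

For the real data, the same awk program could be pointed at `africa_final_conca_phi.txt` with its output redirected to `phiresult.tab`. Because the label is split off by position rather than by counting words, the `sed 's/NSS/NSS NA/g'` padding step from the question is only needed if the `NSS NA:` label itself is wanted.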
    【Discussion】:

    • Thanks, that helped a lot
    • @AlexGZ If this solved your problem, please accept the answer