【问题标题】:How to show only the first 20 entries如何仅显示前 20 个条目
【发布时间】:2020-06-13 19:00:03
【问题描述】:

这里的 Python 代码可以得到我想要的输出。但是,我需要帮助将结果限制为前 20 行。

输入示例如下,

gi|170079688|ref|YP_001729008.1|双功能核黄素激酶/FMN 腺苷酸转移酶 [Escherichia coli str. K-12 substr。 DH10B] MKLIRGIHNLSQAPQEGCVLTIGNFDGVHRGHRALLQGLQEEGRKRNLPVMVMLFEPQPLELFATDKAPA RLTRLREKLRYLAECGVDYVLCVRFDRRFAALTAQNFISDLLVKHLRVKFLAVGDDFRFGAGREGDFLLL QKAGMEYGFDITSTQTFCEGGVRISSTAVRQALADDNLALAESLLGHPFAISGRVVHGDELGRTIGFPTA NVPLRRQVSPVKGVYAVEVLGLGEKPLPGVANIGTRPTVAGIRQQLEVHLLDVAMDLYGRHIQVVLRKKI RNEQRFASLDELKAQIARDELTAREFFGLTKPA gi|170079689|ref|YP_001729009.1|异亮氨酰-tRNA合成酶[大肠杆菌str。 K-12 substr。 DH10B] MSDYKSTLNLPETGFPMRGDLAKREPGMLARWTDDDLYGIIRAAKKGKKTFILHDGPPYANGSIHIGHSV NKILKDIIVKSKGLSGYDSPYVPGWDCHGLPIELKVEQEYGKPGEKFTAAEFRAKCREYAATQVDGQRKD FIRLGVLGDWSHPYLTMDFKTEANIIRALGKIIGNGHLHKGAKPVHWCVDCRSALAEAEVEYYDKTSPSI DVAFQAVDQDALKAKFAVSNVNGPISLVIWTTTPWTLPANRAISIAPDFDYALVQIDGQAVILAKDLVES VMQRIGVTDYTILGTVKGAELELLRFTHPFMGFDVPAILGDHVTLDAGTGAVHTAPGHGPDDYVIGQKYG LETANPVGPDGTYLPGTYPTLDGVNVFKANDIVVALLQEKGALLHVEKMQHSYPCCWRHKTPIIFRATPQ WFVSMDQKGLRAQSLKEIKGVQWIPDWGQARIESMVANRPDWCISRQRTWGVPMSLFVHKDTEELHPRTL ELMEEVAKRVEVDGIQAWWDLDAKEILGDEADQYVKVPDTLDVWFDSGSTHSSVVDVRPEFAGHAADMYL EGSDQHRGWFMSSLMISTAMKGKAPYRQVLTHGFTVDGQGRKMSKSIGNTVSPQDVMNKLGADILRLWVA STDYTGEMAVSDEILKRAADSYRRIRNTARFLLANLNGFDPAKDMVKPEEMVVLDWAVGCAKAAQEDIL KAYEAYDFHEVVQRLMRFCSVEMGSFYLDIIKDRQYTAKADSVARRSCQTALYHIAEALVRWMAPILSFT ADEVWGYLPGEREKYVFTGEWYEGLFGLAGLADSEAMNDAFWDELLKVRGEVNKVIEQARADKKVGGSLEAAV TLYAEPELSAKLTALGDELRFVLLTSGATVADYNDAPADAQQSEVLKGLKVALSKAEGEKCPRCWHYTQD VGKVAEHAEICGRCVSNVAGDGEKRKFA gi|170079690|ref|YP_001729010.1|脂蛋白信号肽酶 [Escherichia coli str. K-12 substr。 DH10B] MSQSICSTGLRWLWLVVVVLIIDLGSKYLILQNFALGDTVPLFPSLNLHYARNYGAAFSFLADSGGWQRW FFAGIAIGISVILAVMMYRSKATQKLNNIAYALIIGGALGNLFDRLWHGFVVDMIDFYVGDWHFATFNLA DTAICVGAALIVLEGFLPSRAKKQ

import re

id = None
header = None
seq = ''

a_file = open('e_coli.faa')

for line in a_file:
    m = re.match(">(\S+)\s+(.+)", line.rstrip())
    if m:
        if id is not None:

            print("{0} length:{1} {2}".format(id, len(seq),header))

        id, header = m.groups()
        seq = ''
    else:
        seq += line.rstrip()

【问题讨论】:

  • a_file 上使用slicing 运算符。

标签: python python-3.x output python-re


【解决方案1】:

在最顶部添加c = 0。然后,改变

        print("{0} length:{1} {2}".format(id, len(seq),header))

        if c < 10:
            print("{0} length:{1} {2}".format(id, len(seq),header))
            c += 1

稍作调整的结果:

import re

id = None
header = None
seq = ''

with open('e_coli.faa') as a_file:
    for line in a_file:
        m = re.match(">(\S+)\s+(.+)", line.rstrip())
        if m:
            if id and c < 20:
                print("{0} length:{1} {2}".format(id, len(seq),header))
                c += 1

            id, header = m.groups()
            seq = ''
        else:
            seq += line.rstrip()

读取文件的前 20 行。 你可以使用readlines():

代替:

for line in a_file:

使用:

for line in a_file.readlines()[:20]:

【讨论】:

  • 这会读取输入文件的前 20 行,但我需要打印前 20 行结果。
  • 我实际上早先尝试过这种方法,但它在打印行上给我一个错误“缩进中制表符和空格的使用不一致”。
猜你喜欢
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 2012-05-06
  • 2012-10-23
  • 1970-01-01
  • 1970-01-01
  • 2018-03-20
  • 1970-01-01
相关资源
最近更新 更多