【Posted】: 2026-01-24 22:20:04
【Problem description】:
I have a problem that I cannot seem to track down and fix.
FASTA = >header1
ATCGATCGATCCCGATCGACATCAGCATCGACTAC
ATCGACTCAAGCATCAGCTACGACTCGACTGACTACGACTCGCT
>header2
ATCGATCGCATCGACTACGACTACGACTACGCTTCGTATCAGCATCAGCT
ATCAGCATCGACGACGACTAGCACTACGACTACGACGATCCCGATCGATCAGCT
def dnaSequence():
    '''
    This function makes a dict called DNAseq by reading the fasta file
    given as first argument on the command line
    INPUT: Fasta file containing strings
    OUTPUT: key is header and value is sequence
    '''
    DNAseq = {}
    for line in FASTA:
        line = line.strip()
        if line.startswith('>'):
            header = line
            DNAseq[header] = ""
        else:
            seq = line
            DNAseq[header] = seq
    return DNAseq
def digestFragmentsWithOneEnzyme(dnaSequence):
    '''
    This function digests the sequence from DNAseq into smaller parts
    by using the enzymes listed in the MODES.
    INPUT: DNAseq and the enzymes from sys.argv[2:]
    OUTPUT: The DNAseq is updated with the segments gained from the
    digesting
    '''
    enzymes = sys.argv[2:]
    updated_list = []
    for enzyme in enzymes:
        pattern = MODES(enzyme)
        p = re.compile(pattern)
        for dna in DNAseq.keys():
            matchlist = re.findall(p,dna)
            updated_list = re.split(MODES, DNAseq)
            DNAseq.update((key, updated_list.index(k)) for key in
                d.iterkeys())
    return DNAseq
def getMolecularWeight(dnaSequence):
    '''
    This function calculates the molWeight of the sequence in DNAseq
    INPUT: the updated DNAseq from the previous function as a dict
    OUTPUT: The DNAseq is updated with the molweight of the digested fragments
    '''
    results = []
    for seq in DNAseq.keys():
        results = sum((dnaMass[base]) for base in DNAseq[seq])
        DNAseq.update((key, results.index(k)) for key in
            d.iterkeys())
    return DNAseq
def main(argv=None):
    '''
    This function prints the results of the digested DNA sequence in the terminal.
    INPUT: The DNAseq from the previous function as a dict
    OUTPUT: name weight weight weight
            name2 weight weight weight
    '''
    if argv == None:
        argv = sys.argv
    if len(argv) <2:
        usage()
        return 1
    digestFragmentsWithOneEnzyme(dnaSequence())
    Genes = getMolecularWeight(digestFragmentsWithOneEnzyme())
    print ({header},{seq}).format(**DNAseq)
    return 0
if __name__ == '__main__':
    sys.exit(main())
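In the first function I try to build a dict from the FASTA file. The second function is meant to use that same dict, cutting the sequences with a regular expression, and the last one calculates the molecular weight.
My problem is that, for some reason, Python does not recognise my dict, and I get the error:
NameError: DNAseq is not defined
If I create the dict outside the functions, then the dict is available.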
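The three steps described here (parse the FASTA records, split each sequence with a regular expression, sum per-base masses) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's code: the function names `parse_fasta`, `digest` and `fragment_weights`, the recognition site `GATC`, and the values in `DNA_MASS` are all placeholders, since the question does not show `MODES` or `dnaMass`. Note also that `re.split` drops the matched site itself, which is a simplification of how a real enzyme cuts.

```python
import re

# Placeholder per-base masses (assumption: the question never shows dnaMass).
DNA_MASS = {"A": 313.2, "C": 289.2, "G": 329.2, "T": 304.2}


def parse_fasta(lines):
    """Return {header: sequence}, joining sequences that span several lines."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = ""
        elif header is not None:
            records[header] += line    # append instead of overwriting
    return records


def digest(records, site):
    """Split every sequence at each occurrence of the (hypothetical) site."""
    pattern = re.compile(site)
    return {
        name: [frag for frag in pattern.split(seq) if frag]
        for name, seq in records.items()
    }


def fragment_weights(fragments):
    """Sum the placeholder base masses for every fragment."""
    return {
        name: [round(sum(DNA_MASS[base] for base in frag), 1) for frag in frags]
        for name, frags in fragments.items()
    }


# Each step receives the previous step's return value as an argument.
sample = ">header1\nATCGATCG\nATCC\n>header2\nGGGATCAAA\n"
parsed = parse_fasta(sample.splitlines())
weights = fragment_weights(digest(parsed, "GATC"))
for name, values in weights.items():
    print(name, *values)
```

Each function returns a new dict rather than updating a shared one, so the data flow between the steps stays explicit.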
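The error is consistent with Python's scoping rules: a name assigned inside a function is local to that call, so the `DNAseq` built in `dnaSequence()` is not visible in the other functions. Those functions accept a parameter called `dnaSequence` but then read `DNAseq` in their bodies, and `main()` discards the value returned by the first call. A minimal sketch of the difference, using hypothetical names (`build_records`, `broken_lengths`, `working_lengths`, `dna_by_header`) that do not appear in the question:

```python
def build_records():
    dna_by_header = {">header1": "ATCG"}  # local to this call only
    return dna_by_header


def broken_lengths(dna_sequence):
    # Mirrors the question: the parameter is ignored and a name that
    # exists only inside build_records() is read instead.
    return {h: len(s) for h, s in dna_by_header.items()}


def working_lengths(dna_by_header):
    # The dict arrives through the parameter, so the name is defined here.
    return {h: len(s) for h, s in dna_by_header.items()}


error_message = None
try:
    broken_lengths(build_records())
except NameError as exc:
    error_message = str(exc)

lengths = working_lengths(build_records())
print(error_message)
print(lengths)
```

Creating the dict at module level makes the name global, which is why that variant appears to work. Passing the returned dict from one function to the next is usually easier to follow and to test.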
【Discussion】:
-
Please fix your code blocks.
Tags: python regex dictionary fasta