【Question title】: CS50 pset 6 DNA works with small.csv but not large.csv
【Posted】: 2021-09-08 17:32:52
【Question】:

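This is my code for the week 6 problem set, DNA. It works when I test it with small.csv, but with large.csv it seems to miscount the repeated sequences. Can anyone help me find the bug in my code? I'm very new to this.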

import csv
import sys
if len(sys.argv) != 3:
        sys.exit("Usage: python dna.py STRcounts DNASequence")
check = True
STRlist = []
Humanlist = []
# copy person list
with open(sys.argv[1],"r") as STR:
    readSTR = csv.reader(STR)
    for row in readSTR:
        if check:
            STRlist.append(row)
            check = False
        else:
            Humanlist.append(row)
Slist = STRlist[0]
Slist.remove("name")
# print(Humanlist)
# print(Slist)
seq=[]
# copy sequence
with open(sys.argv[2],"r") as text:
    readtext = csv.reader(text)
    for i in readtext:
        seq = i
text = seq[0]
# print(text)
# create dictionary for STR

STRdict = {}
for STR in Slist:
    STRdict[STR] = 0
for STR in Slist:
    for letter in range(len(text)):
        if STR == text[letter:letter+len(STR)]:
            STRdict[STR] += 1
check = False
for human in range(len(Humanlist)):
    for STR in range(len(Slist)):
        if str(STRdict[Slist[STR]]) == str(Humanlist[human][STR+1]):
            check = True
        else:
            check = False
            break
    if check:
        print(Humanlist[human][0])
        break
if not check:
    print("no match")

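【Comments】:

  • Which school is CS50 from? Can you be more specific than "it seems to miscount the repeated sequences"?
  • Could you provide some sample input and output? What about the failing cases? Without details, it is hard to help you.

Tags: python cs50 dna-sequence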


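【Solution 1】:

I commented out the unnecessary parts and added code to find the length of the longest run of consecutive repeats for each STR. The rest of your code is unchanged, and I get the expected results.

I have not reviewed the rest of the code for possible improvements, but it does produce the correct results.

Your code gives wrong results because it counts every occurrence of the STR in the string, rather than counting consecutive repeats and then taking the longest run.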

import csv
import sys
if len(sys.argv) != 3:
        sys.exit("Usage: python dna.py STRcounts DNASequence")
check = True
STRlist = []
Humanlist = []
# copy person list
with open(sys.argv[1],"r") as STR:
    readSTR = csv.reader(STR)
    for row in readSTR:
        if check:
            STRlist.append(row)
            check = False
        else:
            Humanlist.append(row)
Slist = STRlist[0]
Slist.remove("name")
# print(Humanlist)
# print(Slist)
seq=[]
# copy sequence
with open(sys.argv[2],"r") as text:
    readtext = csv.reader(text)
    for i in readtext:
        seq = i
text = seq[0]
# print(text)
# create dictionary for STR

STRdict = {}
"""
for STR in Slist:
    STRdict[STR] = 0"""
for STR in Slist:
    idx = 0
    max_= 0
    while idx < len(text):
        num_repeats = 0
        while STR == text[idx:idx+len(STR)]:
            num_repeats += 1
            idx += len(STR)
        if num_repeats > max_:
            max_ = num_repeats

        idx += 1
    STRdict[STR] = max_
    #print(STR, max_)
            
    """for letter in range(len(text)):
        if STR == text[letter:letter+len(STR)]:
            STRdict[STR] += 1"""
check = False
for human in range(len(Humanlist)):
    for STR in range(len(Slist)):
        if str(STRdict[Slist[STR]]) == str(Humanlist[human][STR+1]):
            check = True
        else:
            check = False
            break
    if check:
        print(Humanlist[human][0])
        break
if not check:
    print("no match")

This problem comes from a Harvard (CS50) problem set.
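
To make the difference between "all occurrences" and "longest consecutive run" concrete, here is a minimal sketch. The function names `count_occurrences` and `longest_run` and the sample sequence are made up for illustration and are not part of the CS50 distribution code. Unlike the loop above, `longest_run` tries every start position; that is slower, but it does not depend on where a previous run happened to end.

```python
def count_occurrences(sequence, motif):
    """Count every position where motif appears (what the original code did)."""
    return sum(
        1
        for start in range(len(sequence))
        if sequence[start:start + len(motif)] == motif
    )


def longest_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        return 0  # guard: an empty motif would otherwise loop forever
    best = 0
    for start in range(len(sequence)):
        repeats = 0
        pos = start
        # Keep stepping forward one motif length while the motif repeats.
        while sequence[pos:pos + len(motif)] == motif:
            repeats += 1
            pos += len(motif)
        best = max(best, repeats)
    return best


# Hypothetical sequence: two adjacent copies of AGATC, a gap, then one more copy.
sample = "AGATCAGATCTTAGATC"
print(count_occurrences(sample, "AGATC"))  # 3
print(longest_run(sample, "AGATC"))        # 2
```

With small.csv the two numbers can coincide by chance, which may be why the original code appeared to work there; the database stores the second number.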

【Comments】:
