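【Posted】:2017-06-01 14:46:15
【Question】:
I am still a beginner, so apologies if this question has an obvious answer, and apologies for the messy code. I have files containing ten thousand lines. I slide over each file with a sliding-window technique, so I need to make sure every window is present. However, some of my input files are missing certain lines, so I tried to write Python code that adds those lines with the information I want, to make the file complete. The code looks like this: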
#!/usr/bin/env python
outfile = open("missing_test.txt", "w")
with open("add_missing.txt", "r") as file:
    last_line = 0  # This is where it starts for bin 1
    lines = []
    header_line = next(file)
    outfile.write(header_line)
    CHROM = 'BABA_1'
    for line in file:  # go through every line to check its existence and rewrite to new file
        nums = line.split("\t")
        num1 = nums[0]  # no integer because this is a string: name of the individual
        num2 = int(nums[1])  # integer for window
        num3 = int(nums[2])  # integer for coverage (here always 10000 to meet the threshold)
        num4 = int(nums[3])  # integer for SNP count
        if num1 == CHROM:
            while num2 != last_line + 10000:
                # A line is missing, so a new line is added with 0 SNPs:
                NUM2 = last_line + 10000  # New window, the one that was missing
                NUM4 = 0  # 0 SNPs found
                #lines.append((num1, NUM2, num3, NUM4))
                OUTLINE = "%s\t%s\t%s\t%s" % (num1, NUM2, num3, NUM4)  # write new line to outfile
                outfile.write(OUTLINE + "\n")
                last_line += 10000
            lines.append((num1, num2, num3, num4))
            last_line += 10000  # also add 10000 here otherwise the while loop makes no sense
            outline = "%s\t%s\t%s\t%s" % (num1, num2, num3, num4)
            outfile.write(outline + "\n")  # write all existing lines to outfile
        else:
            CHROM = num1
            last_line = 0
outfile.close()
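So it works fine as long as the first window of the first "CHROM" equals 0, but that is not always the case. In the latter case the loop becomes infinite. For example, the input and the desired output look like this:
Input: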
indiv window coverage SNP
BABA_1 20000 10000 7
BABA_1 30000 10000 1
BABA_1 50000 10000 2
BABA_1 60000 10000 3
BABA_1 80000 10000 1
BABA_10 20000 10000 1
BABA_10 30000 10000 16
BABA_10 80000 10000 9
Desired output:
indiv window coverage SNP
BABA_1 10000 10000 0
BABA_1 20000 10000 7
BABA_1 30000 10000 1
BABA_1 40000 10000 0
BABA_1 50000 10000 2
BABA_1 60000 10000 3
BABA_1 70000 10000 0
BABA_1 80000 10000 1
BABA_10 10000 10000 0
BABA_10 20000 10000 1
BABA_10 30000 10000 16
BABA_10 40000 10000 0
BABA_10 50000 10000 0
BABA_10 60000 10000 0
BABA_10 70000 10000 0
BABA_10 80000 10000 9
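I have been struggling to find a way to make this while loop work without running forever, but I really cannot see my mistake. Can anyone tell me how to fix this?
Any help is greatly appreciated, thanks in advance!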
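One way the gap filling described above could be sketched, assuming the rows are grouped by individual and sorted by window within each individual. The names `fill_missing_windows`, `parse`, and `SAMPLE` are hypothetical, and `split()` on any whitespace replaces `split("\t")` as an assumption about the data. The inner loop compares with `<` rather than `!=`, so it stops even when a window is smaller than expected.

```python
STEP = 10000  # window size taken from the question

# Sample input from the question (whitespace-separated here; an assumption).
SAMPLE = """indiv window coverage SNP
BABA_1 20000 10000 7
BABA_1 30000 10000 1
BABA_1 50000 10000 2
BABA_1 60000 10000 3
BABA_1 80000 10000 1
BABA_10 20000 10000 1
BABA_10 30000 10000 16
BABA_10 80000 10000 9
"""


def parse(lines):
    """Turn data lines into (indiv, window, coverage, snps) tuples."""
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            yield fields[0], int(fields[1]), int(fields[2]), int(fields[3])


def fill_missing_windows(rows, step=STEP):
    """Yield every input row, plus a 0-SNP row for each missing window."""
    current = None  # individual currently being processed
    last = 0        # last window emitted for that individual
    for indiv, window, coverage, snps in rows:
        if indiv != current:
            # New individual: restart counting, then handle this same row.
            current, last = indiv, 0
        # '<' guarantees termination; '!=' can run forever on odd input.
        while last + step < window:
            last += step
            yield (indiv, last, coverage, 0)
        yield (indiv, window, coverage, snps)
        last = max(last, window)


if __name__ == "__main__":
    header, *body = SAMPLE.splitlines()
    print(header)
    for row in fill_missing_windows(parse(body)):
        print("\t".join(str(value) for value in row))
```

Because the counter is reset and the same row is then processed, the first line of each new individual is written rather than skipped. Windows after the last observed one are not added, which matches the desired output above.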
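【Comments】:
- Basically, you want to exit the while loop if "CHROM" is not equal to 0, right?
- No, CHROM is actually just a string, and as soon as the string changes I want to start over.
- You need to pay attention to case sensitivity. num1 and NUM1 are not the same.
- Let me get this straight. Your window lines all run from 10000 to 80000 in steps of 10000, right? And the number of these sets equals the number of distinct BABA_* values.
- Hi, I realize that is actually why I did it this way: the uppercase variables only come into play when a line is missing and I add it myself. NUM2 is the new window, NUM4 is always 0 in that case, and the rest stays the same.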
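A comment above asks whether every individual's windows really run in steps of 10000. A minimal sketch for checking that before filling gaps, assuming rows are already parsed into tuples; `find_problem_rows` and `EXAMPLE_ROWS` are hypothetical names, and the example rows are made up for illustration. It flags windows that a `while num2 != last_line + 10000` loop could never reach: a window that is not a multiple of the step, or one that is not larger than the previous window of the same individual (including a window of 0).

```python
def find_problem_rows(rows, step=10000):
    """Return (indiv, window) pairs that a '!=' stepping loop can never reach."""
    problems = []
    current = None  # individual currently being checked
    last = 0        # last valid window seen for that individual
    for indiv, window, coverage, snps in rows:
        if indiv != current:
            current, last = indiv, 0  # restart for each new individual
        if window % step != 0 or window <= last:
            # Counting up from 'last' in whole steps skips past this window.
            problems.append((indiv, window))
        else:
            last = window
    return problems


# Made-up rows for illustration only (not from the question's data).
EXAMPLE_ROWS = [
    ("A", 10000, 10000, 1),
    ("A", 10000, 10000, 2),  # duplicate window
    ("A", 25000, 10000, 3),  # not a multiple of the step
    ("B", 0, 10000, 4),      # window 0 is never reached from 0 + step
    ("B", 20000, 10000, 5),
]

if __name__ == "__main__":
    for indiv, window in find_problem_rows(EXAMPLE_ROWS):
        print(indiv, window)
```

If this returns an empty list for a file, the stepping loop should terminate for that file; otherwise the listed rows are the ones worth inspecting first.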
Tags: python loops while-loop infinite