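[Posted]: 2015-12-25 00:28:15
[Question]:
First half of my question: when I try to run my program, it keeps loading forever and never displays a result. Could someone check my code and spot the error? The program is meant to find a DNA start codon, ATG, and keep scanning until it finds a stop codon (TAA, TAG, or TGA), then print the gene from start to end. I am using BlueJ.
Second half of my question: I am supposed to write a program that performs the following steps: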
To find the first gene, find the start codon ATG.
Next look immediately past ATG for the first occurrence of each of the three stop codons TAG, TGA, and TAA.
If the length of the substring between ATG and any of these three stop codons is a multiple of three, then a candidate for a gene is the start codon through the end of the stop codon.
If there is more than one valid candidate, the smallest such string is the gene. The gene includes the start and stop codon.
If no start codon was found, then you are done.
If a start codon was found, but no gene was found, then start searching for another gene via the next occurrence of a start codon starting immediately after the start codon that didn't yield a gene.
If a gene was found, then start searching for the next gene immediately after this found gene.
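Note that according to this algorithm, for the string "ATGCTGACCTGATAG", ATGCTGACCTGATAG could be a gene, but ATGCTGACCTGA would not be, even though it is shorter, because another instance of 'TGA' is found first that is not a multiple of three away from the start codon.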
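The note above can be checked with a small sketch. It assumes uppercase input and searches from the position just past ATG; the class name `FirstStopDemo` and the helper `describe` are made up for illustration and are not part of the assignment.

```java
class FirstStopDemo {
    // Reports where a stop codon FIRST occurs at or after `from`,
    // and whether that first occurrence is a multiple of three away.
    static String describe(String dna, String codon, int from) {
        int pos = dna.indexOf(codon, from);
        if (pos == -1) {
            return codon + ": not found";
        }
        int offset = pos - from;
        String verdict = (offset % 3 == 0) ? " (in frame)" : " (out of frame, ignored)";
        return codon + ": first at " + pos + ", offset " + offset + verdict;
    }

    public static void main(String[] args) {
        String dna = "ATGCTGACCTGATAG";
        int afterStart = dna.indexOf("ATG") + 3; // position just past the start codon
        for (String codon : new String[] {"TAG", "TGA", "TAA"}) {
            System.out.println(describe(dna, codon, afterStart));
        }
    }
}
```

Only the first TGA (at index 4) is considered, and it is out of frame, so the later in-frame TGA at index 9 never becomes a candidate. That leaves TAG at index 12 as the only valid stop codon, which makes the whole string the gene.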
My assignment also asks me to provide these methods:
Specifically, to implement the algorithm, you should do the following.
Write the method findStopIndex that has two parameters dna and index, where dna is a String of DNA and index is a position in the string. This method finds the first occurrence of each stop codon to the right of index. From those stop codons that are a multiple of three from index, it returns the smallest index position. It should return -1 if no stop codon was found and there is no such position. This method was discussed in one of the videos.
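One way to read the findStopIndex description is sketched below. It assumes that `index` is the position immediately after the start codon and that `dna` has already been converted to uppercase by the caller. The class name `StopIndexSketch` and the `static` modifier are choices made here for illustration only.

```java
class StopIndexSketch {
    static final String[] STOP_CODONS = {"TAG", "TGA", "TAA"};

    // Returns the smallest position among the FIRST occurrences of each stop
    // codon that lie a multiple of three away from index, or -1 if none do.
    static int findStopIndex(String dna, int index) {
        int best = -1;
        for (String codon : STOP_CODONS) {
            int pos = dna.indexOf(codon, index); // first occurrence only
            boolean inFrame = pos != -1 && (pos - index) % 3 == 0;
            if (inFrame && (best == -1 || pos < best)) {
                best = pos;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(findStopIndex("ATGCTGACCTGATAG", 3));
    }
}
```

Because each codon is looked up with a single `indexOf` call, a later in-frame occurrence is deliberately not considered when the first occurrence of that codon is out of frame, which matches the behaviour the assignment describes.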
Write the void method printAll that has one parameter dna, a String of DNA. This method should print all the genes it finds in DNA. This method should repeatedly look for a gene, and if it finds one, print it and then look for another gene. This method should call findStopIndex. This method was also discussed in one of the videos.
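The search loop in printAll could look roughly like the sketch below. It assumes findStopIndex is called with the position just past ATG, and it searches an uppercase copy so that lowercase input also works. The helper `findAll`, which collects genes into a list so they can be checked, and the class name `PrintAllSketch` are hypothetical additions, not something the assignment asks for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class PrintAllSketch {
    // Smallest in-frame position among the first occurrences of each stop codon, or -1.
    static int findStopIndex(String dna, int index) {
        int best = -1;
        for (String codon : new String[] {"TAG", "TGA", "TAA"}) {
            int pos = dna.indexOf(codon, index);
            if (pos != -1 && (pos - index) % 3 == 0 && (best == -1 || pos < best)) {
                best = pos;
            }
        }
        return best;
    }

    // Hypothetical helper: returns the genes instead of printing them.
    static List<String> findAll(String dna) {
        String upper = dna.toUpperCase(Locale.ROOT); // search copy; output keeps original case
        List<String> genes = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = upper.indexOf("ATG", from);
            if (start == -1) {
                break; // no more start codons
            }
            int stop = findStopIndex(upper, start + 3);
            if (stop == -1) {
                from = start + 3; // no gene here: resume right after this start codon
            } else {
                genes.add(dna.substring(start, stop + 3)); // include the stop codon
                from = stop + 3; // resume right after the gene
            }
        }
        return genes;
    }

    static void printAll(String dna) {
        for (String gene : findAll(dna)) {
            System.out.println(gene);
        }
    }

    public static void main(String[] args) {
        printAll("ccatgccctaataaatgtctgtaatgtaga");
    }
}
```

The key point is that `from` moves forward on every iteration, whether or not a gene was found, so the loop always terminates.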
Write the void method testFinder that will use the two small DNA example strings shown below. For each string, it should print the string, and then print the genes found in the string. Here is sample output that includes the two DNA strings:
Sample output:
ATGAAAATGAAAA
Genes found are:
ATGAAAATGA
DNA string is:
ccatgccctaataaatgtctgtaatgtaga
Genes found are:
atgccctaa
atgtctgtaatgtag
DNA string is:
CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA
Genes found are:
ATGTAA
ATGAATGACTGATAG
ATGCTATGA
ATGTGA
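I have thought about it, and this code seems close to working. I just need my output to match what the instructions ask for. I hope this isn't too confusing; I just don't know how to look for a stop codon after the start codon and then how to extract the gene sequence. I would also like to understand how to get the closest gene sequence by working out which of the three stop codons (tag, tga, taa) is closest to atg. I know this is a lot, but I hope it all makes sense.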
import edu.duke.*;
import java.io.*;

public class FindMultiGenes {
    public String findGenes(String dnaOri) {
        String gene = new String();
        String dna = dnaOri.toLowerCase();
        int start = -1;
        while (true) {
            start = dna.indexOf("atg", start);
            if (start == -1) {
                break;
            }
            int stop = findStopCodon(dna, start);
            if (stop > start) {
                String currGene = dnaOri.substring(start, stop + 3);
                System.out.println("From: " + start + " to " + stop + "Gene: "
                        + currGene);
            }
        }
        return gene;
    }

    private int findStopCodon(String dna, int start) {
        for (int i = start + 3; i < dna.length() - 3; i += 3) {
            String currFrameString = dna.substring(i, i + 3);
            if (currFrameString.equals("TAG")) {
                return i;
            } else if (currFrameString.equals("TGA")) {
                return i;
            } else if (currFrameString.equals("TAA")) {
                return i;
            }
        }
        return -1;
    }

    public void testing() {
        FindMultiGenes FMG = new FindMultiGenes();
        String dna =
            "CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA";
        FMG.findGenes(dna);
        System.out.println("DNA string is: " + dna);
    }
}
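In the code above, `start` is never moved forward after a hit, so `dna.indexOf("atg", start)` keeps returning the same index and the `break` is never reached. In addition, `findStopCodon` compares a lowercased string against uppercase constants, so it cannot match. The sketch below isolates just the loop-advance issue; the class name `LoopAdvanceSketch` and the method `countStarts` are hypothetical and only count start codons rather than extract genes.

```java
import java.util.Locale;

class LoopAdvanceSketch {
    // Counts non-overlapping "atg" hits. The loop terminates because
    // `from` is moved past each hit before the next search.
    static int countStarts(String dna) {
        String lower = dna.toLowerCase(Locale.ROOT);
        int count = 0;
        int from = 0;
        while (true) {
            int start = lower.indexOf("atg", from);
            if (start == -1) {
                break; // reached once no start codon remains at or after `from`
            }
            count++;
            from = start + 3; // the step that is missing in the loop above
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countStarts("ccatgccctaataaatgtctgtaatgtaga"));
    }
}
```

Without the `from = start + 3` line, the search would restart at the same position on every iteration, which is the same hang seen in `findGenes`.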
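[Comments]:
- Your while(true) loop never ends.
- How do I fix that? I wrote a break statement, but it doesn't seem to help the loop end.
- I think the string constants in findStopCodon have to be lowercase.
- You might be interested in a regular-expression solution, which would do all of this in a few lines.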
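The commenter did not post the regular expression, so the following is only a guess at what such a solution might look like. It uses a simpler rule than the assignment: it stops at the first in-frame stop codon, reading codon by codon from ATG. The pattern, the class name `RegexGeneSketch` and the method `findAll` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexGeneSketch {
    // ATG, then as few whole codons as possible, then a stop codon.
    static final Pattern ORF = Pattern.compile(
            "ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)", Pattern.CASE_INSENSITIVE);

    static List<String> findAll(String dna) {
        List<String> genes = new ArrayList<>();
        Matcher m = ORF.matcher(dna);
        while (m.find()) { // each search resumes after the previous match
            genes.add(m.group());
        }
        return genes;
    }

    public static void main(String[] args) {
        for (String gene : findAll("ccatgccctaataaatgtctgtaatgtaga")) {
            System.out.println(gene);
        }
    }
}
```

For the lowercase sample string this gives the same two genes as the sample output, but the results can differ from the assignment's algorithm. For the string "ATGCTGACCTGATAG" it returns ATGCTGACCTGA, which the assignment explicitly rules out, so this pattern alone would not satisfy the stated requirements.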
Tags: java dna-sequence