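【Posted】:2014-07-06 22:38:06
【Question】:
I am writing a program that deciphers a coded message using English letter frequencies. For example, e is the most common letter in English, appearing about 13% of the time. In the ciphertext, a also appears about 13% of the time, so a most likely corresponds to e. However, letters have a different frequency distribution when they are the first letter of a word. For example, the letter "T" begins a word 16.6% of the time, and it corresponds to q in the ciphertext. Here is a link with more about it: http://en.wikipedia.org/wiki/Letter_frequency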
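As an illustration of the frequency idea above, ranking the letters of a text from most to least frequent might look like the sketch below. The class name `FrequencyRank` and the method `rankLetters` are made up for this example, and breaking ties alphabetically is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the program in the question.
class FrequencyRank {
    // Returns the letters a-z that occur in the text, most frequent first.
    // Ties keep alphabetical order because List.sort is stable.
    static String rankLetters(String text) {
        int[] counts = new int[26];
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++; // spaces and punctuation are ignored
            }
        }
        List<Character> letters = new ArrayList<>();
        for (char c = 'a'; c <= 'z'; c++) {
            if (counts[c - 'a'] > 0) {
                letters.add(c);
            }
        }
        // Sort by descending count.
        letters.sort(Comparator.comparingInt((Character c) -> -counts[c - 'a']));
        StringBuilder ranked = new StringBuilder();
        for (char c : letters) {
            ranked.append(c);
        }
        return ranked.toString();
    }

    public static void main(String[] args) {
        // a occurs 3 times, b twice, c once
        System.out.println(rankLetters("aab a cb")); // prints "abc"
    }
}
```

The first letter of the ranked cipher string would then be the candidate for e, the second for t, and so on.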
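The basic pseudocode I am using is:
- Put the ciphertext into a string and convert capital letters to lowercase
- Loop through the text and count how many times each letter occurs
- Loop through the text and count the first letter of each word
- Create a class to hold a letter and the number of times that letter occurs
- Create an array of that class, with one entry per letter
- Sort the array from highest to lowest
- Replace the first letters with their corresponding letters
- Replace the other letters with their corresponding letters
- The result should be a readable message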
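The "sort, then replace with the corresponding letter" steps amount to pairing two rankings position by position. A minimal sketch of that pairing, assuming both rankings are already available as strings (the names `RankPairing` and `pair` are hypothetical, and the English order is the one used in the code further down):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the program in the question.
class RankPairing {
    // Maps the i-th most frequent cipher letter to the i-th most frequent
    // English letter. Extra letters in the longer ranking are ignored.
    static Map<Character, Character> pair(String cipherRanked, String englishRanked) {
        Map<Character, Character> guess = new LinkedHashMap<>();
        int n = Math.min(cipherRanked.length(), englishRanked.length());
        for (int i = 0; i < n; i++) {
            guess.put(cipherRanked.charAt(i), englishRanked.charAt(i));
        }
        return guess;
    }

    public static void main(String[] args) {
        String english = "etaoinshrdlcumwfgypbvkjxqz";
        // Suppose the cipher letters ranked by frequency are a, q, x.
        System.out.println(pair("aqx", english)); // prints "{a=e, q=t, x=a}"
    }
}
```

Building the whole table before touching the text keeps the guess separate from the replacement step.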
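I have reached the letter-replacement stage. I know I can use replace to change the letters, but my question is: how do I change only the first letter of each word to its corresponding letter, and then replace the remaining letters, given that the two frequency tables differ depending on whether a letter is the first letter of a word?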
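One possible approach, shown here only as a sketch, is to walk the text one character at a time and choose the lookup table according to whether the character starts a word. The names `PositionalSubstitution`, `decode`, `firstMap`, and `restMap` are hypothetical, and the two tiny tables in `main` are made-up values for illustration rather than real frequency data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the program in the question.
class PositionalSubstitution {
    // Translates each character exactly once. Letters at the start of a word
    // use firstMap, all other letters use restMap. Characters without a
    // mapping are copied unchanged.
    static String decode(String cipher,
                         Map<Character, Character> firstMap,
                         Map<Character, Character> restMap) {
        StringBuilder out = new StringBuilder(cipher.length());
        boolean atWordStart = true;
        for (int i = 0; i < cipher.length(); i++) {
            char c = cipher.charAt(i);
            if (Character.isWhitespace(c)) {
                out.append(c);
                atWordStart = true; // the next letter begins a new word
            } else if (!Character.isLetter(c)) {
                out.append(c); // punctuation: copy, leave the word state alone
            } else {
                Map<Character, Character> table = atWordStart ? firstMap : restMap;
                char plain = table.getOrDefault(c, c);
                out.append(plain);
                atWordStart = false;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Character, Character> firstMap = new HashMap<>();
        firstMap.put('q', 't'); // made-up guess for word-initial q
        Map<Character, Character> restMap = new HashMap<>();
        restMap.put('q', 'x'); // made-up guess for q elsewhere
        restMap.put('a', 'e');
        System.out.println(decode("qa aq", firstMap, restMap)); // prints "te ax"
    }
}
```

Because every character is read from the original text and written to a separate buffer, a letter decoded by one table is never touched again by the other. Note that if the cipher is a simple substitution (an assumption, since that is not stated here), each cipher letter has a single true plaintext letter, so two tables that disagree are better treated as two sources of evidence for one mapping.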
Any help would be appreciated.
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Scanner;

public class CodeCracker
{
    public CodeCracker() throws FileNotFoundException
    {
        File cipher = new File("enciphered.txt");
        StringBuilder cipherBuilder = new StringBuilder();
        //add file to string; try-with-resources closes the Scanner
        try (Scanner cipherInput = new Scanner(cipher)) {
            while (cipherInput.hasNextLine()) {
                String line = cipherInput.nextLine().trim();
                if (!line.isEmpty()) {
                    //keep one space between lines so words on adjacent lines do not merge
                    cipherBuilder.append(line).append(' ');
                }
            }
        }
        String cipherText = cipherBuilder.toString().toLowerCase();
        System.out.println("~~~Original Message~~~");
        System.out.println(cipherText);
        //****Count letter occurrences****
        List<LetterOccurence> allLettersPercentage = CountLetterPercentage(cipherText);
        List<LetterOccurence> firstLetterPercentage = CountFirstLetterPercentage(cipherText);
        for (int i = 0; i < firstLetterPercentage.size(); i++) {
            //System.out.println(allLettersPercentage.get(i).GetLetter() + " - " + allLettersPercentage.get(i).GetOccurence());
            System.out.println(firstLetterPercentage.get(i).GetLetter() + " - " + firstLetterPercentage.get(i).GetOccurence());
        }
        System.out.println("~~~New Message~~~");
        System.out.println(CompareAllLetters(allLettersPercentage, cipherText));
    }
    //Counts the occurrences of each letter in the text, creates an object holding the letter and the
    //percentage of all letters it accounts for, then sorts them from highest to lowest and returns a list
    public List<LetterOccurence> CountLetterPercentage(String text){
        String indexes = "abcdefghijklmnopqrstuvwxyz"; //letters we are counting
        int[] count = new int[indexes.length()]; //one counter per letter
        double totalLetters = 0; //letters counted; spaces and punctuation are excluded
        List<LetterOccurence> letterOccurences = new ArrayList<LetterOccurence>(indexes.length());
        //iterate through the text and count each occurrence
        for(int i = 0; i < text.length(); i++){
            int index = indexes.indexOf(text.charAt(i));
            if (index < 0)
                continue; //not a letter
            count[index]++;
            totalLetters++;
        }
        //calculate letter percentages
        for(int i = 0; i < count.length; i++){
            if(count[i] < 1){
                continue; //letter never occurs, so leave it out of the list
            }
            double letterPercentage = count[i] / totalLetters * 100;
            //store the letter together with its percentage
            letterOccurences.add(new LetterOccurence(indexes.charAt(i), letterPercentage));
        }
        //sort from highest to lowest (see LetterOccurence.compareTo)
        Collections.sort(letterOccurences);
        return letterOccurences;
    }
    //Same as CountLetterPercentage, but only counts the first letter of each word
    public List<LetterOccurence> CountFirstLetterPercentage(String text){
        String indexes = "abcdefghijklmnopqrstuvwxyz"; //letters we are counting
        int[] count = new int[indexes.length()]; //one counter per letter
        double totalLetters = 0; //first letters counted
        List<LetterOccurence> letterOccurences = new ArrayList<LetterOccurence>(indexes.length());
        //collect the first character of each word; splitting on runs of whitespace and skipping
        //empty tokens avoids the StringIndexOutOfBoundsException that substring(0,1) throws on ""
        StringBuilder firstLetters = new StringBuilder();
        for(String value : text.split("\\s+")){
            if(value.isEmpty())
                continue;
            firstLetters.append(value.charAt(0));
        }
        System.out.println(firstLetters);
        //iterate through the first letters and count each occurrence
        for(int i = 0; i < firstLetters.length(); i++){
            int index = indexes.indexOf(firstLetters.charAt(i));
            if (index < 0)
                continue; //word starts with a non-letter
            count[index]++;
            totalLetters++;
        }
        //calculate letter percentages
        for(int i = 0; i < count.length; i++){
            if(count[i] < 1){
                continue;
            }
            double letterPercentage = count[i] / totalLetters * 100;
            //store the letter together with its percentage
            letterOccurences.add(new LetterOccurence(indexes.charAt(i), letterPercentage));
        }
        //sort from highest to lowest
        Collections.sort(letterOccurences);
        return letterOccurences;
    }
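    public String CompareAllLetters(List<LetterOccurence> codeLetters, String code){
        //English letters ordered from most to least frequent
        char[] letterFrequency = {'e','t','a','o','i','n','s','h','r','d','l','c','u','m','w','f','g','y','p','b','v','k','j','x','q','z'};
        //build the full table first; chained replace() calls would re-replace already decoded letters
        char[] substitution = new char[26];
        for(int i = 0; i < codeLetters.size() && i < letterFrequency.length; i++){
            substitution[codeLetters.get(i).GetLetter() - 'a'] = letterFrequency[i];
        }
        //translate every character exactly once; unmapped characters are copied unchanged
        StringBuilder decoded = new StringBuilder(code.length());
        for(int i = 0; i < code.length(); i++){
            char c = code.charAt(i);
            decoded.append(c >= 'a' && c <= 'z' && substitution[c - 'a'] != 0 ? substitution[c - 'a'] : c);
        }
        return decoded.toString();
    }
}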
public class LetterOccurence implements Comparable<LetterOccurence>{
    private char letter;
    private double occurence;
    public LetterOccurence(char letter, double occurence){
        this.letter = letter;
        this.occurence = occurence;
    }
    public double GetOccurence(){
        return occurence;
    }
    public char GetLetter(){
        return letter;
    }
    //orders by descending occurrence, so Collections.sort puts the most frequent letter first
    public int compareTo(LetterOccurence o){
        return Double.compare(o.occurence, this.occurence);
    }
}
【Comments】:
-
Wouldn't spaces occur more often than e and a?
-
Yes, spaces do occur more often than e and a, but the ciphertext already contains the spaces.
Tags: java string replace split substring