【Posted】: 2011-06-13 12:27:45
【Question】:
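Hello!
I wrote a recursive diff algorithm from scratch. It finds the "best match" between two strings so that the differences are minimized, then prints both strings with every difference shown in CAPS. It works fine "as is", except that it is very inefficient. I have been staring at it for a day and a half, trying to find a way to make it iterative, or at least to reduce the stack depth it reaches, but I am out of ideas and hope a sharper mind here will see the solution more clearly than I do.
Below is the core of the code. The MergePoint class it references is just a simple "linked list" style node that holds an "original index" integer, an "altered index" integer, and a "next" MergePoint. The MergePoint list represents a series of indices in each array that have been "merged". When a chain is complete, any index not represented in the chain is an insertion/deletion. The NullObject class is an extension of MergePoint; in hindsight it was not strictly necessary to create it, and it can basically be treated as a regular "null".
Any comments/suggestions would be greatly appreciated.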
public class StringCompare
{
    // Best alignment found so far: each row is {index in original, index in altered}.
    public static int[][] mergeList = new int[0][0];
    public static MergePoint NULL = NullObject.getNull();
    // Length of the longest merge chain found so far.
    public static int maxMerged = 0;
    // Cluster count of that chain; breaks ties between chains of equal length.
    public static int minClusterSize = -1;

    public static void diff(String orig, String alt)
    {
        // Reset the shared state so repeated calls do not reuse an earlier result.
        mergeList = new int[0][0];
        maxMerged = 0;
        minClusterSize = -1;

        String[] original = orig.toUpperCase().split(" ");
        String[] altered = alt.toUpperCase().split(" ");

        // Try every starting position in the altered text.
        for (int i = 0; i < altered.length; i++)
        {
            merge(original, altered, 0, i, NULL, NULL, 0, 0);
        }

        // Merged (matching) words go to lower case; differences stay in CAPS.
        for (int i = 0; i < mergeList.length; i++)
        {
            original[mergeList[i][0]] = original[mergeList[i][0]].toLowerCase();
            altered[mergeList[i][1]] = altered[mergeList[i][1]].toLowerCase();
        }

        printStringArray(original);
        printStringArray(altered);
    }

    // Must be static because it is called from the static diff method.
    private static void printStringArray(String[] arr)
    {
        for (String word : arr)
        {
            System.out.print(word.trim() + " ");
        }
        System.out.println();
    }

    private static void merge(String[] original, String[] altered, int indexInOriginal, int indexInAltered, MergePoint head, MergePoint tail, int listSize, int clusters)
    {
        if (indexInOriginal >= original.length)
        {
            // The original is exhausted: keep this chain if it is longer than
            // the best so far, or equally long with fewer clusters.
            if (listSize > 0)
            {
                if (((listSize == maxMerged) && (clusters < minClusterSize)) ||
                    (listSize > maxMerged))
                {
                    storeMergePoints(head, listSize, clusters);
                }
            }
        }
        else if (indexInAltered >= altered.length)
        {
            // The altered text is exhausted: skip the current original word and
            // resume scanning just after the last merged position.
            if (tail != NULL)
            {
                merge(original, altered, (indexInOriginal + 1), (tail.indexInNew() + 1), head, tail, listSize, clusters);
            }
            else
            {
                merge(original, altered, (indexInOriginal + 1), 0, head, tail, listSize, 0);
            }
        }
        else
        {
            if (original[indexInOriginal].equals(altered[indexInAltered]))
            {
                MergePoint mergePoint = new MergePoint(indexInOriginal, indexInAltered);
                MergePoint bookMark = NULL;
                int newClusters = clusters;

                // A match that is not adjacent to the previous one starts a new cluster.
                // When tail is NULL, this relies on the values NullObject returns.
                if (indexInOriginal != (tail.indexInOriginal() + 1))
                {
                    newClusters++;
                }
                if (indexInAltered != (tail.indexInNew() + 1))
                {
                    newClusters++;
                }

                if (head == NULL)
                {
                    head = mergePoint;
                    tail = head;
                }
                else
                {
                    tail.setNext(mergePoint);
                    bookMark = tail;
                    tail = tail.next();
                }

                // Branch 1: accept the match and advance in both arrays.
                merge(original, altered, (indexInOriginal + 1), (indexInAltered + 1), head, tail, (listSize + 1), newClusters);

                // Branch 2: reject the match and keep scanning the altered text.
                if (bookMark == NULL)
                {
                    merge(original, altered, indexInOriginal, (indexInAltered + 1), NULL, NULL, 0, 0);
                }
                else
                {
                    // Detach the rejected node before continuing from the old tail.
                    bookMark.setNext(NULL);
                    merge(original, altered, indexInOriginal, (indexInAltered + 1), head, bookMark, listSize, newClusters);
                }
            }
            else
            {
                merge(original, altered, indexInOriginal, (indexInAltered + 1), head, tail, listSize, clusters);
            }
        }
    }

    // Copies the chain starting at current into mergeList and records its score.
    public static void storeMergePoints(MergePoint current, int size, int clusters)
    {
        mergeList = new int[size][2];
        maxMerged = size;
        minClusterSize = clusters;
        for (int i = 0; i < size; i++)
        {
            mergeList[i][0] = current.indexInOriginal();
            mergeList[i][1] = current.indexInNew();
            current = current.next();
        }
    }
}
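The MergePoint and NullObject classes are described in the question but not shown. The following is only a guess at what they might look like, so the code above can be compiled and experimented with. The method names (`indexInOriginal`, `indexInNew`, `next`, `setNext`, `getNull`) come from the calls in the question. Everything else is an assumption, in particular the `-1` indices reported by the null object, which make a match at index 0 count as adjacent to "nothing".

```java
// Hypothetical reconstruction: only the method names are taken from the question.
class MergePoint {
    private final int indexInOriginal;
    private final int indexInNew;
    private MergePoint next; // null internally means "no successor set yet"

    MergePoint(int indexInOriginal, int indexInNew) {
        this.indexInOriginal = indexInOriginal;
        this.indexInNew = indexInNew;
    }

    int indexInOriginal() { return indexInOriginal; }

    int indexInNew() { return indexInNew; }

    // Never returns a Java null; an unset successor is reported as the null object.
    MergePoint next() { return next == null ? NullObject.getNull() : next; }

    void setNext(MergePoint next) { this.next = next; }
}

// Singleton sentinel, so that comparisons such as (tail != NULL) work by identity.
final class NullObject extends MergePoint {
    private static final NullObject INSTANCE = new NullObject();

    private NullObject() {
        super(-1, -1); // assumed sentinel indices, not confirmed by the question
    }

    static MergePoint getNull() { return INSTANCE; }

    @Override
    MergePoint next() { return this; }

    @Override
    void setNext(MergePoint next) { /* the sentinel never gets a successor */ }
}
```

With these definitions, `tail.indexInOriginal() + 1` evaluates to 0 when `tail` is the null object, which is one plausible reading of how the cluster counting in `merge` was meant to behave for the first match.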
【Comments】:
-
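I don't believe that simply turning this into an iterative solution will solve your performance problem. It may be worth looking at this existing algorithm, which is known to perform well: en.wikipedia.org/wiki/Longest_common_subsequence_problem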
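As a rough illustration of the longest-common-subsequence approach suggested here, the sketch below fills a dynamic-programming table iteratively, so there is no recursion and the work is proportional to the product of the two word counts. The class name `LcsDiff` and its methods are made up for this example. It does not reproduce the cluster-count tie-breaking from the question: when several longest matches exist, it simply returns one of them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the question's code.
class LcsDiff {
    // Returns {indexInOriginal, indexInAltered} pairs for one longest common subsequence.
    static List<int[]> lcsPairs(String[] original, String[] altered) {
        int n = original.length;
        int m = altered.length;

        // len[i][j] = LCS length of original[i..] and altered[j..]
        int[][] len = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (original[i].equals(altered[j])) {
                    len[i][j] = len[i + 1][j + 1] + 1;
                } else {
                    len[i][j] = Math.max(len[i + 1][j], len[i][j + 1]);
                }
            }
        }

        // Walk the table from the start to recover the matched index pairs.
        List<int[]> pairs = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (original[i].equals(altered[j])) {
                pairs.add(new int[] {i, j});
                i++;
                j++;
            } else if (len[i + 1][j] >= len[i][j + 1]) {
                i++; // dropping the original word loses nothing
            } else {
                j++; // dropping the altered word loses nothing
            }
        }
        return pairs;
    }

    // Same output convention as the question: matches in lower case, differences in CAPS.
    static String[] diff(String orig, String alt) {
        String[] original = orig.toUpperCase().split(" ");
        String[] altered = alt.toUpperCase().split(" ");
        for (int[] pair : lcsPairs(original, altered)) {
            original[pair[0]] = original[pair[0]].toLowerCase();
            altered[pair[1]] = altered[pair[1]].toLowerCase();
        }
        return new String[] {String.join(" ", original), String.join(" ", altered)};
    }
}
```

For example, `LcsDiff.diff("the quick brown fox", "the slow brown dog")` returns `"the QUICK brown FOX"` and `"the SLOW brown DOG"`. Preferring fewer clusters among equally long matches would need an extra score in the table, which this sketch leaves out.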
-
It took me a while to read through it carefully, but this looks promising. Thanks!
Tags: java algorithm recursion diff iteration