【Question title】: Dissecting mergesort routine
【Posted】: 2012-10-04 18:43:26
【Question】:

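Disclaimer: this question asks for an explanation of code and an algorithm. It is not meant to fix or optimise anything, only to aid understanding.

My understanding of sorting routines is not great. I asked for help here with converting existing merge sort code from an integer type to a string type (see "delphi mergesort for string arrays"). After receiving the answer, I set out to understand the sorting routine.

Some resources that helped with understanding: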

  1. http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/merge/mergen.htm
  2. http://www.youtube.com/watch?v=9Qk1t66g7IU

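I tried to dissect the code in order to follow it. This question is not an attempt to verify my own understanding of merge sort; the aim is to present the sorting routine clearly. Its value is for people who are trying to understand merge sort better. That matters because once you understand one prototype well, the other variants are easier to follow.

My question is why we add 1 when setting the length and when setting Result: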

SetLength(AVals, Length(Vals) div 2 + 1);
Result := 1 + PerformMergeSort(0, High(Vals));

And why do we subtract 1 here? Edit: I think k would be out of range if we did not subtract 1?

Result := k - 1;

Here is the code for this question. Incidentally, this is an optimised merge sort, because it copies only half of the array:

function MergeSortRemoveDuplicates(var Vals: array of Integer):Integer;
var
  AVals: array of Integer;

   //returns index of the last valid element
  function Merge(I0, I1, J0, J1: Integer):Integer;
  var
    i, j, k, LC:Integer;
  begin
    LC := I1 - I0;
    for i := 0 to LC do
      AVals[i]:=Vals[i + I0];
      //copy lower half of Vals into temporary array AVals

    k := I0;
    i := 0;
    j := J0;
    while ((i <= LC) and (j <= J1)) do
    if (AVals[i] < Vals[j]) then begin
      Vals[k] := AVals[i];
      inc(i);
      inc(k);
    end else  if (AVals[i] > Vals[j]) then begin
      Vals[k]:=Vals[j];
      inc(k);
      inc(j);
    end else begin //duplicate
      Vals[k] := AVals[i];
      inc(i);
      inc(j);
      inc(k);
    end;

    //copy the rest
    while i <= LC do begin
      Vals[k] := AVals[i];
      inc(i);
      inc(k);
    end;

    if k <> j then
      while j <= J1 do begin
        Vals[k]:=Vals[j];
        inc(k);
        inc(j);
      end;

    Result := k - 1;
  end;

 //returns index of the last valid element

  function PerformMergeSort(ALo, AHi:Integer): Integer; //returns
  var
    AMid, I1, J1:Integer;
  begin

  //It would be wise to use Insertion Sort when (AHi - ALo) is small (about 32-100)
    if (ALo < AHi) then
    begin
      AMid:=(ALo + AHi) shr 1;
      I1 := PerformMergeSort(ALo, AMid);
      J1 := PerformMergeSort(AMid + 1, AHi);
      Result := Merge(ALo, I1, AMid + 1, J1);
    end else
      Result := ALo;
  end;

begin
  SetLength(AVals, Length(Vals) div 2 + 1);
  Result := 1 + PerformMergeSort(0, High(Vals));
end;

Here is my understanding, with only minor modifications:

function MergeSortRemoveDuplicates(var Vals: array of Integer):Integer;
var
  AVals: array of Integer;

   //returns index of the last valid element
  function Merge(I0, I1, J0, J1: Integer):Integer;
  var
    i, j, k, LC:Integer;
  begin
    // difference between mid-point on leftside
    // between low(Original_array) and midpoint(true Original_array midpoint)
    // subtracting I0 which is Low(Original_array)
    // or here equals zero(0)
    // so LC is quarter point in Original_array??
    LC := I1 - I0;

    // here we walk from begining of array
    // and copy the elements between zero and LC
    // this is funny call that Vals[i + I0] like 0 + 0
    // then 1 + 0 and so on. I guess this guarantees if we are
    // starting from non-zero based array??
    for i := 0 to LC do
      AVals[i]:=Vals[i + I0];


    // k equal low(Original_array)
    k := I0;

    // I will be our zero based counter element
    i := 0;

    // J will be (midpoint + 1) or
    // begining element of right side of array
    j := J0;

    // while we look at Copy_array elements
    // between first element (low(Copy_array)
    // and original_array from midpoint + 1 to high(Original_array)
    // we start to sort it
    while ((i <= LC) and (j <= J1)) do

    // if the value at Copy_array is smaller than the Original_array
    // we move it to begining of Original_array
    // remember position K is first element
    if (AVals[i] < Vals[j]) then begin
      Vals[k] := AVals[i];

      // move to next element in Copy_array
      inc(i);

      // move to next element in Original_array
      inc(k);

    // if the value at copy_array is larger
    // then we move smaller value from J Original_array (J is midpoint+1)
    // to position K original_array (K now is the lower part of ) Original_array)
    end else  if (AVals[i] > Vals[j]) then begin
      Vals[k]:=Vals[j];

      //move K to the next element in Original_array
      inc(k);

      // move j to next element in Original_array
      inc(j);

    // if the value in Original_array is equal to the element in Copy_array
    // do nothing and count everything up
    // so we end up with one copy from duplicate and disregard the rest
    end else begin //duplicate
      Vals[k] := AVals[i];
      inc(i);
      inc(j);
      inc(k);
    end;

    //copy the rest
    while i <= LC do begin
      Vals[k] := AVals[i];
      inc(i);
      inc(k);
    end;

    // if the counters do not endup at the same element
    // this means we have some that maybe leftover on
    // the right side of the Original_array.
    // This explains why K does not equal J : there are still elements left over
    // then copy them to Original_array
    // starting at position K.
    if k <> j then
      while j <= J1 do begin
        Vals[k]:=Vals[j];
        inc(k);
        inc(j);
      end;

    // why K - 1?
    // function needs result so return will be null if called
    // I don't understand this part
    Result := k - 1;
  end;

 //returns index of the last valid element

  function PerformMergeSort(ALo, AHi:Integer): Integer; //returns
  var
    AMid, I1, J1:Integer;
  begin

  //It would be wise to use Insertion Sort when (AHi - ALo) is small (about 32-100)
    if (ALo < AHi) then
    begin
      AMid:=(ALo + AHi) shr 1;    // midpoint
      I1 := PerformMergeSort(ALo, AMid);  //recursive call I1 is a data point on the left
      J1 := PerformMergeSort(AMid + 1, AHi);  // recursive call J1 is a data point on the right
      Result := Merge(ALo, I1, AMid + 1, J1);
    end else
      Result := ALo;
  end;

begin
  // test if array is even then we can split nicely down middle
  if Length(Vals) mod 2 = 0 then
  begin
    SetLength(AVals, Length(Vals) shr 1);
    Result := PerformMergeSort(0, High(Vals));
  end
  else
  //array is odd let us add 1 to it and make it even
  // shr 1 is essentially dividing by 2 but doing it on the bit level
  begin
    SetLength(AVals, (Length(Vals) + 1) shr 1);
    Result := PerformMergeSort(0, High(Vals));
  end;
end;

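【Comments】:

  • You might like this dance demonstration of merge sort :)
  • Nice, I like that they use 10, because it ends up as 2 + 3 on the second split. :)
  • @LURD, nice merge sort example, but I can't imagine that speed in a production environment :-)
  • This video might be helpful, but start at 3:10, since it covers more than one sorting algorithm.

Tags: algorithm delphi sorting computer-science mergesort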


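【Solution 1】:

This is my modification of the author's code, intended to remove duplicates during sorting. Some explanations:

Outer function:

  1. We have to provide a buffer (AVals) that can hold half of the initial array. Length(Vals) div 2 + 1 gives enough room for both odd-sized and even-sized arrays without unnecessary complexity. A tighter value that works in all cases: (Length(Vals) + 1) div 2
  2. The inner procedure PerformMergeSort returns the index of the last valid element, but the outer procedure returns the count of valid elements (as noted in the referenced thread), so I use (1 + PerformMergeSort()).

    Reason: internally we have to work with indices, but the end user of this procedure needs to know the new array length.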
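
The buffer-size claim in point 1 can be checked with a small sketch. It assumes the same split as PerformMergeSort, AMid := (ALo + AHi) shr 1; the program name and the variables N, AMid and LeftSize are made up for this illustration and are not part of the original code:

```pascal
program BufferSizeCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  N, AMid, LeftSize: Integer;
begin
  // For each array length N, compare the size of the left half
  // (the part Merge copies into AVals) with both buffer formulas.
  for N := 1 to 6 do
  begin
    AMid := (0 + (N - 1)) shr 1;   // same midpoint rule as PerformMergeSort(0, High(Vals))
    LeftSize := AMid - 0 + 1;      // number of elements in 0..AMid
    Writeln('N=', N,
            '  left half=', LeftSize,
            '  N div 2 + 1=', N div 2 + 1,
            '  (N + 1) div 2=', (N + 1) div 2);
  end;
end.
```

For every N the left half has exactly (N + 1) div 2 elements, and N div 2 + 1 is never smaller than that, so both formulas are safe; the second simply avoids one spare slot when N is even. Removing duplicates can only shrink the left chunk, so the top-level size is the worst case.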

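Inner function PerformMergeSort:

It takes the start and end indices of an array chunk, sorts that chunk, and returns the index of the last valid element. After the recursive calls we have the following situation. Invariant: both chunks are sorted, neither contains duplicates, and the left segment has non-zero length.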

*****ACDEFG****BCDEGHILM******
     ^    ^    ^       ^
     |    |    |       |
     Alo  I1   AMid+1  J1 
     I0   I1   J0      J1  //as named in Merge 
     \____/
       LC+1 elements 

After the merge:

*****ABCDEFGHILM**************
     ^         ^^
     |         ||__k
     |         | 
     Alo       Result       

Inner function Merge:

Take the example above, pen and paper, and step through the merge to see how it works.
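
For readers who would rather run the example than trace it by hand, here is a minimal sketch that applies the same Merge logic to the character layout from the diagram. It is an illustration under assumptions: the arrays hold Char instead of Integer, Merge is a top-level function working on global arrays, and the names MergeTrace, Src, n and Last are invented for the demo.

```pascal
program MergeTrace;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  Src = '*****ACDEFG****BCDEGHILM******';  // layout from the diagram

var
  Vals, AVals: array of Char;
  n, Last: Integer;

// Same logic as the Merge routine in the question, on Char data.
function Merge(I0, I1, J0, J1: Integer): Integer;
var
  i, j, k, LC: Integer;
begin
  LC := I1 - I0;
  for i := 0 to LC do
    AVals[i] := Vals[i + I0];          // copy left chunk into the buffer

  k := I0;
  i := 0;
  j := J0;
  while (i <= LC) and (j <= J1) do
    if AVals[i] < Vals[j] then
    begin
      Vals[k] := AVals[i]; Inc(i); Inc(k);
    end
    else if AVals[i] > Vals[j] then
    begin
      Vals[k] := Vals[j]; Inc(j); Inc(k);
    end
    else
    begin                              // duplicate: keep one, skip both
      Vals[k] := AVals[i]; Inc(i); Inc(j); Inc(k);
    end;

  while i <= LC do                     // rest of the left chunk
  begin
    Vals[k] := AVals[i]; Inc(i); Inc(k);
  end;

  if k <> j then
    while j <= J1 do                   // rest of the right chunk
    begin
      Vals[k] := Vals[j]; Inc(k); Inc(j);
    end;

  Result := k - 1;                     // k is one past the last written slot
end;

begin
  SetLength(Vals, Length(Src));
  for n := 1 to Length(Src) do
    Vals[n - 1] := Src[n];             // strings are 1-based, the array is 0-based
  SetLength(AVals, Length(Src) div 2 + 1);

  Last := Merge(5, 10, 15, 23);        // I0, I1, J0, J1 from the diagram
  Writeln('Last valid index: ', Last);
  for n := 5 to Last do
    Write(Vals[n]);
  Writeln;
end.
```

Traced by hand, this should report 15 as the last valid index and print ABCDEFGHILM, matching the "after the merge" diagram. At the end of the routine k points one past the last element written, which is why the function returns k - 1. Elements beyond the returned index still hold stale values from the right chunk; the diagram shows them as asterisks because they are no longer part of the valid range.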

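About the copy loop: we copy (LC+1) elements into the temporary buffer AVals, writing to the starting segment of AVals (which always begins at 0) and reading from the relevant segment of the main array (which begins at I0, usually not zero).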
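
To see that I0 is usually not zero, the recursion can be replayed without any sorting. The sketch below only prints the arguments each Merge call would start from; ShowSplits and OffsetDemo are hypothetical names for this demo, and it ignores duplicate removal, so the real I1 and J1 may be smaller than the bounds shown.

```pascal
program OffsetDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Mirrors the recursion in PerformMergeSort, but only reports
// where each merge would begin instead of sorting anything.
procedure ShowSplits(ALo, AHi: Integer);
var
  AMid: Integer;
begin
  if ALo < AHi then
  begin
    AMid := (ALo + AHi) shr 1;
    ShowSplits(ALo, AMid);
    ShowSplits(AMid + 1, AHi);
    Writeln('Merge: I0=', ALo, '  J0=', AMid + 1, '  upper bound=', AHi);
  end;
end;

begin
  ShowSplits(0, 7);   // an 8-element array, indices 0..7
end.
```

Only the merges whose left chunk starts at the beginning of the array have I0 = 0. For an 8-element array the other merges start at I0 = 2, 4 and 6, and in those calls Vals[i + I0] reads from the middle of the array while AVals[i] still starts at index 0.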

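【Comments】:

  • Thanks for the explanation. What is the reason for this: for i := 0 to LC do AVals[i]:=Vals[i + I0]; I mean the I0 part? Doesn't it always add zero on a zero-based array?
  • I have tried to explain it; see the addition to the answer.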