Finding the index of a given permutation: answers

【Question title】: Finding the index of a given permutation
【Posted】: 2012-12-10 09:43:30
【Question description】:

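I'm reading the numbers 0, 1, ..., (N - 1) one by one in some order. My goal is to find the lexicographic index of this given permutation using only O(1) space.

This question has been asked before, but all the algorithms I could find use O(N) space. I'm starting to think it isn't possible. It would really help me reduce the number of allocations, though.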

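【Question comments】:

What do you mean by "lexicography index of this given permutation"?
Which O(n) algorithms do you know of? Are you sure they are unsuitable, or can't easily be modified to fit?
The index of a permutation p is defined as the sum of all subscripts j such that p_j > p_(j+1), for 1
@Mugen I've added an explanation of the steps, along with pseudocode to help with the process.
@j_random_hacker, I looked into the OP's usage, and apparently it is an accepted usage. I actually found an implementation outside stackoverflow: geekviewpoint.com/java/numbers/permutation_index.

Tags: algorithm permutation space combinatorics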

【Solution 1】:

Consider the following data:

chars = [a, b, c, d]
perm = [c, d, a, b]
ids = get_indexes(perm, chars) = [2, 3, 0, 1]

A possible solution for permutations with repetition is as follows:

len = length(perm)         (len = 4)
num_chars = length(chars)  (num_chars = 4)

base = num_chars ^ len     (base = 4 ^ 4 = 256)
base = base / num_chars    (base = 256 / 4 = 64)

id = base * ids[0]         (id = 64 * 2 = 128)
base = base / num_chars    (base = 64 / 4 = 16)

id = id + (base * ids[1])  (id = 128 + (16 * 3) = 176)
base = base / num_chars    (base = 16 / 4 = 4)

id = id + (base * ids[2])  (id = 176 + (4 * 0) = 176)
base = base / num_chars    (base = 4 / 4 = 1)

id = id + (base * ids[3])  (id = 176 + (1 * 1) = 177)

The reverse process:

id = 177
(id / (4 ^ 3)) % 4 = (177 / 64) % 4 =   2 % 4 = 2 -> chars[2] -> c
(id / (4 ^ 2)) % 4 = (177 / 16) % 4 =  11 % 4 = 3 -> chars[3] -> d
(id / (4 ^ 1)) % 4 = (177 / 4)  % 4 =  44 % 4 = 0 -> chars[0] -> a
(id / (4 ^ 0)) % 4 = (177 / 1)  % 4 = 177 % 4 = 1 -> chars[1] -> b

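The number of possible permutations is given by num_chars ^ num_perm_digits, where num_chars is the number of possible characters and num_perm_digits is the number of digits in the permutation.

This takes O(1) space, treating the initial list as a constant cost, and O(N) time, where N is the number of digits in your permutation.

Based on the steps above, you can do: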

function identify_permutation(perm, chars) {

    len = length(perm);
    num_chars = length(chars);

    index = 0;
    // Weight of the leftmost digit: num_chars ^ (len - 1).
    base = num_chars ^ (len - 1);
    for (i = 0; i < len; i++) {
        // Look up each digit on the fly, so no ids[] array is needed.
        index += base * get_index(perm[i], chars);
        // Move to the next, less significant digit.
        base = base / num_chars;
    }

    return index;
}

This is pseudocode, but it is easy to convert to any language (:
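As one possible translation of the steps above, here is a minimal Python sketch. It is not the answer author's code: it accumulates the base-num_chars value with Horner's rule instead of dividing a precomputed base, and the inverse function name `permutation_from_id` is a hypothetical one chosen for illustration. Like the pseudocode, it handles sequences with repetition, not permutations without repetition.

```python
def identify_permutation(perm, chars):
    """Read perm as a number written in base len(chars)."""
    num_chars = len(chars)
    index = 0
    for symbol in perm:
        # Horner's rule: shift the digits seen so far, then add the new digit.
        index = index * num_chars + chars.index(symbol)
    return index


def permutation_from_id(index, length, chars):
    """Inverse mapping (hypothetical helper): recover the symbols from the id."""
    num_chars = len(chars)
    symbols = []
    for _ in range(length):
        # Peel off the least significant digit first.
        index, digit = divmod(index, num_chars)
        symbols.append(chars[digit])
    symbols.reverse()
    return symbols


chars = ["a", "b", "c", "d"]
print(identify_permutation(["c", "d", "a", "b"], chars))
print(permutation_from_id(177, 4, chars))
```

With the data from the worked example, the first call should give 177 and the second should give back c, d, a, b.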

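【Comments】:

I think he means without using extra space.
@Mugen Please reconsider the solution I proposed; it works for permutations with repetition, so permutations such as aaaa, aaab, aaac, ... are all covered. I'll try to work out a variant for permutations without repetition.
What about this question? Can you convert the answer into pseudocode, or any kind of code? math.stackexchange.com/questions/4346937/…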

【Solution 2】:

There are N! permutations. To represent the index you need at least N bits.

【Comments】:

No. You need at least ceil(log2(N!)) bits.
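To make the bound in the comment concrete, here is a small Python sketch. The function name `bits_for_permutation_index` is an assumption made for illustration; it relies on the fact that the largest index is N! - 1 and that Python's `int.bit_length()` returns the number of bits needed to write that value.

```python
import math


def bits_for_permutation_index(n):
    """Bits needed to store any index in the range 0 .. n! - 1."""
    # For m >= 1, (m - 1).bit_length() equals ceil(log2(m)).
    return (math.factorial(n) - 1).bit_length()


for n in (2, 4, 16):
    print(n, bits_for_permutation_index(n))
```

For N = 2 a single bit is enough, so N bits is not a lower bound for small N; for N = 4 the count is 5 bits and for N = 16 it is 45 bits, so the requirement grows faster than N.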

【Solution 3】:

If you are willing to assume that arithmetic operations take constant time, here is one way:

def permutationIndex(numbers):
  n=len(numbers)
  result=0
  j=0
  while j<n:
    # Determine factor, which is the number of possible permutations of
    # the remaining digits.
    i=1
    factor=1
    while i<n-j:
      factor*=i
      i+=1
    i=0
    # Determine index, which is how many previous digits there were at
    # the current position.
    index=numbers[j]
    while i<j:
      # Only the digits that weren't used so far are valid choices, so
      # the index gets reduced if the number at the current position
      # is greater than one of the previous digits.
      if numbers[i]<numbers[j]:
        index-=1
      i+=1
    # Update the result.
    result+=index*factor
    j+=1
  return result

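I deliberately wrote out some computations that could be done more simply with Python built-ins, because I wanted to make it more obvious that no extra non-constant space is being used.

As maxim1000 pointed out, the number of bits needed to represent the result grows quickly as n increases, so eventually big integers are needed, and those no longer have constant-time arithmetic. Still, I think this code addresses the spirit of your question.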
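One way to gain confidence in this kind of ranking is to compare it against a brute-force enumeration. The sketch below is a compact restatement rather than the answer's exact code: it accumulates the factorial-base digits with a Horner-style update instead of recomputing each factorial, and it assumes the whole permutation is available as an indexable sequence. It is checked against `itertools.permutations`, which yields permutations in lexicographic order when its input is sorted.

```python
from itertools import permutations


def permutation_index(numbers):
    """Lexicographic index of a permutation of 0 .. n-1."""
    n = len(numbers)
    result = 0
    for j in range(n):
        # Count the values smaller than numbers[j] that are still unused,
        # i.e. that did not appear earlier in the permutation.
        earlier_smaller = sum(1 for i in range(j) if numbers[i] < numbers[j])
        digit = numbers[j] - earlier_smaller
        # Horner-style update: the digit at position j ends up weighted
        # by (n - 1 - j)! once the loop has finished.
        result = result * (n - j) + digit
    return result


# Brute-force cross-check for a small n.
for expected, perm in enumerate(permutations(range(5))):
    assert permutation_index(perm) == expected
```

Apart from the input itself, this keeps only a few integer variables, at the cost of O(N^2) time for the inner count.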

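【Comments】:

【Solution 4】:

If you are looking for a way to get the lexicographic index, or rank, of a unique combination rather than a permutation, then your problem falls under the binomial coefficient. The binomial coefficient handles the problem of choosing unique combinations in groups of K from a total of N items.

I have written a class in C# that handles common functions for working with the binomial coefficient. It performs the following tasks:

Outputs all the K-indexes to a file in a nice format for any N choose K. The K-indexes can be substituted with more descriptive strings or letters.
Converts the K-indexes to the proper lexicographic index, or rank, of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's triangle and is very efficient compared to iterating over the set.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes. I believe it is also faster than older iterative solutions.
Uses the Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that, when true, creates a generic list to hold the objects to be managed. If this value is false, the table is not created. The table does not need to be created in order to use the 4 methods above. Accessor methods are provided to access the table.
There is an associated test class that shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.

To read about this class and download the code, see Tablizing The Binomial Coeffieicent.

The following test code iterates through each unique combination: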

public void Test10Choose5()
{
   String S;
   int Loop;
   int N = 10;  // Total number of elements in the set.
   int K = 5;  // Total number of elements in each group.
   // Create the bin coeff object required to get all
   // the combos for this N choose K combination.
   BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
   int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
   // The KIndexes array specifies the indexes for a lexicographic element.
   int[] KIndexes = new int[K];
   StringBuilder SB = new StringBuilder();
   // Loop thru all the combinations for this N choose K case.
   for (int Combo = 0; Combo < NumCombos; Combo++)
   {
      // Get the k-indexes for this combination.  
      BC.GetKIndexes(Combo, KIndexes);
      // Verify that the KIndexes returned can be used to retrieve the
      // rank or lexicographic order of the KIndexes in the table.
      int Val = BC.GetIndex(true, KIndexes);
      if (Val != Combo)
      {
         S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
         Console.WriteLine(S);
      }
      SB.Remove(0, SB.Length);
      for (Loop = 0; Loop < K; Loop++)
      {
         SB.Append(KIndexes[Loop].ToString());
         if (Loop < K - 1)
            SB.Append(" ");
      }
      S = "KIndexes = " + SB.ToString();
      Console.WriteLine(S);
   }
}

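You should be able to port this class fairly easily to the language of your choice. You probably won't have to port the generic part of the class to accomplish your goals. Depending on the number of combinations you are working with, you might need a word size larger than 4-byte ints.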
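For readers who only need the ranking step, here is a rough Python sketch of ranking a combination with binomial coefficients and no iteration over the whole table. It is not a port of the C# class, and its ordering may differ from the one that class uses: it assumes the combination is given as strictly increasing values drawn from 0 .. n-1 and ranks it in plain lexicographic order. The name `combination_rank` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from itertools import combinations
from math import comb


def combination_rank(combo, n):
    """Lexicographic rank of an increasing k-combination of 0 .. n-1."""
    k = len(combo)
    # Start from the last rank and subtract the number of combinations
    # that come after this one in lexicographic order.
    rank = comb(n, k) - 1
    for i, value in enumerate(combo):
        # math.comb returns 0 when the lower argument exceeds the upper one.
        rank -= comb(n - 1 - value, k - i)
    return rank


# Cross-check against brute-force enumeration for 6 choose 3.
for expected, combo in enumerate(combinations(range(6), 3)):
    assert combination_rank(combo, 6) == expected
```

Python integers are arbitrary precision, so the word-size concern mentioned above does not arise in this sketch.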

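【Comments】:

Can your approach be used for this problem? Thank you in advance for your help. math.stackexchange.com/questions/4346937/…
@Eftekhari It looks like that question is about permutations, not combinations. Combinations are based on the binomial theorem. The difference is that combinations do not allow repetition, whereas repetition is fine for permutations. For example, a combination might be (2, 1, 0), while a permutation might be (1, 1, 1). So my class is not suited to permutations. However, I don't see why a simple formula couldn't be used to calculate the rank of any given permutation when the number of dimensions and their respective lengths are known.
Can you come up with an answer to that question? Using any programming language.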

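【Solution 5】:

There is a Java solution to this problem on geekviewpoint. It gives a good explanation of why it is correct, and the code is easy to follow. http://www.geekviewpoint.com/java/numbers/permutation_index. It also has a unit test that runs the code with different inputs.

【Comments】:

Exactly what I was looking for! However, the reverse operation (retrieving the permutation from the index) is not described.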
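Since the reverse operation came up, here is a minimal Python sketch of one common way to do it. It is not taken from the geekviewpoint page, and the name `permutation_from_index` is an assumption for illustration. Unlike the ranking direction, it keeps a list of the values not yet used, so it takes O(N) extra space.

```python
import math


def permutation_from_index(index, n):
    """Return the permutation of 0 .. n-1 with the given lexicographic index."""
    remaining = list(range(n))  # values not yet placed, in increasing order
    perm = []
    for position in range(n):
        # Number of permutations of the values that follow this position.
        factor = math.factorial(n - 1 - position)
        # The quotient selects which remaining value goes here.
        digit, index = divmod(index, factor)
        perm.append(remaining.pop(digit))
    return perm


print(permutation_from_index(3, 3))
```

For n = 3 and index 3 this should print [1, 2, 0], the fourth permutation in lexicographic order.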

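【Solution 6】:

There is nothing really new in this idea, but it is a fully matrix-based approach with no explicit loops or recursion (it uses Numpy, but is easy to adapt):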

import numpy as np
import math
vfact = np.vectorize(math.factorial, otypes='O')

def perm_index(p):
    return np.dot( vfact(range(len(p)-1, -1, -1)),
                   p-np.sum(np.triu(p>np.vstack(p)), axis=0) )

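【Comments】:

【Solution 7】:

I have just written a program in Visual Basic that can directly compute the index of any permutation, or the permutation corresponding to any given index, for up to 17 elements (this limit comes from my compiler approximating numbers above 17! in scientific notation).

If you are interested, I can send the program or publish it somewhere for download. It works fine and can be useful for testing and for demonstrating the output of your own code.

I used James D. McCaffrey's method, called factoradic; you can read about it here and here (in the discussion at the end of the page).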
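To illustrate what a factoradic is, here is a short Python sketch that writes an index in the factorial number system, where the digit at place i (counting from the right, starting at 0) is weighted by i!. It is a generic illustration and not the Visual Basic program; the name `to_factoradic` is hypothetical.

```python
def to_factoradic(index):
    """Digits of index in the factorial number system, most significant first."""
    digits = []
    radix = 1
    while True:
        # Dividing by 1, 2, 3, ... in turn yields the digits from the right.
        index, digit = divmod(index, radix)
        digits.append(digit)
        radix += 1
        if index == 0:
            break
    return digits[::-1]


print(to_factoradic(463))
```

For 463 the digits are 3, 4, 1, 0, 1, 0, because 463 = 3*5! + 4*4! + 1*3! + 0*2! + 1*1! + 0*0!. Read left to right, each digit says which of the still-unused elements to pick next, which is how an index can be mapped to a permutation.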

【Comments】:

Here is the link to the free program page