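Algorithm to list all unique permutations of numbers that contain duplicates: answers

【Question title】: Algorithm to list all unique permutations of numbers contain duplicates
【Posted】: 2012-07-11 03:21:11
【Question description】:

The problem: given a collection of numbers that may contain duplicates, return all unique permutations.

The naive approach is to use a set (in C++) to hold the permutations. This takes O(n! × log(n!)) time. Is there a better solution?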

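【Question discussion】:

Since there are n! permutations of n distinct integers and you need to enumerate them, you can't do better than O(n!). Also note that the presence of duplicates is irrelevant, since removing duplicates takes negligible time compared to enumerating the permutations.
@veredesmarald. Yes, I am trying to reduce the time complexity to O(n!).
1. next_permutation (in the C++ STL) visits each permutation only once, even when duplicates are present. 2. The space requirement alone is O(n * n!), not O(n!). 3. Inserting all n! permutations into an STL set takes O(n! * log(n!)) = O(n * n! * log n)
@bloops I believe the point of the exercise is to implement next_permutation. Also, I should perhaps have clarified that I was only talking about time complexity, and that I would just store the permutations in a list (since the next-permutation algorithm already excludes duplicates).
@bloops. 1. next_permutation makes the problem less interesting. 2. You're right, the overall time complexity using a set is O(n! lg(n!)).

Tags: algorithm permutation combinations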

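【Solution 1】:

The simplest approach is as follows:

1. Sort the list: O(n lg n)
2. The sorted list is the first permutation
3. Repeatedly generate the "next" permutation from the previous one: O(n! * <complexity of finding next permutation>)

Step 3 can be accomplished by defining the next permutation as the one that would appear directly after the current permutation if the list of permutations were sorted, e.g.: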

1, 2, 2, 3
1, 2, 3, 2
1, 3, 2, 2
2, 1, 2, 3
2, 1, 3, 2
2, 2, 1, 3
...

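Finding the next lexicographic permutation is O(n), and simple instructions are given on the Wikipedia page for Permutation under the heading Generation in lexicographic order. If you are feeling ambitious, you can generate the next permutation in O(1) using plain changes.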
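The three steps above can be sketched in Python as follows. This is a minimal illustration of the standard next-lexicographic-permutation step, not code from the answer; the names `next_permutation` and `unique_permutations` are chosen here for the example. Because the pivot search uses `>=` and the successor search uses `<=`, equal elements are never swapped past each other, so every distinct arrangement is produced exactly once.

```python
def next_permutation(seq):
    """Rearrange seq in place into its next lexicographic permutation.

    Returns False (after resetting seq to ascending order) if seq was
    already the last permutation, True otherwise.
    """
    # 1. Find the rightmost position i with seq[i] < seq[i + 1].
    i = len(seq) - 2
    while i >= 0 and seq[i] >= seq[i + 1]:
        i -= 1
    if i < 0:
        seq.reverse()  # wrap around to the first permutation
        return False
    # 2. Find the rightmost element strictly greater than seq[i].
    j = len(seq) - 1
    while seq[j] <= seq[i]:
        j -= 1
    # 3. Swap them, then reverse the non-increasing suffix.
    seq[i], seq[j] = seq[j], seq[i]
    seq[i + 1:] = reversed(seq[i + 1:])
    return True


def unique_permutations(items):
    """Yield each distinct permutation of items once, in sorted order."""
    current = sorted(items)
    while True:
        yield list(current)
        if not next_permutation(current):
            return
```

Calling `list(unique_permutations([1, 2, 2, 3]))` should start with the six permutations listed above and contain 12 entries in total.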

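【Discussion】:

Update: I originally misread the question and thought duplicates should be discarded before computing the permutations.
next_permutation is the whole point. And there is no homework in summer :)
@zwx It's winter where I live. :) I've added some links to next-permutation algorithms, and I'm happy to help if you run into any specific difficulties, but otherwise I don't see any benefit in retyping the algorithms here.
The "regular" next_permutation algorithm has O(1) complexity, amortized over all permutations. So if you are going to visit all permutations, it is the best choice.
@bloops It looks like you're right. I didn't know that, thanks!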

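【Solution 2】:

1) Some variation of backtracking/recursive search will usually solve this kind of problem. Given a function that returns a list of all permutations of (n-1) objects, generate a list of all permutations of n objects as follows: for each element in the list, insert the nth object at every possible position, checking for duplicates. This isn't particularly efficient, but it often produces simple code for this kind of problem.

2) See Wikipedia: http://en.wikipedia.org/wiki/Permutation#Generation_in_lexicographic_order

3) Academia has spent a lot of time on this. See Section 7.2.1.2 of Knuth Vol 4A - a large hardcover book, which has the following brief table of contents on Amazon: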

Chapter 7: Combinatorial Searching

7.1: Zeros and Ones

7.2: Generating All Possibilities
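Approach 1) above (insert the nth object at every position of each permutation of the other n-1 objects, checking for duplicates) might look like the following Python sketch. The function name `permutations_by_insertion` is made up for this example, and it assumes the items are hashable so that a set can do the duplicate check.

```python
def permutations_by_insertion(items):
    """Return all distinct permutations of items as a list of tuples."""
    if not items:
        return [()]
    *rest, last = items
    seen = set()   # duplicate check: candidates already produced
    result = []
    for perm in permutations_by_insertion(rest):
        # Insert the last object at every possible position.
        for pos in range(len(perm) + 1):
            candidate = perm[:pos] + (last,) + perm[pos:]
            if candidate not in seen:
                seen.add(candidate)
                result.append(candidate)
    return result
```

As the answer notes, this is not particularly efficient: it still builds every candidate, including the duplicates it then throws away.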

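【Discussion】:

【Solution 3】:

You should read my blog post about this kind of permutation (among other things) for more background, and follow some of the links there.

Here is a version of my lexicographic permutation generator, fashioned after the generation sequence of Steinhaus-Johnson-Trotter permutation generators, that does as requested: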

def l_perm3(items):
    '''Generator yielding Lexicographic permutations of a list of items'''
    if not items:
        yield []
    else:
        dir = 1
        new_items = []
        this = [items.pop()]
        for item in l_perm3(items):
            lenitem = len(item)
            try:
                # Never insert 'this' above any other 'this' in the item 
                maxinsert = item.index(this[0])
            except ValueError:
                maxinsert = lenitem
            if dir == 1:
                # step down
                for new_item in [item[:i] + this + item[i:] 
                                 for i in range(lenitem, -1, -1)
                                 if i <= maxinsert]:
                    yield new_item                    
            else:    
                # step up
                for new_item in [item[:i] + this + item[i:] 
                                 for i in range(lenitem + 1)
                                 if i <= maxinsert]:
                    yield new_item                    
            dir *= -1

from math import factorial
def l_perm_length(items):
    '''\
    Returns the len of sequence of lexicographic perms of items. 
    Each item of items must itself be hashable'''
    counts = [items.count(item) for item in set(items)]
    ans = factorial(len(items))
    for c in counts:
        ans //= factorial(c)  # integer division keeps the count an int on Python 3
    return ans

if __name__ == '__main__':
    n = [0, 1, 2, 2, 2]
    print('\nLexicograpic Permutations of %i items: %r' % (len(n), n))
    for i, x in enumerate(l_perm3(n[:])):
        print('%3i %r' % (i, x))
    assert i+1 == l_perm_length(n), 'Generated number of permutations is wrong'

Example output of the above program:

Lexicograpic Permutations of 5 items: [0, 1, 2, 2, 2]
  0 [0, 1, 2, 2, 2]
  1 [0, 2, 1, 2, 2]
  2 [2, 0, 1, 2, 2]
  3 [2, 0, 2, 1, 2]
  4 [0, 2, 2, 1, 2]
  5 [2, 2, 0, 1, 2]
  6 [2, 2, 0, 2, 1]
  7 [0, 2, 2, 2, 1]
  8 [2, 0, 2, 2, 1]
  9 [2, 2, 2, 0, 1]
 10 [2, 2, 2, 1, 0]
 11 [2, 1, 2, 2, 0]
 12 [1, 2, 2, 2, 0]
 13 [2, 2, 1, 2, 0]
 14 [2, 2, 1, 0, 2]
 15 [1, 2, 2, 0, 2]
 16 [2, 1, 2, 0, 2]
 17 [2, 1, 0, 2, 2]
 18 [1, 2, 0, 2, 2]
 19 [1, 0, 2, 2, 2]
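The 20 rows above agree with the usual multiset-permutation count, n! divided by the factorial of each value's multiplicity: 5! / (1! * 1! * 3!) = 20. A small Python sketch of that check, using `collections.Counter` instead of the `items.count` loop in the answer (the name `count_unique_permutations` is illustrative only):

```python
from collections import Counter
from math import factorial


def count_unique_permutations(items):
    """Number of distinct permutations of a multiset: n! / (c1! * c2! * ...)."""
    total = factorial(len(items))
    for count in Counter(items).values():
        # Each partial quotient is an integer, so floor division is exact.
        total //= factorial(count)
    return total
```

For `[0, 1, 2, 2, 2]` this returns 20, matching the number of permutations listed above.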

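【Discussion】:

【Solution 4】:

This method, which I came up with after thinking about how I would write out permutations by hand and then putting that procedure into code, is shorter and better: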

def incv(prefix, v):
  '''Return all unique permutations of v, each prepended with prefix, as strings.'''
  result = []
  done = set()  # values already tried at this position
  if v:
    for x in range(len(v)):
      if v[x] not in done:
        done.add(v[x])
        # Fix v[x] at this position and permute the remaining elements
        result = result + incv(prefix + v[x:x+1], v[:x] + v[x+1:])
  else:
    result.append(''.join(prefix))
  return result

def test(test_string, lex_ord=False):
  if lex_ord:
    test_string = sorted(test_string)
  p = incv([], list(test_string))
  if lex_ord:
    try_p = sorted(p)
    print("Sort methods equal ?", try_p == p)
  print('All', ','.join(p), "\n", test_string, "gave", len(p), "permutations")

if __name__ == '__main__':
  import sys
  test(sys.argv[1], bool(sys.argv[2] if len(sys.argv) > 2 else False))

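Notes

incv increments the permutation vector to find all of the permutations. It also handles repeated letters correctly.
test prints all permutations of the test string together with their count. It also checks that, if you request lexicographic ordering, sorting before and sorting after give the same result. This should be True, because the original string is sorted and the incrementing permutation function transforms the string into the next lexicographic string over the given alphabet.

This script can be run from the command prompt as follows: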

python script.py [test_string] [optional anything to use lexicographic ordering]
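The way repeated letters are handled can be seen more directly in a lazy variant of the same idea. This is a sketch, not the answer's code: the name `unique_perms` is made up, and it assumes the input is a string. At each position, every distinct character is tried only once, so no duplicate permutation is ever generated.

```python
def unique_perms(chars):
    """Yield the distinct permutations of the string chars.

    If chars is sorted, the permutations come out in lexicographic order.
    """
    if not chars:
        yield ""
        return
    tried = set()  # characters already placed at this position
    for i, ch in enumerate(chars):
        if ch in tried:
            continue  # an equal character was already used here
        tried.add(ch)
        # Fix ch at this position and permute what is left.
        for tail in unique_perms(chars[:i] + chars[i + 1:]):
            yield ch + tail
```

For example, `list(unique_perms("aab"))` gives `['aab', 'aba', 'baa']`.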

【Discussion】:

【Solution 5】:

I slightly improved Paddy3118's solution, so it is now non-recursive, lazily evaluated (fully generator-based) and about 30% faster.

def _handle_item(xs, d, t):
    l = len(xs)

    try:
        m = xs.index(t)
    except ValueError:
        m = l

    if d:
        g = range(l, -1, -1)
    else:
        g = range(l + 1)

    q = [t]
    for i in g:
        if i <= m:
            yield xs[:i] + q + xs[i:]

def _chain(xs, t):
    d = True

    for x in xs:
        yield from _handle_item(x, d, t)

        d = not d

def permutate(items):
    xs = [[]]

    for t in items:
        xs = _chain(xs, t)

    yield from xs

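P.S. I noticed that Paddy3118 had also made his implementation use generators, whereas I had been working from the implementation in the blog post, which is more memory intensive. I am posting this anyway, since this version may be considered cleaner.

【Discussion】:

【Solution 6】: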

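Recursive version. This computes the n!/(k_1! * k_2! * ... * k_m!) unique permutations (m distinct characters, where k_i is the number of times the i-th character occurs):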

#include<iostream>
#include<cstring>

using namespace std;

const int MAX_CHARS_STRING=100;
int CURRENT_CHARS=0;
char STR[MAX_CHARS_STRING];

void myswap(int i, int j){
    char c=STR[i];STR[i]=STR[j];STR[j]=c;
}

bool IstobeExecuted(int start,int position){
    if(start==position)
        return true;
    for(int i=position-1;i>=start;i--){
        if(STR[i]==STR[position])
            return false;
    }
    return true;
}

void Permute(int start, int end,int& perm_no){
    if(end-start<=1){
        if(STR[end]==STR[start]){
            cout<<perm_no++<<") "<<STR<<endl;
            return;
        }
        cout<<perm_no++<<") "<<STR<<endl;
        myswap(start, end);
        cout<<perm_no++<<") "<<STR<<endl;
        myswap(start,end);
        return;
    }
    for(int i=start; i<=end;i++){
        if(!IstobeExecuted(start,i)){
            continue;
        }
        myswap(start,i);
        Permute(start+1,end,perm_no);
        myswap(start,i);
    }
}


int main(){
    cin>>STR;int num=1;
    Permute(0,strlen(STR)-1,num);
    return 0;
}

Hope this helps.

【Discussion】:

【Solution 7】:

A simple and short C++ implementation of @verdesmarald's solution:

#include <algorithm>  // std::sort, std::next_permutation
#include <vector>

using std::vector;

vector<vector<int>> permuteUnique(vector<int>& nums) {

    vector<vector<int>> res;
    const auto begin = nums.begin();
    const auto end = nums.end();
    std::sort(begin, end);

    do
    {
        res.push_back(nums);
    } 
    while (std::next_permutation(begin, end));

    return res;
}

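I think the time complexity is: n*log(n) + m * ComplexityOf(next_permutation), where n is the total number of elements, m is the number of unique permutations, and the complexity of next_permutation is O(1) amortized. Or so they say: The amortized complexity of std::next_permutation?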

【Discussion】: