【Question title】: Merged sorting with gaps
【Posted】: 2013-10-01 13:49:33
【Question】:

Suppose I have the following sorted lists:

a = ['8EF5CD1B', 'B1392DB3', '59770CD6', 'BD23F32A', '4EFA3222']
b = ['8EF5CD1B', '96276D30', 'B1392DB3', '59770CD6', '4EFA3222']
c = ['96276D30', 'B1392DB3', 'BD23F32A', '59770CD6']

I would like to merge them into one sorted list, filling the gaps with items from the lower-priority lists.

>>> from itertools import permutations
>>> LISTS = (a, b, c)
>>> for (first, second) in permutations(LISTS, 2):
...     print((LISTS.index(first), LISTS.index(second)), magic(first, second))
...
(0, 1) ['8EF5CD1B', '96276D30', 'B1392DB3', '59770CD6', 'BD23F32A', '4EFA3222']
(0, 2) ['8EF5CD1B', '96276D30', 'B1392DB3', '59770CD6', 'BD23F32A', '4EFA3222']
(1, 0) ['8EF5CD1B', '96276D30', 'B1392DB3', '59770CD6', 'BD23F32A', '4EFA3222']
(1, 2) ['8EF5CD1B', '96276D30', 'B1392DB3', 'BD23F32A', '59770CD6', '4EFA3222']
(2, 0) ['96276D30', '8EF5CD1B', 'B1392DB3', 'BD23F32A', '59770CD6', '4EFA3222']
(2, 1) ['8EF5CD1B', '96276D30', 'B1392DB3', 'BD23F32A', '59770CD6', '4EFA3222']
>>> magic(*LISTS)
['8EF5CD1B', '96276D30', 'B1392DB3', '59770CD6', 'BD23F32A', '4EFA3222']

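As you can see in (0, 1), 96276D30 comes second because list b fills the gap there. When the orders conflict, priority goes to the first argument. The magic function should also work with more than two arguments, as in the last example above. I wrote code that works, but it is far too ugly (and probably too slow) for a task that seems this simple.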

MAX_ITERATIONS = 1000
class UnjoinableListsError(Exception): pass

def magic(*lists, iterations=MAX_ITERATIONS):
    """
    Return a joint sorted list of presorted lists (or tuples).

    First it finds the common items, then it builds a gap list to hold
    the non-common items, and finally it interleaves them. If items from
    several presorted lists (or tuples) compete for the same gap, they
    are placed in the order in which their parent lists were passed.
    """
    def sort_two(first, second):
        commons = [item for item in first if item in second]
        gap_list = [[] for i in range(len(commons)+1)]
        for l in (first, second):
            gap_item = []
            sliced = []
            for common_item in commons:
                common_i = l.index(common_item)
                sliced.append((list(l[:common_i]), list(l[common_i+1:])))
            gap_item.append(sliced[0][0])
            for j in range(len(sliced) - 1):
                gap_item.append([item for item in sliced[j][1]
                                    if item in sliced[j+1][0]])
            gap_item.append(sliced[-1][1])
            for j, item in enumerate(gap_item):
                gap_list[j].extend([i for i in item if i not in commons])
        result = []
        result.extend(gap_list[0])
        for i in range(len(commons)):
            result.append(commons[i])
            result.extend(gap_list[i+1])
        return result

    result = lists[0]
    index_set = {i for i in range(1, len(lists))}
    it = iterations
    while index_set and it > 0:
        it -= 1
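        if it == 0:
            # Format the whole message; previously .format() applied only
            # to the second string literal, so '{}' was never filled in.
            raise UnjoinableListsError(
                'The lists at argument index {} are unjoinable.'
                .format(index_set))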
        i = index_set.pop()
        try:
            result = sort_two(result, lists[i])
        except IndexError:
            # No common items with the result so far; retry this list later.
            index_set.add(i)
    return result

Am I missing some clear and simple solution? Thanks for your answers.
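The conflict rule described above (priority goes to the first argument) only comes into play when two lists disagree on the relative order of items they share. As a minimal sketch, the hypothetical helper `order_conflicts` below (not part of the original post) lists such pairs for the lists in the question:

```python
from itertools import combinations

a = ['8EF5CD1B', 'B1392DB3', '59770CD6', 'BD23F32A', '4EFA3222']
b = ['8EF5CD1B', '96276D30', 'B1392DB3', '59770CD6', '4EFA3222']
c = ['96276D30', 'B1392DB3', 'BD23F32A', '59770CD6']


def order_conflicts(first, second):
    """Return pairs of shared items whose relative order differs.

    Hypothetical helper for illustration only. Each pair is reported
    in the order used by `first`, the higher-priority list.
    """
    position = {item: i for i, item in enumerate(second)}
    shared = [item for item in first if item in position]
    # combinations() keeps the order of `shared`, so x precedes y in `first`.
    return [(x, y) for x, y in combinations(shared, 2)
            if position[x] > position[y]]


print(order_conflicts(a, b))  # a and b agree on every shared item
print(order_conflicts(a, c))  # a and c disagree on one pair
```

This is why the (0, 2) and (2, 0) results differ in the order of 59770CD6 and BD23F32A: whichever list is passed first decides.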

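【Comments】:

  • "Suppose I have the following sorted lists:". Are you sure they are sorted? In list a, '59770CD6' comes before 'BD23F32A'. In list c, '59770CD6' comes after 'BD23F32A'.
  • Those are crc32 hashes of object references. Yes, they are presorted. Otherwise I could simply use heapq.merge().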
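To make the remark about heapq.merge() concrete, here is a small sketch using two of the lists from the question. It assumes nothing beyond the standard library: heapq.merge() expects each input to be ascending under ordinary comparison and yields every item from every input, so it neither respects the given order nor joins shared items.

```python
import heapq

a = ['8EF5CD1B', 'B1392DB3', '59770CD6', 'BD23F32A', '4EFA3222']
b = ['8EF5CD1B', '96276D30', 'B1392DB3', '59770CD6', '4EFA3222']

# The hashes are presorted by some external criterion, not by their
# string value, so they are not ascending as heapq.merge() assumes.
print(a == sorted(a))

merged = list(heapq.merge(a, b))

# Every input item is emitted, so shared items appear once per list.
print(len(merged), len(a) + len(b))
print(merged.count('8EF5CD1B'))
```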

Tags: sorting python-3.x merge mergesort


【Solution 1】:

Well, the lack of answers left me unsatisfied. Here is code that works well:

def joint_sorted(*sequences):
    """Sort two or more presorted sequences into one.

    When the element order is ambiguous, earlier sequences take
    priority over later ones.

    >>> joint_sorted([1,3,4,5,6], [1,2,4,6,7], [6, 10, 11], [12, 11, 17])
    [1, 3, 2, 4, 5, 6, 7, 10, 12, 11, 17]
    >>> joint_sorted('adgth', 'dbgjhk')
    'adbgtjhk'
    """
    def for_two(first_seq, second_seq):
        first_set, second_set = set(first_seq), set(second_seq)
        if (len(first_seq) != len(first_set)
            or len(second_seq) != len(second_set)):
            raise TypeError("The sequences must contain "
                "unique elems only!")
        common_elems = first_set & second_set
        before, buf = {}, []
        for e in second_seq:
            if e in common_elems:
                before[e], buf = buf, []
            else:
                buf.append(e)
        result = []
        for e in first_seq:
            if e in before:
                result.extend(before[e])
            result.append(e)
        result.extend(buf)
        if isinstance(first_seq, str):
            return ''.join(result)
        return first_seq.__class__(result)
    first_seq = sequences[0]
    for i in range(1, len(sequences)):
        first_seq = for_two(first_seq, sequences[i])
    return first_seq
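As a sanity check, the approach in this answer can be run against the expected outputs listed in the question. The sketch below is a condensed, list-only restatement of the same idea; the names `merge_two` and `merge_all` are hypothetical, and the uniqueness check and string handling of the answer are left out for brevity.

```python
from itertools import permutations


def merge_two(first, second):
    """Insert the items unique to `second` into the order given by `first`."""
    shared = set(first) & set(second)
    before, pending = {}, []
    for item in second:
        if item in shared:
            # Items seen since the previous shared item go in front of this one.
            before[item], pending = pending, []
        else:
            pending.append(item)
    result = []
    for item in first:
        result.extend(before.get(item, []))
        result.append(item)
    result.extend(pending)  # items after the last shared item of `second`
    return result


def merge_all(*lists):
    """Fold merge_two over the arguments, earlier lists taking priority."""
    merged = list(lists[0])
    for other in lists[1:]:
        merged = merge_two(merged, other)
    return merged


a = ['8EF5CD1B', 'B1392DB3', '59770CD6', 'BD23F32A', '4EFA3222']
b = ['8EF5CD1B', '96276D30', 'B1392DB3', '59770CD6', '4EFA3222']
c = ['96276D30', 'B1392DB3', 'BD23F32A', '59770CD6']
LISTS = (a, b, c)

# Expected results copied from the question.
a_wins = ['8EF5CD1B', '96276D30', 'B1392DB3', '59770CD6', 'BD23F32A', '4EFA3222']
c_wins = ['8EF5CD1B', '96276D30', 'B1392DB3', 'BD23F32A', '59770CD6', '4EFA3222']
c_then_a = ['96276D30', '8EF5CD1B', 'B1392DB3', 'BD23F32A', '59770CD6', '4EFA3222']
expected = {(0, 1): a_wins, (0, 2): a_wins, (1, 0): a_wins,
            (1, 2): c_wins, (2, 0): c_then_a, (2, 1): c_wins}

for i, j in permutations(range(len(LISTS)), 2):
    assert merge_two(LISTS[i], LISTS[j]) == expected[(i, j)], (i, j)
assert merge_all(*LISTS) == a_wins
print("all expected outputs from the question are reproduced")
```

Each pairwise merge makes one pass over each input plus set lookups, so the cost grows roughly linearly with the total number of items rather than with repeated `list.index` calls.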

【Comments】:
