Recovering Subsets in Subset Sum Problem - Not All Subsets Appear

[Question title]: Recovering Subsets in Subset Sum Problem - Not All Subsets Appear
[Posted]: 2020-05-12 09:27:48
[Question description]:

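I was revisiting dynamic programming (DP) when I came across this problem. I managed to use DP to determine how many solutions there are to the subset sum problem.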

def SetSum(num_set, num_sum):

   #Initialize DP matrix with base cases set to 1
   matrix = [[0 for i in range(0, num_sum+1)] for j in range(0, len(num_set)+1)]
   for i in range(len(num_set)+1): matrix[i][0] = 1

   for i in range(1, len(num_set)+1): #Iterate through set elements
       for j in range(1, num_sum+1):   #Iterate through sum
           if num_set[i-1] > j:    #When current element is greater than sum take the previous solution
               matrix[i][j] = matrix[i-1][j]
           else:
               matrix[i][j] = matrix[i-1][j] + matrix[i-1][j-num_set[i-1]]

   #Retrieve elements of subsets    
   subsets = SubSets(matrix, num_set, num_sum)

   return matrix[len(num_set)][num_sum]

Based on Subset sum - Recover Solution, I used the following method to retrieve the subsets, since the set will always be sorted:

def SubSets(matrix, num_set, num):

   #Initialize variables
   height = len(matrix)
   width = num
   subset_list = []
   s = matrix[0][num-1] #Keeps track of number until a change occurs

   for i in range(1, height):
       current = matrix[i][width]
       if current > s:
           s = current #keeps track of changing value
           cnt = i -1 #backwards counter, -1 to exclude current value already appended to list
           templist = []   #to store current subset
           templist.append(num_set[i-1]) #Adds current element to subset
           total = num - num_set[i-1] #Initial total will be sum - max element

           while cnt > 0:  #Loop backwards to find remaining elements
               if total >= num_set[cnt-1]: #Takes current element if it is less than total
                   templist.append(num_set[cnt-1])
                   total = total - num_set[cnt-1]
               cnt = cnt - 1

           templist.sort()
           subset_list.append(templist) #Add subset to solution set

   return subset_list

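However, because this is a greedy approach, it only works when the largest element of each subset is distinct. If two subsets share the same largest element, it only returns the one with the larger values. So for the elements [1, 2, 3, 4, 5] with a sum of 10, it only returns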

[1, 2, 3, 4] , [1, 4, 5]

when it should return

[1, 2, 3, 4] , [2, 3, 5] , [1, 4, 5]

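I could add another loop inside the while loop to leave out each element in turn, but that would raise the complexity to O(rows^3), which could exceed that of the actual DP, O(rows*columns). Is there another way to retrieve the subsets without increasing the complexity? Or a way to keep track of the subsets while the DP runs? I created another method that retrieves, in O(rows), all the unique elements that appear in the solution subsets: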

def RecoverSet(matrix, num_set):
   height = len(matrix) - 1
   width = len(matrix[0]) - 1
   subsets = []

   while height > 0:
       current = matrix[height][width]
       top = matrix[height-1][width]

       if current > top:
           subsets.append(num_set[height-1])
       if top == 0:
           width = width - num_set[height-1]
       height -= 1

   return subsets

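This outputs [1, 2, 3, 4, 5]. However, getting the actual subsets from that seems to require solving the subset problem all over again. Any ideas or suggestions on how to store all of the solution subsets (not print them)?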

[Question comments]:

Tags: python dynamic-programming subset-sum

[Solution 1]:

This is actually a very good question, and it seems your intuition is mostly right.

The DP approach lets you build a 2D table that essentially encodes how many subsets sum to the desired target sum, which takes O(target_sum*len(num_set)) time.

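Now, if you want to actually recover all of the solutions, that is another story, because the number of solution subsets can be very large, in fact much larger than the table you built while running the DP algorithm. If you want to find all solutions, you can use the table as a guide, but it may take a long time to find every subset. In fact, you can find them by going backwards through the recursion that defines the table (the if-else in your code that fills it in). What does that mean?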
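To get a feel for how quickly the number of solutions can outgrow the table, here is a minimal sketch. The helper name `count_subsets` and the choice of the numbers 1 through 24 are assumptions made purely for illustration, and the helper uses a one-dimensional variant of the same counting recurrence rather than the 2D matrix from the question.

```python
def count_subsets(numbers, target):
    """Count subsets of `numbers` (positive ints) that sum to `target`."""
    # ways[s] = number of subsets of the numbers seen so far that sum to s
    ways = [0] * (target + 1)
    ways[0] = 1  # the empty subset sums to 0
    for x in numbers:
        # Iterate downwards so that each number is used at most once
        for s in range(target, x - 1, -1):
            ways[s] += ways[s - x]
    return ways[target]


numbers = list(range(1, 25))   # 1, 2, ..., 24 (illustrative choice)
target = sum(numbers) // 2     # half of the total, i.e. 150
table_cells = (len(numbers) + 1) * (target + 1)  # size of the 2D DP table

print("table cells:     ", table_cells)
print("solution subsets:", count_subsets(numbers, target))
```

For this input the count comes out far larger than the number of table cells, so merely writing down every subset already costs more than filling the table.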

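Say you try to find the solutions with only the filled table at hand. The first thing to do to tell whether there is a solution is to check that the entry at row len(num_set) and column num has a value > 0, which indicates that at least one subset sums to num. Now there are two possibilities. Either the last number in num_set is used in a solution, in which case we must check whether there is a subset using all the numbers except the last one that sums to num-num_set[-1]. This is one possible branch in the recursion. Or the last number in num_set is not used in a solution, in which case we must check whether we can still find a solution that sums to num, but using all the numbers except the last one.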
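The two branches correspond directly to the two terms of the counting recurrence. A minimal sketch, assuming positive integers and using a hypothetical helper `build_table` that fills the matrix the same way as the question's code:

```python
def build_table(numbers, target):
    """table[i][j] = number of subsets of numbers[:i] that sum to j."""
    n = len(numbers)
    table = [[0] * (target + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        table[i][0] = 1  # the empty subset always sums to 0
    for i in range(1, n + 1):
        x = numbers[i - 1]
        for j in range(1, target + 1):
            table[i][j] = table[i - 1][j]           # x is not used
            if x <= j:
                table[i][j] += table[i - 1][j - x]  # x is used
    return table


numbers, target = [1, 2, 3, 4, 5], 10
table = build_table(numbers, target)
n, last = len(numbers), numbers[-1]

total = table[n][target]                 # all solutions
without_last = table[n - 1][target]      # branch: 5 is not used
with_last = table[n - 1][target - last]  # branch: 5 is used, remaining sum is 5
print(total, without_last, with_last)    # 3 1 2
```

For this input the single solution without 5 is [1, 2, 3, 4], and the two solutions with 5 are [2, 3, 5] and [1, 4, 5].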

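If you keep going, you will see that the recovery can be done by recursing backwards. By keeping track of the numbers along the way (that is, the different paths in the table that lead to the desired sum), you can retrieve all solutions. Again, bear in mind that the running time may be very long, because we want to actually find all of the solutions, not just know that they exist.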

This code should be what you are looking for. It recovers the solutions given the filled matrix:

def recover_sol(matrix, set_numbers, target_sum):
    up_to_num = len(set_numbers)
    
    ### BASE CASES (BOTTOM OF RECURSION) ###

    # If the target_sum becomes negative or there is no solution in the matrix, then 
    # return an empty list and inform that this solution is not a successful one
    if target_sum < 0 or matrix[up_to_num][target_sum] == 0:
        return [], False

    # If bottom of recursion is reached, that is, target_sum is 0, just return an empty list
    # and inform that this is a successful solution
    if target_sum == 0:
        return [], True
    
    ### IF NOT BASE CASE, NEED TO RECURSE ###

    # Case 1: last number in set_numbers is not used in solution --> same target but one item less
    s1_sols, success1 = recover_sol(matrix, set_numbers[:-1], target_sum)

    # Case 2: last number in set_numbers is used in solution --> target is lowered by item up_to_num
    s2_sols, success2 = recover_sol(matrix, set_numbers[:-1], target_sum - set_numbers[up_to_num-1])

    # If Case 2 is a success but bottom of recursion was reached
    # so that it returned an empty list, just set current sol as the current item
    if s2_sols == [] and success2:
        # The set of solutions is just the list containing one item (so this explains the list in list)
        s2_sols = [[set_numbers[up_to_num-1]]]

    # Else there are already solutions and it is a success, go through the multiple solutions 
    # of  Case 2 and add the current number to them
    else:
        s2_sols = [[set_numbers[up_to_num-1]] + s2_subsol for s2_subsol in s2_sols]

    # Join lists of solutions for both Cases, and set success value to True 
    # if either case returns a successful solution
    return s1_sols + s2_sols, success1 or success2

For a complete solution with both the matrix filling and the solution recovery, you can do

def subset_sum(set_numbers, target_sum):
    n_numbers = len(set_numbers)

    #Initialize DP matrix with base cases set to 1
    matrix = [[0 for i in range(0, target_sum+1)] for j in range(0, n_numbers+1)]
    for i in range(n_numbers+1): 
        matrix[i][0] = 1

    for i in range(1, n_numbers+1): #Iterate through set elements
        for j in range(1, target_sum+1):   #Iterate through sum
            if set_numbers[i-1] > j:    #When current element is greater than sum take the previous solution
                matrix[i][j] = matrix[i-1][j]
            else:
                matrix[i][j] = matrix[i-1][j] + matrix[i-1][j-set_numbers[i-1]]
 
    return recover_sol(matrix, set_numbers, target_sum)[0]
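
If holding every subset in memory at once is a concern, one possible variation is to walk the same table with indices instead of list slices and yield the subsets one at a time. This is a sketch and not part of the answer's code: the names `build_table` and `iter_subsets` are hypothetical, and it assumes all numbers are positive integers.

```python
def build_table(numbers, target):
    """table[i][j] = number of subsets of numbers[:i] that sum to j."""
    n = len(numbers)
    table = [[0] * (target + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        table[i][0] = 1  # the empty subset always sums to 0
    for i in range(1, n + 1):
        x = numbers[i - 1]
        for j in range(1, target + 1):
            table[i][j] = table[i - 1][j]
            if x <= j:
                table[i][j] += table[i - 1][j - x]
    return table


def iter_subsets(numbers, target):
    """Lazily yield every subset of `numbers` that sums to `target`."""
    table = build_table(numbers, target)

    def walk(i, remaining, chosen):
        # No subset of numbers[:i] sums to `remaining`: prune this branch
        if table[i][remaining] == 0:
            return
        if remaining == 0:
            yield sorted(chosen)  # sorted() also makes a copy of the list
            return
        x = numbers[i - 1]
        # Branch 1: numbers[i-1] is not used
        yield from walk(i - 1, remaining, chosen)
        # Branch 2: numbers[i-1] is used
        if x <= remaining:
            chosen.append(x)
            yield from walk(i - 1, remaining - x, chosen)
            chosen.pop()

    yield from walk(len(numbers), target, [])


print(list(iter_subsets([1, 2, 3, 4, 5], 10)))
# [[1, 2, 3, 4], [2, 3, 5], [1, 4, 5]]
```

Because branches whose table entry is 0 are pruned immediately, every path that is followed ends in a solution, and the caller can stop iterating early or store only the subsets it needs.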

Cheers!

[Comments]: