Minimax AI in Python (answers)

【Question title】: Minimax AI in python
【Posted】: 2020-03-16 03:31:58
【Question】:

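I am trying to create a minimax-type AI that searches 4 layers of moves and tries to pick the best possible move according to a heuristic. The catch is that in my state machine, if I reach a node that corresponds to an illegal move, I return None instead of the normal point value my heuristic function would give. I am unsure how best to handle this in my minimax function. So far it looks like this, and I was wondering whether it makes sense.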

def ai_min_max(board, ai_mancala, player_mancala, ai_choices, player_choices, target_depth, cur_depth, maxTurn, position):
    #base case where we call our heuristic function to tell us what the value of this state is
    if cur_depth == target_depth :
        #return the heuristic value for this state
        return first_heuristic(board, ai_mancala, player_mancala, ai_choices, player_choices, position)

    #if we are currently on a level where we are maximizing our function
    if maxTurn :
        #set the value to negative infinity
        max_eval = float("-inf")
        #go through the 10 possible choices you can make
        for x in range(len(ai_choices)) :
            new_position = position + [x]
            my_eval = ai_min_max(board, ai_mancala, player_mancala, ai_choices, player_choices, target_depth, cur_depth +1, False, new_position)
            #update the current max only if we have a valid movement, if not then do not update
            if my_eval is not None:
                max_eval = max(max_eval, my_eval)
        #if there were no valid moves
        if max_eval == float("-inf") :
            return float("inf")
        return max_eval

    #if it is the minimizing player's turn
    else :
        min_eval = float("inf")
        for x in range(len(player_choices)) :
            new_position = position + [x]
            my_eval = ai_min_max(board, ai_mancala, player_mancala, ai_choices, player_choices, target_depth, cur_depth +1, True, new_position)
            if my_eval is not None:
                min_eval = min(min_eval, my_eval)
        #if there were no valid moves
        if min_eval == float("inf") :
            return float("-inf")
        return min_eval

【Question comments】:

Tags: python-3.x artificial-intelligence numpy-ndarray minmax

【Solution 1】:

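Typically, a minimax implementation never makes recursive calls on illegal moves at all, because they are never generated in the first place. In some cases, though, it can be easier (or cheaper) to actually apply a move to find out whether it is legal. For instance, if determining legality requires an expensive computation, you don't want to do it twice (once when generating candidate moves and once when searching them). So I'll assume that is the case here.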
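As a rough illustration of the "never generate illegal moves" approach while still applying each move only once, here is a minimal sketch. The pile-of-stones game (remove 1, 2, or 3 stones per move) and the names `try_move` and `legal_children` are assumptions made up for this example; they are not from the question's code.

```python
def try_move(stones, move):
    """Apply a move and check legality in one step.

    Hypothetical toy rule: a move removes `move` stones from the pile and is
    illegal if the pile is too small. Returns the new pile size, or None.
    """
    return stones - move if 1 <= move <= stones else None


def legal_children(stones, moves=(1, 2, 3)):
    """Yield (move, child_state) pairs, applying each move exactly once."""
    for move in moves:
        child = try_move(stones, move)
        if child is not None:
            # The already-computed child state is handed to the search,
            # so the legality work is not repeated later.
            yield move, child
```

With a generator like this, the search loop only ever sees legal children, so no special value is needed inside minimax.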

Given that, does it make sense to return a special value as in the code above?

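No, there is a better approach. At a min node, when the move that led there is illegal, you can return -inf to the parent; at a max node, you can return inf to the parent. That gives every illegal move the worst possible value for the player who tried it, so the rest of the search handles it naturally without any other special cases. This makes the main minimax/alpha-beta loop much simpler.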
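A minimal sketch of this idea, assuming a toy pile-of-stones game (remove 1, 2, or 3 stones; whoever takes the last stone wins). The game, `apply_move`, and `heuristic` are hypothetical stand-ins for the asker's board logic, not their actual code. The only illegal-move handling is the first check in `minimax`.

```python
import math

MOVES = (1, 2, 3)  # hypothetical move set for the toy game


def apply_move(stones, move):
    """Return the new pile size, or None if the move is illegal."""
    return stones - move if move <= stones else None


def heuristic(stones, max_turn):
    """Toy evaluation from the maximizer's point of view."""
    if stones == 0:
        # The previous player took the last stone, so the side to move lost.
        return -1 if max_turn else 1
    return 0


def minimax(stones, depth, max_turn):
    if stones is None:
        # This node was reached by an illegal move of the parent.
        # A min node reports -inf (its parent maximizes), and a max node
        # reports +inf (its parent minimizes), so the parent never prefers it.
        return math.inf if max_turn else -math.inf
    if depth == 0 or stones == 0:
        return heuristic(stones, max_turn)
    values = [minimax(apply_move(stones, m), depth - 1, not max_turn)
              for m in MOVES]
    return max(values) if max_turn else min(values)
```

For example, `minimax(3, 4, True)` evaluates to `1` (the maximizer takes all three stones), and the infinities produced by illegal moves deeper in the tree never reach the result because a legal alternative always beats them.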

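The only complication is that if every move for the max player at the root loses, the search might return an illegal move. You can handle this case outside the main search, where testing a single move is very cheap compared to the full search: if the returned move is illegal, just return any legal move instead.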
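The root-level check might look like the sketch below. It assumes a toy pile game with a misere rule (whoever takes the last stone loses) and scores wins and losses as +inf and -inf, so a lost position can tie with an illegal move. All names here (`MOVES`, `apply_move`, `choose_move`) are hypothetical and chosen only for illustration.

```python
import math

MOVES = (3, 2, 1)  # hypothetical ordering: larger takes are tried first


def apply_move(stones, move):
    """Return the new pile size, or None if the move is illegal."""
    return stones - move if move <= stones else None


def minimax(stones, depth, max_turn):
    if stones is None:
        # Illegal-move sentinel: worst value for the player who tried it.
        return math.inf if max_turn else -math.inf
    if stones == 0:
        # Misere rule (assumed): the previous player took the last stone
        # and lost, so the side to move has won.
        return math.inf if max_turn else -math.inf
    if depth == 0:
        return 0
    values = [minimax(apply_move(stones, m), depth - 1, not max_turn)
              for m in MOVES]
    return max(values) if max_turn else min(values)


def choose_move(stones, depth=4):
    # Main search: max() keeps the first move among equally valued ones,
    # so an illegal move can win a tie when every legal move is -inf.
    best = max(MOVES,
               key=lambda m: minimax(apply_move(stones, m), depth - 1, False))
    # Post-check outside the search: one legality test is cheap.
    if apply_move(stones, best) is None:
        legal = [m for m in MOVES if apply_move(stones, m) is not None]
        return legal[0] if legal else None
    return best
```

With one stone left, every option scores -inf and the raw search would pick the illegal move `3`; the post-check replaces it with the only legal move, so `choose_move(1)` returns `1`.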

【Comments】:

Thank you very much, that all makes sense! What I ended up doing is exactly what you described: returning -inf and +inf to the previous function call!