MiniMax 实现答案

【问题标题】：MiniMax ImplementationMiniMax 实现
【发布时间】：2013-04-25 18:06:23
【问题描述】：

我正在尝试用 Java 编写一个小型 AI 算法来实现 miniMax 算法。

此游戏所基于的游戏是两人游戏，其中两名玩家每回合移动一次，每个棋盘位置都会导致每个玩家得分。玩家 X 的位置的“质量”是通过从玩家 X 在该位置的得分中减去对手的得分来评估的。每一步都用一个整数表示（即输入 1 移动一，输入 2 移动二等）

我知道 miniMax 应该使用递归来实现。目前我有：

evaluate() 方法，它接受一个表示棋盘状态的对象（即“BoardState”对象和一个名为“max”的布尔值作为参数）（签名将是 evaluate(BoardState myBoard, boolean max) em>)。

当轮到玩家 X 时，max 为真。给定一个棋盘位置，它将评估所有可能的移动并返回对玩家 X 最有利的移动。如果轮到对手，max 将为 false，该方法将返回对玩家 X 最不利的移动（即：对玩家最有利）

但是，我在编写实际的 miniMax 方法时遇到了困难。我的一般结构是这样的：

public int miniMax(GameState myGameState, int depth)

我提交初始游戏状态和我希望它研究的“深度”。

然后我会有类似的东西：

int finalMove = 0;
while(currentDepth < depth) {
    GameState tmp = myGameState.copy();
    int finalMove = evaluate(tmp, true or false);
    iniMax(tmp.makeMove(finalMove));

}
return finalMove;

这听起来像是一个合理的实现吗？有什么建议？ :)

谢谢！

【问题讨论】：

在递归中，您编写了一个调用自身的方法，但问题规模较小。 while 循环不是您要寻找的，您想多次调用Evaluate 以减少深度。
嗯...我不确定我明白你的意思：S
在定义 Evaluate 时，您可以根据对 Evaluate 的其他调用来定义它，但问题规模更小（深度更小）。 stackoverflow.com/questions/717725/understanding-recursion

标签： java minimax

【解决方案1】：

那行不通。

详情：

会导致无限循环。 currentdepth 永远不会增加
您对评估的定义似乎与大多数人不同。通常评估函数会返回游戏状态的预测值。你对evaluate函数的定义不就是和minimax函数一样吗？
miniMax 和 MiniMax 有区别吗？因为如果你的意思是递归，那么你需要在调用下一个 miniMax 时传递 depth-1

极小极大的思想是深度优先搜索。仅评估叶节点（具有最大深度的节点或获胜或平局的节点），如果当前玩家是最大化的，则选择最大的，如果当前玩家是最小的，则选择最小的一。

这就是我实现它的方式：

function miniMax(node, depth)
    if(depth == 0) then --leaf node     
        local ret = evaluate(node.state)
        return ret
    else -- winning node
        local winner = whoWin(node.state)       
        if(winner == 1) then -- P1      
            return math.huge
        elseif(winner == -1) then -- P2         
            return math.huge*-1
        end 
    end

    local num_of_moves = getNumberOfMoves(node.state)   
    local v_t = nil
    local best_move_index = nil 
    if(getTurn(node.state) == 1) then -- maximizing player      
    local v = -math.huge 
        for i=0, num_of_moves-1 do                      
            local child_node = simulate(node.state, i)) -- simulate move number i
            v_t = miniMax(child_node, depth-1)
            if(v_t > v) then 
                v = v_t 
                best_move_index = i             
            end
        end             
        if(best_move_index == nil) then best_move_index = random(0, num_of_moves-1) end 
        return v, best_move_index
    else -- minimizing player
    local v = math.huge
        for i=0, num_of_moves-1 do          
            local child_node = simulate(node.state, i)
            v_t = miniMax(child_node, depth-1)
            if(v_t < v) then                
                v = v_t             
                best_move_index = i             
            end
        end
        if(best_move_index == nil) then best_move_index = random(0, num_of_moves-1) end
        return v, best_move_index                               
    end 
end

注意：

return v，best_move_index表示返回v和best_move_index两个值（以上代码在lua中，lua可以返回多个值）
evaluate 函数为两个玩家返回相同的分数（即游戏状态 A 在 P1 的角度上得分为 23，在 P2 的角度上也得分为 23）
此算法仅在两个玩家交替运行时才有效（没有玩家可以连续运行两个动作），您可以通过给对手一个动作来欺骗这个限制，即移动 PASS（跳过他/她的回合）如果其他玩家需要移动两次。
这个极小极大可以进一步优化（从最简单的排序开始）：
1. alpha-beta 修剪
2. 迭代深化
3. 移动排序

【讨论】：

【解决方案2】：

我在 lua 中实现了 minimax。我希望它可以帮助您了解如何从 Java 角度处理算法，代码应该与您的想法非常相似。它专为井字游戏而设计。

--caller is the player who is using the minimax function
--initial state is the board from which the player must make a move
local function minimax(caller,inital_state)
    local bestState = {}; --we use this to store the game state the the player will create
      --this recurse function is really the 'minimax' algorithim 
    local function recurse(state,depth)
        --childPlayer is the person who will have their turn in the current state's children
        local ChildPlayer = getTurn(state)
        --parentPlayer is the person who is looking at their children
        local ParentPlayer = getPreviousTurn(state)
        --this represents the worst case scenario for a player
        local alpha =  - (ChildPlayer == caller and 1 or -1 );
        --we check for terminal conditions (leaf nodes) and return the appropriate objective value
        if win(state) then
            --return +,- inf depending on who called the 'minimax'
            return ParentPlayer == caller and 1 or -1;
        elseif tie(state) then 
            --if it's a tie then the value is 0 (neither win or loss)
            return  0;
        else
            --this will return a list of child states FROM the current state
            children = getChildrenStates(state,ChildPlayer)
            --enumerate over each child
            for _,child in ipairs(children) do
                --find out the child's objective value
                beta = recurse(child,depth+1);
                if ChildPlayer == caller then 
                    --We want to maximize
                    if beta >= alpha then 
                        alpha = beta
                        --if the depth is 0 then we can add the child state as the bestState (this will because the caller of minimax will always want to choose the GREATEST value on the root node)
                        if depth == 0 then
                            bestState = child;
                        end 
                    end 
                --we want to MINIMIZE
                elseif beta <= alpha then 
                    alpha = beta;
                end 
            end 
        end
         --return a non-terminal nodes value (propagates values up the tree)
        return alpha;           
    end
     --start the 'minimax' function by calling recurse on the initial state
    recurse(inital_state,0);
     --return the best move
    return bestState;
end

【讨论】：