【Question title】: Implementing Alpha Beta into Minimax
【Posted】: 2013-09-10 10:27:13
【Question】:

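I'm trying to add alpha-beta pruning to my minimax, but I can't see where I'm going wrong.

At the moment I'm going through 5,000 iterations, whereas according to a friend I should be going through roughly 16,000. When choosing the first position it returns -1 (a loss), when it should definitely be able to return 0 (a draw), since a draw can always be forced from an empty board. When I step through my code it looks fine, so I can't see where the mistake is.

Strangely, if I swap the alpha and beta return values inside my checks (so that it does return 0), the computer will play for a draw but never initiates any winning moves; it only blocks.

My logical flow

If we are looking for alpha: if the score > alpha, change alpha. If alpha and beta overlap, return alpha.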

If we are looking for beta: if the score < beta, change beta. If alpha and beta overlap, return beta.

Here is my recursive call

int MinimaxAB(TGameBoard* GameBoard, int iPlayer, bool _bFindAlpha, int _iAlpha, int _iBeta) 
{

    //How is the position like for player (their turn) on iGameBoard?
    int iWinner = CheckForWin(GameBoard);
    bool bFull = CheckForFullBoard(GameBoard);

    //If the board is full or there is a winner on this board, return the winner
    if(iWinner != NONE || bFull == true) 
    {
        //Will return 1 or -1 depending on winner
        return iWinner*iPlayer;
    }

    //Initial invalid move (just follows i in for loop)
    int iMove = -1;
    //Set the score to be instantly beaten
    int iScore = INVALID_SCORE;

    for(int i = 0; i < 9; ++i)
    {
        //Check if the move is possible
        if(GameBoard->iBoard[i] == 0) 
        {
            //Put the move in
            GameBoard->iBoard[i] = iPlayer;

            //Recall function
            int iBestPositionSoFar = -MinimaxAB(GameBoard, Switch(iPlayer), !_bFindAlpha, _iAlpha, _iBeta);

            //Replace Alpha and Beta variables if they fit the conditions - stops checking for situations that will never happen
            if (_bFindAlpha == false)
            {
                if (iBestPositionSoFar < _iBeta)
                {
                    //If the beta is larger, make the beta smaller
                    _iBeta = iBestPositionSoFar;
                    iMove = i;

                    if (_iAlpha >= _iBeta)
                    {
                        GameBoard->iBoard[i] = EMPTY;

                        //If alpha and beta are overlapping, exit the loop
                        ++g_iIterations;
                        return _iBeta;

                    }
                }
            }
            else
            {
                if (iBestPositionSoFar > _iAlpha)
                {
                    //If the alpha is smaller, make the alpha bigger
                    _iAlpha = iBestPositionSoFar;
                    iMove = i;

                    if (_iAlpha >= _iBeta)
                    {
                        GameBoard->iBoard[i] = EMPTY;

                        //If alpha and beta are overlapping, exit the loop
                        ++g_iIterations;
                        return _iAlpha;
                    }
                }
            }

            //Remove the move you just placed
            GameBoard->iBoard[i] = EMPTY;
        }
    }


    ++g_iIterations;

    if (_bFindAlpha == true)
    {
        return _iAlpha;
    }
    else
    {
        return _iBeta;
    }
}

Initial call (when the computer should choose a position)

int iMove = -1; //Invalid
int iScore = INVALID_SCORE;

for(int i = 0; i < 9; ++i) 
{
    if(GameBoard->iBoard[i] == EMPTY) 
    {
        GameBoard->iBoard[i] = CROSS;
        int tempScore = -MinimaxAB(GameBoard, NAUGHT, true, -1000000, 1000000);
        GameBoard->iBoard[i] = EMPTY;

        //Choosing best value here
        if (tempScore > iScore)
        {
            iScore = tempScore;
            iMove = i;
        }
    }
}
//returns a score based on Minimax tree at a given node.
GameBoard->iBoard[iMove] = CROSS;

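Any help with my logical flow, so that the computer returns the correct result and makes intelligent moves, would be appreciated.

【Question discussion】:

    Tags: c++ minimax alpha-beta-pruning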


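    【Solution 1】:

    Does your algorithm work perfectly without alpha-beta pruning? Your initial call should be made with false for _bFindAlpha, because the root node behaves like an alpha node, although it doesn't look like this will make a difference: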

    int tempScore = -MinimaxAB(GameBoard, NAUGHT, false, -1000000, 1000000);
    

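    So I would suggest dropping the _bFindAlpha business and converting your algorithm to negamax. It behaves identically to minimax but makes your code shorter and clearer. Instead of checking whether to maximize alpha or minimize beta, you swap and negate the two bounds when recursing (this is the same reason you can already return the negated value of the function). Here is a slightly edited version of the Wikipedia pseudocode: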

    function negamax(node, α, β, player)
        if node is a terminal node
            return player * the heuristic value of node
        else
            foreach child of node
                val := -negamax(child, -β, -α, -player)
                if val ≥ β
                    return val
                if val > α
                    α := val
            return α
    

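    Unless you enjoy stepping through search trees, I think you will find it easier to write a clean, correct negamax version than to debug your current implementation.

    【Discussion】: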
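
As a rough illustration of the pseudocode above, here is a minimal C++ sketch of negamax with alpha-beta pruning for tic-tac-toe. The `Board` type, the names `Winner`, `Full`, `Negamax` and `BestMove`, the +1/-1 player encoding, and the (-2, 2) search window are all assumptions made for this example; they are not the asker's `TGameBoard` API and would need adapting to it.

```cpp
#include <algorithm>
#include <array>

// Hypothetical board: 9 cells, 0 = empty, +1 = cross, -1 = naught.
using Board = std::array<int, 9>;

// Returns +1 or -1 if that side has three in a row, otherwise 0.
int Winner(const Board& b)
{
    static const int kLines[8][3] = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},   // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},   // columns
        {0, 4, 8}, {2, 4, 6}};             // diagonals
    for (const auto& l : kLines)
        if (b[l[0]] != 0 && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]])
            return b[l[0]];
    return 0;
}

// True when no empty cell is left.
bool Full(const Board& b)
{
    return std::none_of(b.begin(), b.end(), [](int c) { return c == 0; });
}

// Negamax with alpha-beta. The returned score is always from the point of
// view of `player`, the side to move: +1 win, 0 draw, -1 loss.
int Negamax(Board& b, int player, int alpha, int beta)
{
    const int w = Winner(b);
    if (w != 0 || Full(b))
        return w * player;

    for (int i = 0; i < 9; ++i) {
        if (b[i] != 0) continue;
        b[i] = player;
        // Swap and negate the bounds for the opponent, then negate the result.
        const int val = -Negamax(b, -player, -beta, -alpha);
        b[i] = 0;                       // always undo the move before returning
        if (val >= beta) return val;    // cutoff: the opponent will avoid this line
        if (val > alpha) alpha = val;
    }
    return alpha;
}

// Root search: returns the index of the best move for `player`, or -1 if
// the board has no empty cell.
int BestMove(Board& b, int player)
{
    int bestMove = -1;
    int alpha = -2;                     // just outside the score range [-1, 1]
    const int beta = 2;
    for (int i = 0; i < 9; ++i) {
        if (b[i] != 0) continue;
        b[i] = player;
        const int val = -Negamax(b, -player, -beta, -alpha);
        b[i] = 0;
        if (val > alpha) { alpha = val; bestMove = i; }
    }
    return bestMove;
}
```

With this sketch, `Negamax(board, 1, -2, 2)` on an empty board evaluates to 0 (a draw), which is the value the question expects. Because the root loop in `BestMove` updates `alpha` itself, later root moves are searched with a tighter window, so no separate "find alpha / find beta" flag is needed anywhere.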
