Understanding constraints of Negamax

【Question title】: Understanding constraints of Negamax
【Posted】: 2012-11-26 06:27:57
【Question】:

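This code snippet is built to compute the bestMove for a position in a tic-tac-toe game. I understand almost every part of the code, except the condition in the for loop that says minRating != LOSING_POSITION. The code is an implementation of the following pseudocode.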

moveT FindBestMove(stateT state, int depth, int & rating) {
    for (each possible move or until you find a forced win) {
        Make the move.
        Evaluate the resulting position, adding one to the depth indicator.
        Keep track of the minimum rating so far, along with the corresponding move.
        Retract the move to restore the original state.
    }
    Store the move rating into the reference parameter.
    Return the best move.
}

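I cannot match the second condition of the for loop in the code below to the pseudocode, which says "until you find a forced win". I can't see how that phrase corresponds to minRating != LOSING_POSITION.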

moveT FindBestMove(stateT state, int depth, int & rating) {
    Vector<moveT> moveList;
    GenerateMoveList(state, moveList);
    int nMoves = moveList.size();
    if (nMoves == 0) Error("No moves available");
    moveT bestMove;

    int minRating = WINNING_POSITION + 1;

    for (int i = 0; i < nMoves && minRating != LOSING_POSITION; i++) {
        moveT move = moveList[i];
        MakeMove(state, move);
        int curRating = EvaluatePosition(state, depth + 1);

        if (curRating < minRating) {
            bestMove = move;
            minRating = curRating;
        }

        RetractMove(state, move);
    }
    rating = -minRating;
    return bestMove;
}


int EvaluatePosition(stateT state, int depth) {
    int rating;

    if (GameIsOver(state) || depth >= MAX_DEPTH) {
        return EvaluateStaticPosition(state);
    }

    FindBestMove(state, depth, rating);
    return rating;
}

【Question comments】:

Tags: c++ artificial-intelligence minimax

【Answer 1】:

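Your program starts by assigning WINNING_POSITION + 1 to minRating (WINNING_POSITION being a win for your opponent, I guess), so the starting value is worse than any rating a move can actually produce. It then loops through the moves, trying to find the one that does the most damage to the opponent, that is, the one that minimizes minRating.

When EvaluatePosition returns LOSING_POSITION, this move leads to your opponent losing whatever they do, so the search can be terminated and this move is taken as the best move. That is the pseudocode's "until you find a forced win": a lost position for the opponent is a forced win for you.

If no move produces LOSING_POSITION, your algorithm picks the "best" move according to the static evaluation.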
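To make the early exit concrete, here is a minimal sketch that keeps the question's two-function structure but swaps tic-tac-toe for a much smaller game, purely as an assumption for illustration: players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins. The state is just the stone count, the constant values are made up, and the depth limit and static evaluation are left out because this toy game can be searched to the end.

```cpp
// Hypothetical rating constants; only the names follow the question.
const int WINNING_POSITION = 1000;
const int LOSING_POSITION = -WINNING_POSITION;

int EvaluatePosition(int stones);

// Returns how many stones to take and stores, in `rating`,
// the value of the position for the player about to move.
int FindBestMove(int stones, int &rating) {
    int bestMove = 0;
    int minRating = WINNING_POSITION + 1;  // worse than any real rating

    // Second condition: stop as soon as one move leaves the
    // opponent in a lost position (a forced win for the mover).
    for (int take = 1;
         take <= 3 && take <= stones && minRating != LOSING_POSITION;
         take++) {
        // Rating of the resulting pile from the OPPONENT's point of view.
        int curRating = EvaluatePosition(stones - take);
        if (curRating < minRating) {
            bestMove = take;
            minRating = curRating;
        }
    }
    rating = -minRating;  // opponent's worst case is the mover's best case
    return bestMove;
}

// Rating of a pile for the player about to move.
int EvaluatePosition(int stones) {
    // No stones left: the previous player took the last one and won.
    if (stones == 0) return LOSING_POSITION;
    int rating;
    FindBestMove(stones, rating);
    return rating;
}
```

With a pile of 5, the first move tried (take 1) leaves 4 stones, which is lost for the opponent, so EvaluatePosition returns LOSING_POSITION, the loop condition fails, and the remaining moves are never examined. With a pile of 4, every move hands the opponent a won position, so the loop runs through all three moves and the stored rating is LOSING_POSITION.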

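【Comments】:

Just to restate and be clear on one point: in the snippet int curRating = EvaluatePosition(state, depth + 1); whom does the minimum value favor, the current player or the opponent? Can I assume it is good for the current player, since otherwise he would not want his opponent to lose? And the loop stops early only when the evaluation function returns LOSING_POSITION. Am I right about that? If you have anything to add, it would be helpful.
The loop stops when there are no more moves to try, or when it becomes clear that the opponent cannot win with any move he chooses. curRating is the rating as seen by the opponent, so the lower it is, the better for the current player; the same goes for minRating. However, just before FindBestMove returns (and EvaluatePosition passes the value on), the rating changes sign via rating = -minRating, because the opponent pushes his rating in the opposite direction, hence the name "NegaMax".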
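The sign change described in the last comment can also be read the other way round: minimizing the opponent's rating and then negating gives the same result as maximizing the negated ratings. The sketch below shows that single-function form. It is an illustration under assumptions, not the question's code: it uses a toy game (take 1 to 3 stones from a pile, taking the last stone wins) and a made-up constant WIN.

```cpp
#include <algorithm>

const int WIN = 1000;  // hypothetical rating for a won position

// Value of a pile for the player about to move.
int Negamax(int stones) {
    // No stones left: the previous player took the last one, so we lost.
    if (stones == 0) return -WIN;

    int best = -WIN - 1;  // below any real rating
    // Stop early once a forced win has been found (best == WIN).
    for (int take = 1; take <= 3 && take <= stones && best != WIN; take++) {
        // The child's value is from the opponent's view, so negate it.
        best = std::max(best, -Negamax(stones - take));
    }
    return best;
}
```

Here best != WIN plays the role of minRating != LOSING_POSITION in the question: a child rated -WIN for the opponent becomes +WIN for the mover after negation, and nothing can beat that.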