如何记住游戏树（可能无限的玫瑰树）的重复子树？答案

【问题标题】：How to memoize the repeated subtrees of a game tree (a potentially infinite rose tree)?如何记住游戏树（可能无限的玫瑰树）的重复子树？
【发布时间】：2019-07-22 20:32:56
【问题描述】：

我正在尝试在 Haskell 中实现 Negamax 算法。

为此，我代表的是游戏在玫瑰树中的未来可能性 (Data.Tree.Forest (depth, move, position))。但是，通常可以通过两种不同的移动顺序到达某些位置。重新评估重复位置（的子树）是一种浪费（并且很快变得非常缓慢）。

这是我目前尝试过的：

实现Tying the Knot 的变体以共享公共子结果。但是，我只能找到为（可能是无限的）列表打结的解释，而没有找到关于重用子树的解释。
我考虑过的另一种方法是在 State monad 内构建一棵树，其中要保留的状态将是 Map (depth, position) (Forest (depth, move, position)) 以执行显式记忆，但到目前为止我无法设置它也可以。

我认为这两种方法都可能存在一个问题，即游戏树只能以 corecursive 方式构建：我们不会从叶子构建树直到根，而是构建一个 (可能是无限的）树懒惰地从根向下。

编辑：给你一个我目前正在使用的代码示例（太慢了）：

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
module ZeroSumGame where

import qualified Control.Arrow
import Data.Tree

import Numeric.Natural (Natural)

(|>) :: a -> (a -> b) -> b
x |> f = f x
infixl 0 |>
{-# INLINE (|>) #-}

class Ord s => Game s where
  data Move s
  initial :: s -- | Beginning of the game
  applyMove :: Natural -> s -> Move s -> s -- | Moving from one game state to the next
  possibleMoves :: Natural -> s -> [Move s] -- | Lists moves the current player is able to do.
  isGameOver :: s -> Bool -- | True if the game has ended. TODO: Maybe write default implementation using `possibleMoves state == []`?
  scorePosition :: Natural -> Move s -> s -> Int -- | Turns a position in an integer, for the Negamax algorithm to decide which position is the best.

type Trimove state = (Natural, Move state, state) -- | Depth since start of game, move to next position, new position

gameforest :: Game s => Natural -> s -> Forest (Trimove s)
gameforest start_depth start_state = unfoldForest buildNode (nextpositions start_depth start_state)
  where
    buildNode (depth, move, current_state) =
      if
        isGameOver current_state
      then
        ((depth, move, current_state), [])
      else
        ((depth, move, current_state), nextpositions depth current_state)
    nextpositions depth current_state =
      current_state
      |> possibleMoves depth
      |> fmap (\move -> (succ depth, move, applyMove depth current_state move))

scoreTree :: Game s => Ord (Move s) => Natural -> Tree (Trimove s) -> (Move s, Int)
scoreTree depth node =
  case (depth, subForest node) of
    (0, _) ->
      node |> rootLabel |> uncurry3dropFirst scorePosition
    (_, []) ->
      node |> rootLabel |> uncurry3dropFirst scorePosition
    (_, children) ->
      children
      |> scoreForest (pred depth)
      |> map (Control.Arrow.second negate)
      |> maximum

uncurry3dropFirst :: (a -> b -> c -> d) -> (a, b, c) -> (b, d)
uncurry3dropFirst fun (a, b, c) = (b, fun a b c)

scoreForest :: Game s => Ord (Move s) => Natural -> Forest (Trimove s) -> [(Move s, Int)]
scoreForest depth forest =
  forest
  |> fmap (scoreTree depth)

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
module TicTacToe where

import qualified Control.Monad.State
import Control.Monad.State (State)
import qualified Data.Map
import Data.Map (Map)
import qualified Control.Arrow
import Data.Tree

import Data.Array (Array)
import qualified Data.Array
import qualified Data.Maybe
import qualified Data.Foldable

import Numeric.Natural (Natural)


import ZeroSumGame

data CurrentPlayer = First | Second
  deriving (Eq, Ord, Show)


instance Enum CurrentPlayer where
  fromEnum First = 1
  fromEnum Second = -1
  toEnum 1 = First
  toEnum (-1) = Second
  toEnum _ = error "Improper player"

newtype TicTacToe = TicTacToe (Array (Int, Int) (Maybe CurrentPlayer))
  deriving (Eq, Ord)

instance Game TicTacToe where
  data Move TicTacToe = TicTacToeMove (Int, Int)
    deriving (Eq, Ord, Show, Bounded)

  initial = TicTacToe initialTicTacToeBoard

  possibleMoves _depth = possibleTicTacToeMoves

  applyMove depth (TicTacToe board) (TicTacToeMove (x, y)) =
    TicTacToe newboard
    where
      newboard = board Data.Array.// [((x, y), Just player)]
      player = case depth `mod` 2 of
        0 -> First
        _ -> Second

  isGameOver state = Data.Maybe.isJust (findFilledLines state)

  scorePosition _ _ state =
          state
          |> findFilledLines
          |> fmap fromEnum
          |> Data.Maybe.fromMaybe 0
          |> (* (-10000))



findFilledLines :: TicTacToe -> Maybe CurrentPlayer
findFilledLines (TicTacToe board) =
  (rows ++ columns ++ diagonals)
  |> map winner
  |> Data.Foldable.asum
  where
    rows = vals rows_indexes
    columns = vals columns_indexes
    diagonals = vals diagonals_indexes
    rows_indexes = [[(i, j) | i <- [0..2]]| j <- [0..2]]
    columns_indexes = [[(i, j) | j <- [0..2]]| i <- [0..2]]
    diagonals_indexes = [[(i, i) ]| i <- [0..2]] ++ [[(i, 2 - i) ]| i <- [0..2]]
    vals = map (map (\index -> board Data.Array.! index))

winner :: Eq a => [Maybe a] -> Maybe a
winner [x,y,z] =
  if x == y && x == z then x else Nothing
winner _ = Nothing


initialTicTacToeBoard :: (Array (Int, Int) (Maybe CurrentPlayer))
initialTicTacToeBoard =
  Data.Array.array ((0, 0), (2, 2)) [((i, j), Nothing) | i <- [0..2], j <- [0..2]]

possibleTicTacToeMoves :: TicTacToe -> [Move TicTacToe]
possibleTicTacToeMoves (TicTacToe board) = foldr checkSquareForMove [] (Data.Array.assocs board)
    where
      checkSquareForMove (index, val) acc = case val of
        Nothing -> TicTacToeMove index : acc
        Just _ -> acc

printBoard :: TicTacToe -> String
printBoard (TicTacToe board) =
  unlines [unwords [showTile (board Data.Array.! (y, x)) | x <- [0..2]] |  y <- [0..2]]
  where
    showTile loc =
      case loc of
        Nothing -> " "
        Just Second -> "X"
        Just First -> "O"

（TypeFamilies 用于允许每个Game 实现有自己的Move 概念，然后需要FlexibleContexts 来强制Move s 实现Ord。

【问题讨论】：

这是一个非常酷的问题！可惜我帮不了你...

标签： haskell memoization state-monad multiway-tree negamax

【解决方案1】：

问题重新表述

如果我正确理解了这个问题，那么您有一个函数可以返回游戏中可能的下一步动作，以及采取该动作的函数：

start :: Position
moves :: Position -> [Move]
act :: Position -> Move -> Position

以及您希望如何构建无限状态树（为简单起见，请允许我忽略 Depth 字段。如果您将深度计数器视为 Position 类型的一部分，您会发现没有通用性丢在这里）：

states :: Forest (Position, Move)
states = forest start

forest :: Position -> Forest (Position, Move)
forest p = [ Node (m, p') (states p') | m <- moves p, let p' = act p m ]

但您希望以共享forest 的相同子树的方式实现这一目标。

迈向记忆

这里的一般技术是我们要记忆forest：这样，对于相同的Positions，我们得到共享子树。所以配方是：

forest :: Position -> Forest (Position, Move)
forest = memo forest'

forest' :: Position -> Forest (Position, Move)
forest' p = [ Node (m, p') (states p') | m <- moves p, let p' = act p m ]

我们需要一个合适的备忘录功能：

memo :: (Position -> a) -> (Position -> a)

此时，我们需要了解更多关于 Position 的信息，以便了解如何使用等效的“惰性列表”技巧来实现它……但您会发现您确实不需要记忆涉及玫瑰树的函数。

【讨论】：

有趣的答案！深度主要与跟踪轮到哪个玩家有关，但你是对的，没有失去一般性。您提到的memo 之类的功能如何能够执行记忆？一些幕后unsfafePerform诡计？
@Qqwy，其实不需要hack，很漂亮。请参阅有关 memoization 的 haskellwiki 页面。有几个库可以提供帮助——data-memocombinators 和 MemoTrie ...取决于你的审美
@Qqwy，Position的类型是什么？然后我们可以看一下。或者查看现有的备忘录组合器，就像 @luqui 建议的那样。
@JoachimBreitner 在例如的情况下井字游戏，位置将输入为newtype TicTacToe = TicTacToe (Array (Int, Int) (Maybe CurrentPlayer)) deriving (Eq, Ord)。然而，当我将它构建为一个类型类时，position 是游戏之间不同的两种类型之一。也许共享类型类的接口更有帮助。（由于代码太长，无法评论，请参阅编辑后的问题）

【解决方案2】：

我会尝试通过基于一些“规范”移动序列来标准化棋盘位置来达到该位置。然后为每个孩子分配一个值，即在树中遍历其各自的标准化序列。（没有代码，因为我在手机上，这是一项艰巨的任务。）

这种方法的效果取决于在您正在玩的游戏中计算标准化移动序列的难易程度。但这是一种通过打结来引入共享的方法，利用对博弈树根的共享引用。也许它会成为适合您特定情况的其他想法的灵感。

【讨论】：

我打算将 negamax 算法创建为可以用于任何零和游戏（如国际象棋、跳棋、井字游戏、四连线、黑白棋、将棋、象棋等）的类型类。我不知道是否有可能为这些游戏的每一个位置定义一个“规范序列”。
国际象棋引擎通常用来对棋盘位置进行哈希处理的方法称为Zobrist hashing。也许类似的东西适合你的游戏。您可以定义一个扩展 negamax 类的类型类，为合适的游戏添加哈希函数。您可以查看astar package 的参数化泛型 A* 搜索那里的灵感。