The Minimax algorithm is at the core of many game-playing AIs, where it decides on the best move. Given a game state in a two-player, combinatorial game such as Checkers, Chess or Othello, the algorithm finds the best move for the AI.
In this post, I assume that the reader is familiar with the algorithm and with how much code it tends to take, since it is frequently implemented as three distinct functions. I will begin by briefly describing a standard implementation of Minimax and then I will introduce a concise implementation using higher-order functions. Note that we will use Python and Haskell as pseudo-code.
Two-Player, Combinatorial Game Representation
We will begin with a simple, abstract description of two-player, combinatorial games. Games in this category include Checkers, Chess and Othello. However, we will focus on the abstraction of these games by representing their game state with the minimal interface necessary for the Minimax algorithm to be applied to it. That is, the game should
- Provide the set of available moves given a game state,
- Provide the next state of the game given a current state and a move,
- And determine whether the game is over.
Although it may be more coherent to use Python pseudo-code, Python has no dedicated syntax for interfaces; consequently, we will outline the interface using Haskell's syntax for a typeclass.
    class Game a where
        get_available_moves :: a -> [Move]
        next_state          :: a -> Move -> a
        is_gameover         :: a -> Bool
Now that we know the methods associated with a game state, we may access them as standard method calls in Python, for example game_state.get_available_moves().
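To make the interface concrete, here is a minimal sketch of a game state that provides the three methods. The game itself is an assumption chosen for brevity and does not appear elsewhere in this post: a single pile of stones from which players alternately take one or two, where taking the last stone wins. The names NimState, stones and maximizer_to_move are hypothetical.

```python
from dataclasses import dataclass
from typing import List

Move = int  # hypothetical: a move is the number of stones removed


@dataclass(frozen=True)
class NimState:
    """Toy game state: take 1 or 2 stones per turn; taking the last stone wins."""
    stones: int
    maximizer_to_move: bool = True

    def get_available_moves(self) -> List[Move]:
        # A player may take one or two stones, but never more than remain.
        return [n for n in (1, 2) if n <= self.stones]

    def next_state(self, move: Move) -> "NimState":
        # Return a new state instead of mutating, so no explicit cloning is needed.
        return NimState(self.stones - move, not self.maximizer_to_move)

    def is_gameover(self) -> bool:
        return self.stones == 0


state = NimState(stones=3)
moves = state.get_available_moves()
after = state.next_state(moves[0])
```

Because next_state returns a fresh object, the search can explore moves without ever disturbing the state it started from.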
Standard Implementation of Minimax
The standard implementation of the Minimax algorithm frequently includes three functions: minimax(game_state), min_play(game_state) and max_play(game_state). We will begin with the minimax(game_state) declaration. Note that I use Python here as working pseudo-code.
    def minimax(game_state):
        moves = game_state.get_available_moves()
        best_move = moves[0]  # fallback if no move scores above -inf
        best_score = float('-inf')
        for move in moves:
            clone = game_state.next_state(move)
            score = min_play(clone)
            if score > best_score:
                best_move = move
                best_score = score
        return best_move
To summarize, Minimax is given a game state, obtains a set of valid moves from the game state, simulates all valid moves on clones of the game state, evaluates each game state which follows a valid move and finally returns the best move.
The following two helper functions simulate play by the opposing player and the current player through the min_play and max_play procedures respectively. With the aid of these two helper functions, the entire game tree is traversed recursively given the current state of the game.
    def min_play(game_state):
        if game_state.is_gameover():
            return evaluate(game_state)
        moves = game_state.get_available_moves()
        best_score = float('inf')
        for move in moves:
            clone = game_state.next_state(move)
            score = max_play(clone)
            if score < best_score:
                best_move = move
                best_score = score
        return best_score

    def max_play(game_state):
        if game_state.is_gameover():
            return evaluate(game_state)
        moves = game_state.get_available_moves()
        best_score = float('-inf')
        for move in moves:
            clone = game_state.next_state(move)
            score = min_play(clone)
            if score > best_score:
                best_move = move
                best_score = score
        return best_score
In particular, the opponent intends to minimize the current player's score and the current player intends to maximize their own score. Note that the helper functions short-circuit and return early if the game is over.
Notice that the scores are calculated through the evaluate(game_state) procedure. The implementation is omitted because it is dependent on the game itself; however, by convention, we say that the current player wins if the score is INF and loses if the score is -INF.
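As an illustration of that convention only, here is a sketch of evaluate for a hypothetical toy game in which players alternately take stones from a pile and whoever takes the last stone wins. The State tuple, its field names and the neutral score of 0 for unfinished positions are all assumptions made for this example.

```python
from collections import namedtuple

INF = float('inf')

# Hypothetical toy state: stones left in the pile, and whether the current
# (maximizing) player is the one to move.
State = namedtuple("State", ["stones", "maximizer_to_move"])


def evaluate(game_state):
    """Score a position from the current (maximizing) player's point of view."""
    if game_state.stones > 0:
        return 0  # assumption: unfinished positions are scored as neutral
    # The pile is empty, so the player who is NOT to move took the last stone.
    return -INF if game_state.maximizer_to_move else INF
```

A real game would usually replace the neutral 0 with a heuristic, since searching the whole tree down to finished games is rarely affordable.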
There are 35 lines (sans the blank newlines) in the current implementation of our algorithm. We will now reduce the number of lines by a factor of two using higher-order functions.
Concise Implementation of Minimax
Intuitively, Minimax finds the maximum of a set of scores for the current player and the minimum of a set of scores for the opposing player. Hence, it is natural to reach for the built-in max() and min() procedures, which function exactly as we need them to.
Let us begin by modifying the min_play procedure. First, the opposing player must check if the game is over and evaluate the game state if necessary.
    def min_play(game_state):
        if game_state.is_gameover():
            return evaluate(game_state)
Second, the opposing player wants to return the minimum score of all of the game states following valid moves.
    def min_play(game_state):
        if game_state.is_gameover():
            return evaluate(game_state)
        return min(scores)  # Incomplete
We know how to obtain the set of valid moves and we know how to obtain the next game state given a valid move; however, we want to return the set of scores associated with the game states which follow valid moves. Consequently, we must map() all of the game states which follow valid moves to a set of evaluations (or scores) that can be minimized.
    def min_play(game_state):
        if game_state.is_gameover():
            return evaluate(game_state)
        return min(
            map(lambda move: max_play(game_state.next_state(move)),
                game_state.get_available_moves()))
This procedure is now complete; however, I will briefly review a few key points. We begin with a lambda which takes a move and returns the evaluation of the game state that follows it. Furthermore, we map() the set of game states which follow from valid moves to a set of evaluations that can be minimized. The map(fn, list) function applies a function fn, with a domain of type $A$ and a codomain of type $B$, to each element of a list of elements of type $A$:
$$fn : A \to B$$
In our example, we map the list of available moves, each standing for the game state it leads to, to a list of evaluations.
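The mapping step can be tried in isolation. The sketch below uses a hypothetical score_after function as a stand-in for max_play(game_state.next_state(move)); its scores are made up purely to show the shape of the computation.

```python
def score_after(move):
    # Hypothetical stand-in for max_play(game_state.next_state(move)).
    made_up_scores = {"a": 3, "b": -1, "c": 7}
    return made_up_scores[move]


moves = ["a", "b", "c"]

# map() pairs each element of type A (a move) with a value of type B (a score).
scores = list(map(lambda move: score_after(move), moves))

# min() accepts the lazy map object directly, so the list() call above is
# only needed when the scores should be inspected or reused.
worst_for_us = min(map(score_after, moves))
```

In Python 3, map() returns a lazy iterator rather than a list, which is why min() and max() can consume it without building an intermediate list.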
The max_play() procedure is defined symmetrically, except that it uses the max() function for maximization. The remaining difficulty lies in the minimax() procedure, which has the additional requirement of returning a valid move rather than a score alone.
To account for the additional requirement in the minimax() procedure, we modify the lambda of map() and the key of max() accordingly. That is, we begin by defining the lambda to take a move and return a tuple containing (move, score). Furthermore, now that we have a tuple, we must decide which element to maximize over. We specify this with the key keyword argument of max(), selecting the score, the second element of each tuple. The result is the (move, score) tuple with the highest score, whose first element is the move to play. The final procedure is defined below:
    def minimax(game_state):
        return max(
            map(lambda move: (move, min_play(game_state.next_state(move))),
                game_state.get_available_moves()),
            key = lambda x: x[1])
Now that all of the procedures have been redefined, here is the final code:
    def minimax(game_state):
        return max(
            map(lambda move: (move, min_play(game_state.next_state(move))),
                game_state.get_available_moves()),
            key = lambda x: x[1])

    def min_play(game_state):
        if game_state.is_gameover():
            return evaluate(game_state)
        return min(
            map(lambda move: max_play(game_state.next_state(move)),
                game_state.get_available_moves()))

    def max_play(game_state):
        if game_state.is_gameover():
            return evaluate(game_state)
        return max(
            map(lambda move: min_play(game_state.next_state(move)),
                game_state.get_available_moves()))
The total line count (sans the blank newlines) is 17. Without spreading the higher-order function calls across several lines, there are 10 lines. By either count, the number of lines of code has been reduced at least two-fold. For a working algorithm on an implementation of a game, see my Hexapawn GitHub repository.
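For a quick local experiment, the three procedures can also be run against a toy game. Everything about the game below is an assumption made for illustration (a pile of stones, take one or two per turn, taking the last stone wins), and the Pile class is hypothetical. One deliberate difference from the listing above: this minimax() appends [0] so that it returns only the move, as the standard version does, rather than the (move, score) tuple.

```python
INF = float("inf")


class Pile:
    """Hypothetical toy game: take 1 or 2 stones; taking the last stone wins."""

    def __init__(self, stones, maximizer_to_move=True):
        self.stones = stones
        self.maximizer_to_move = maximizer_to_move

    def get_available_moves(self):
        return [n for n in (1, 2) if n <= self.stones]

    def next_state(self, move):
        return Pile(self.stones - move, not self.maximizer_to_move)

    def is_gameover(self):
        return self.stones == 0


def evaluate(game_state):
    # Only called on finished games here: the player who is NOT to move
    # took the last stone and therefore won.
    return -INF if game_state.maximizer_to_move else INF


def minimax(game_state):
    # [0] unpacks the move from the winning (move, score) tuple.
    return max(
        map(lambda move: (move, min_play(game_state.next_state(move))),
            game_state.get_available_moves()),
        key=lambda x: x[1])[0]


def min_play(game_state):
    if game_state.is_gameover():
        return evaluate(game_state)
    return min(
        map(lambda move: max_play(game_state.next_state(move)),
            game_state.get_available_moves()))


def max_play(game_state):
    if game_state.is_gameover():
        return evaluate(game_state)
    return max(
        map(lambda move: min_play(game_state.next_state(move)),
            game_state.get_available_moves()))


best_from_four = minimax(Pile(4))
best_from_five = minimax(Pile(5))
```

In this toy game a pile of three is lost for whoever must move, so the search takes one stone from a pile of four and two stones from a pile of five.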
It is easy to see that this implementation with higher-order functions is concise while maintaining readability of code (unlike our Perl friends in Code Golf). In general, several algorithms fit the problem that higher-order functions solve: composition of operations on larger data sets.
Focusing on the atomic components that an algorithm operates on, such as the elements of a set, will not reduce the asymptotic lower bound on the amount of code written. Higher-level abstractions over such atomic elements are necessary to reduce that lower bound. Sorting offers an analogy: a sort built only from comparisons between two elements can never beat the $O(n \log n)$ worst case (more precisely, it needs $\Omega(n \log n)$ comparisons), whereas non-comparison sorts such as counting sort and radix sort, and bucket sort on suitably distributed input, run in linear time for keys of bounded size by using operations other than comparison and swap.
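To illustrate the sorting analogy, here is a minimal sketch of counting sort, one of the simplest non-comparison sorts. It assumes that the input consists of non-negative integers no larger than a known max_value; both the function name and that parameter are choices made for this example.

```python
def counting_sort(values, max_value):
    """Sort non-negative integers <= max_value without comparing elements."""
    counts = [0] * (max_value + 1)
    for value in values:
        counts[value] += 1  # index by the key itself instead of comparing
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)
    return result


sorted_values = counting_sort([3, 1, 2, 1, 0], max_value=3)
```

The running time is proportional to the number of elements plus max_value, so the gain only holds when the key range is small relative to the input.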
This particular implementation of Minimax could be reduced further into a single line of code using a fact that the Negamax variant of Minimax highlights for us:
$$\max(a, b) = -\min(-a, -b)$$
I leave this as an exercise for the ambitious.
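For readers who would like something to compare their attempt against, here is one possible shape of the Negamax idea, sketched on a hypothetical toy game (a pile of stones, take one or two per turn, taking the last stone wins). It relies on an assumption that differs from the convention used earlier in this post: evaluate() scores a position for the player to move rather than for a fixed current player. The names Pile, negamax and best_move are illustrative.

```python
INF = float("inf")


class Pile:
    """Hypothetical toy game: take 1 or 2 stones; taking the last stone wins."""

    def __init__(self, stones):
        self.stones = stones

    def get_available_moves(self):
        return [n for n in (1, 2) if n <= self.stones]

    def next_state(self, move):
        return Pile(self.stones - move)

    def is_gameover(self):
        return self.stones == 0


def evaluate(game_state):
    # Assumption: scored for the player to move. In a finished game that
    # player has lost, because the opponent took the last stone.
    return -INF


def negamax(game_state):
    # One function plays both roles: my best score is the maximum of the
    # negated scores my opponent can reach, since max(a, b) = -min(-a, -b).
    if game_state.is_gameover():
        return evaluate(game_state)
    return max(-negamax(game_state.next_state(move))
               for move in game_state.get_available_moves())


def best_move(game_state):
    return max(game_state.get_available_moves(),
               key=lambda move: -negamax(game_state.next_state(move)))
```

The body of negamax() could be folded into a single conditional expression, at some cost to the readability argued for above.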