Games as search

“max” and “min” two players. Max moves frist in the game. The game take turns until it’s over. Max uses search tree to determine “best” next move.

Definitions

  • initial state: set-up defined by rules
  • player(s): which player has the move in state s
  • actions(s): set of legal moves in state
  • results(s, a): transition model defines result of a move
  • terminal_test(s): true if the game is finished; false otherwise
  • utility(s, p): the numerical value of terminal state s for player p
    • win: 1, lose :-1, draw: 0

Procedure

  1. generate the whole game tree to leaves
  2. apply utility function to leaves
  3. back-up values from leaves toward the root:
    1. Max node computes the max of its child values
    2. Min node computes the min of its chidl values
  4. at root: choose move leading to the child of highest value

e.g.

minmax example

pesudo code

miniMaxSearch(state)
    return argmax([minValue(apply(state, a)) for each action a])


maxValue(state)
    if (terminal(state))
        return utility(state)
	v = -infty
	for each action a:
		v = max(v, minValue(apply(state, a)))
    return v


minValue(state)
    if (terminal(state))
        return utility(state)
	v = infty
	for each action a:
		v = min(v, maxValue(apply(state, a)))
    return v

Alpha-Beta Pruning in Minimax

If a position is provably bad, it’s no use to search the rest. If the adversary can force a bad position, it’s no use to search the rest because the adversary won’t let me achieve.

Begin at root (MAX), set: alpha = -infty for Max beta = infty for Min

for each child node: inherit alpha and beta, check its children: for each child: assign its value to beta, if beta < alpha, prune it: terminate this child’s search update parent’s alpha/beta: alpha = child’s beta, beta = child’s alpha update root’s alpha beta

Why?

Max only chooses the maximum number, Min only chooses the minimum number. Therefore, when beta(min) < alpha(max), Min will choose the beta’s value. There’s no use to search the rest children because no matter beta can be smaller or bigger, the situation won’t be better. Min only chooses smaller number. We already know beta is small.

pesudo code

abSearch(state):
	alpha, beta, a = -infty, +infty, None
	for each action a:
		alpha, a = max(alpha, a), (minValue(apply(state, a), alpha, beta), a)
    return a
    
maxValue(state, al, be)
	if (cutoff(state)) return eval(state);
	for each action a:
		al = max( al, minValue( apply(state, a), al, be ))
		if (al  be) return +infty
	return al
	
minValue(state, al, be)
	if (cutoff(state)) return eval(state);
	for each action a:
		be = min( be, maxValue( apply(state, a), al, be))
        if (al  be ) return -infty
	return be