Evaluating a Checker Board with a Feed Forward Neural Network
With two-player zero-sum games such as checkers and chess, it is common to build computer players based on the idea of searching the game tree to select moves using some variant of the minimax search algorithm and an evaluation function. The evaluation function takes a game board as input and returns a numeric score as output, where the score is highest when the board is a win for the player, and the score is lowest when the board is a loss for the player. For checkers, an extremely simple example of an evaluation function just adds up the number of pieces the player has, and subtracts the number of pieces the opponent has.
When using a neural network to evaluate a checkers board, we need some representation of the checkers board and the pieces on it. Fogel and Chellapilla chose to map the board to an array of 32 floating point numbers, where each entry had one of five values:
K was an evolvable value, initialized to 2.0.
In addition to the above, in their second and third experiments, they computed a value directly from the inputs (i.e. without weights): they summed the 32 floating point inputs, to come up with a value that reflects the piece differential between the two players.
We can imagine other mappings. For example, we might have several binary inputs per square, such as:
Fogel and Chellapilla chose a search depth of 4 ply (2 moves by each player), but extended the depth whenever a move by a player was "forced". Unfortunately I found their description of how this was handled a bit vague:
My best guess is that whenever the player, and not the opponent, had only jumps available (whether they had a choice of jumps or not), the search depth on that branch of the game tree was increased by 2 ply.
UPDATE 2010-12-09: Dr. Fogel was kind enough to respond to my questions about forced jumps, and he recalls that this meant the player whose choice it was had only a single move, and thus had no real choice. This applied for either player in the game tree, and it didn't matter whether the move was a simple move or was a jump.
When using a neural network to evaluate a checkers board, we need some representation of the checkers board and the pieces on it. Fogel and Chellapilla chose to map the board to an array of 32 floating point numbers, where each entry had one of five values:
- +K Player's King
- +1 Player's Man
- 0 Empty
- -1 Opponent's Man
- -K Opponent's King
K was an evolvable value, initialized to 2.0.
In addition to the above, in their second and third experiments, they computed a value directly from the inputs (i.e. without weights): they summed the 32 floating point inputs, to come up with a value that reflects the piece differential between the two players.
We can imagine other mappings. For example, we might have several binary inputs per square, such as:
- 3 bits: black, white, king; at most one of black or white would be set, and king is only set if one of the other two is set.
- 4 bits: empty, black, white, king; at most one of empty, black or white would be set, and king is only set if black or white is set.
- 4 bits: black king, black, white, white king; at most one would be set at a time.
- 5 bits: black king, black, empty, white, white king; exactly one would be set at a time.
Fogel and Chellapilla chose a search depth of 4 ply (2 moves by each player), but extended the depth whenever a move by a player was "forced". Unfortunately I found their description of how this was handled a bit vague:
In addition, when forced moves were involved, the search depth was extended (let f be the number of forced moves) because in these situations the player has no real decision to make. The ply depth was extended by steps of two, up to the smallest even number that was greater than or equal to the number of forced moves f that occurred along that branch. If the extended ply search produced more forced moves then the ply was once again increased in a similar fashion. Furthermore, if the final board position was left in an "active" state, where the player has a forced jump, the depth was once again incremented by two ply. Maintaining an even depth along each branch of the search tree ensured that the boards were evaluated after the opponent had an opportunity to respond to the player's move.It wasn't clear to me whether this test is performed only on the moves of the player, or also on the moves of the opponent. Furthermore, it wasn't clear what the definition of "forced move" was. Did it mean that the player to move had only jumps to choose among (i.e. possibly more than one); or did it mean that the player to move had only a single move that was possible, whether it was a simple move or a jump; or did it mean that the player to move had only a single jump available.
My best guess is that whenever the player, and not the opponent, had only jumps available (whether they had a choice of jumps or not), the search depth on that branch of the game tree was increased by 2 ply.
UPDATE 2010-12-09: Dr. Fogel was kind enough to respond to my questions about forced jumps, and he recalls that this meant the player whose choice it was had only a single move, and thus had no real choice. This applied for either player in the game tree, and it didn't matter whether the move was a simple move or was a jump.
Comments
Post a Comment