Using counterfactual regret minimization to create a competitive multiplayer poker agent

  • Author / Creator
    Abou Risk, Nicholas
  • Games have been used to evaluate and advance techniques in the field of Artificial Intelligence since before computers were invented. Many of these games have been deterministic perfect information games (e.g. Chess and Checkers). A deterministic game has no chance element and in a perfect information game, all information is visible to all players. However, many real-world scenarios involving competing agents can be more accurately modeled as stochastic (non-deterministic), im- perfect information games, and this dissertation investigates such games. Poker is one such game played by millions of people around the world; it will be used as the testbed of the research presented in this dissertation. For a specific set of games, two-player zero-sum perfect recall games, a recent technique called Counterfactual Regret Minimization (CFR) computes strategies that are provably convergent to an ϵ-Nash equilibrium. A Nash equilibrium strategy is very useful in two-player games as it maximizes its utility against a worst-case opponent. However, once we move to multiplayer games, we lose all theoretical guarantees for CFR. Furthermore, we have no theoretical guarantees about the performance of a strategy from a multiplayer Nash equilibrium against two arbitrary op- ponents. Despite the lack of theoretical guarantees, my thesis is that CFR-generated agents may perform well in multiplayer games. I created several 3-player limit Texas Hold’em Poker agents and the results of the 2009 Computer Poker Competition demonstrate that these are the strongest 3-player computer Poker agents in the world. I also contend that a good strategy can be obtained by grafting a set of two-player subgame strategies to a 3-player base strategy when one of the players is eliminated.

  • Subjects / Keywords
  • Graduation date
    Fall 2009
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.