January 9, 2017 – In the latest artificial intelligence (AI) gaming foray, a system developed at Carnegie Mellon University (CMU) will test its acumen against a team of four of the world’s best professional poker players. This is CMU’s second attempt to beat top human players: in May 2015 its entry, Claudico, lost a No-Limit Hold’em contest. Learning from that defeat, CMU has built a new AI called Libratus and challenged four of poker’s best human players to a rematch. The event, called “Brains vs. Artificial Intelligence: Upping the Ante,” will begin at Pittsburgh’s Rivers Casino on January 11.
When Claudico lost in 2015, also at the Rivers Casino, the contest lasted 14 days and involved 80,000 hands and $170 million U.S. on the table. In the end the humans came out ahead by a little over $700,000, roughly 0.4 percent of that amount and almost a statistical tie. So CMU went back to the drawing board, and nearly two years later it has its rematch, beginning January 11 and scheduled to last 20 days and 120,000 hands. The winners will receive a check for $200,000.
AI has been on a run when competing against humans at games: first chess, then Jeopardy!, and most recently Go. Trivia and board games, however, are different from poker, which involves the element of the unknown and human emotions. How will Libratus know when an opponent is bluffing?
Libratus takes its name from a Latin word that means balanced as well as powerful, and its algorithms are considerably more advanced than Claudico’s. Where Claudico’s algorithms were derived from 3 million hours of computation, Libratus’ have been derived from 15 million.
Tuomas Sandholm, professor at Carnegie Mellon, describes the challenge Libratus faces.
“It requires a machine to make extremely complicated decisions based on incomplete information.”
Libratus bases its poker strategy on the concept of Nash equilibrium. Named after the American mathematician and game theorist John Forbes Nash Jr., the concept is described by Investopedia as follows:
“The optimal outcome of a game is one where no player has an incentive to deviate from his chosen strategy after considering an opponent’s choice. Overall, an individual can receive no incremental benefit from changing actions, assuming other players remain constant in their strategies.”
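To make the definition concrete, here is a minimal Python sketch that checks whether a pair of strategies is a Nash equilibrium in a toy two-player game. The game (matching pennies), the payoff table, and the function names are assumptions chosen for illustration; none of them come from CMU, and the sketch says nothing about how Libratus itself computes its strategy.

```python
from fractions import Fraction

# Hypothetical toy game (matching pennies), not poker.
# Entry [i][j] is the payoff to player A when A picks i and B picks j.
# The game is zero-sum, so player B receives the negative of this value.
PAYOFF_TO_A = [[1, -1],
               [-1, 1]]

HALF = Fraction(1, 2)
PURE_STRATEGIES = [[1, 0], [0, 1]]


def expected_payoff(mix_a, mix_b):
    """Expected payoff to player A when both players randomize."""
    return sum(mix_a[i] * mix_b[j] * PAYOFF_TO_A[i][j]
               for i in range(2) for j in range(2))


def is_nash_equilibrium(mix_a, mix_b):
    """True if neither player gains by changing strategy alone.

    Checking pure-strategy deviations is enough, because a player's
    expected payoff is linear in that player's own mixing probabilities.
    """
    current = expected_payoff(mix_a, mix_b)
    # A wants a higher payoff; B wants A's payoff to be lower.
    a_can_improve = any(expected_payoff(p, mix_b) > current
                        for p in PURE_STRATEGIES)
    b_can_improve = any(expected_payoff(mix_a, p) < current
                        for p in PURE_STRATEGIES)
    return not (a_can_improve or b_can_improve)


print(is_nash_equilibrium([HALF, HALF], [HALF, HALF]))  # 50/50 for both
print(is_nash_equilibrium([1, 0], [1, 0]))              # both always pick 0
```

In this toy game, mixing 50/50 leaves neither player an incentive to deviate, while any predictable choice can be exploited by the opponent. That is the same intuition behind bluffing unpredictably in poker.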
Libratus has learned to identify when a hand is promising and worth playing on, and when it is not and should be folded early. And unlike Claudico, whose human opponents soon learned when it would bluff, Libratus is far more unconventional. It will make moves that Sandholm describes as “weird.”
Are there real-world applications? Sandholm characterizes the ability to respond to incomplete and misleading information as a fundamental skill needed in many fields, from business to medicine. Nick Nystrom, Senior Director at the Pittsburgh Supercomputing Center, when asked about Libratus’ programming, states, “Extending AI to real-world decision-making, where details are unknown and adversaries are actively revising their strategies, is fundamentally harder than games with perfect information or question-answering systems… This is where it really gets interesting.”