DeepMind’s AlphaZero AI defeats rival computer program Stockfish 8 after learning the game in just four hours
AlphaZero, the game-playing AI created by Google-owned DeepMind, emerged victorious at chess against world-leading specialist software after having taught itself how to play the game in less than four hours.
The firm’s DeepMind division says that the repurposed AI played 100 games against the world champion chess program, Stockfish 8, winning 28 and drawing the remaining 72, according to a non-peer-reviewed research paper published on Cornell University Library’s arXiv. Stockfish, first released in 2008, won 2016’s Top Chess Engine Championship. However, Stockfish 8 has already been defeated at chess by another program, Komodo, in two major challenges this year.
“Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi [a similar Japanese board game] as well as Go, and convincingly defeated a world-champion program in each case,” said the paper’s authors, who include DeepMind founder Demis Hassabis, a child chess prodigy who reached master standard at the age of 13.
“It’s a remarkable achievement, even if we should have expected it after AlphaGo,” former world chess champion Garry Kasparov told Chess.com. “We have always assumed that chess required too much empirical knowledge for a machine to play so well from scratch, with no human knowledge added at all.”
Ever since IBM’s Deep Blue supercomputer defeated Kasparov on May 12, 1997, computer programs have been able to beat the best human chess players.
It is AlphaZero’s machine learning ability that has gained the world’s attention: according to DeepMind, it is given no human input besides the basic rules of chess. Beyond that, it simply plays against itself over and over, reinforcing what it learns from each game. The result is that AlphaZero took an “arguably more human-like approach” to the search for moves, processing around 80,000 positions per second in chess compared to Stockfish 8’s 70 million.
AlphaZero also learned shogi in two hours before beating the leading program Elmo in a 100-game matchup. AlphaZero won 90 games, lost eight and drew two.
Demis Hassabis, co-founder and CEO of DeepMind, said when unveiling AlphaGo Zero in October: “It’s amazing to see just how far AlphaGo has come in only two years.
“AlphaGo Zero is now the strongest version of our program and shows how much progress we can make even with less computing power and zero use of human data.”
Experts have already suggested that the achievement will strengthen the firm’s position in a competitive sector.
“From a scientific point of view, it’s the latest in a series of dazzling results that DeepMind has produced,” the University of Oxford’s Prof Michael Wooldridge told the BBC.
“The general trajectory in DeepMind seems to be to solve a problem and then demonstrate it can really ramp up performance, and that’s very impressive.”
The new, generalized AlphaZero was also able to beat the “superhuman” previous version of itself, AlphaGo, at the Chinese game of Go, winning 60 games and losing 40 after just eight hours of self-training.