Back in January, AlphaGo’s 5-0 defeat of Fan Hui, the reigning European champion, was a shot out of left field. Go players and AI developers alike had believed we were still some 10 years away from that feat, so the resounding defeat of a champion was rather unexpected. However, many still expected the long-time champion, Lee Sedol, to come out on top given his much higher ranking. The battle was set to be decided in the same game format, 5 games with 2 hours of time for each side. Over the last week AlphaGo and Lee Sedol have been facing off in game after game and AlphaGo has emerged victorious, winning 4 out of the 5 games.
Just like when Kasparov lost to Deep Blue, AlphaGo’s victory has sent ripples through both the computing and Go communities. For technologists like me it’s a signal that we’ve made another leap forward in our quest for strong AI as we’ve developed better methods for training neural networks. The Go community is less enthusiastic, however, as it comes to terms with the fact that not even their game of choice is beyond AI’s capabilities. What is interesting to see is the conversation around AlphaGo’s style of play and the near universal idea that it has some fundamental weaknesses that top Go players will look to exploit.
Indeed, Lee Sedol’s one win against AlphaGo shows that it’s nowhere near being the perfect player and its play style needs refinement. It seems that AlphaGo tends to calculate the most advantageous moves for both itself and its opponent, using this as the basis for judging its future moves. However, unexpected moves, ones that were pruned out of its search tree because they looked sub-optimal for its opponent, seem to throw it for a loop. This is similar to how Kasparov initially beat Deep Blue, playing moves that sent it down a non-optimal search path before making his own, far more optimal, moves. Whether or not this can be developed into a viable strategy is something I’ll leave up to the reader, but suffice to say I don’t think it’d remain a weakness for too long.
For some, though, Lee Sedol’s loss is merely a symbolic one as the real current champion is Ke Jie, who has an 8-2 record against Lee Sedol. Whilst I can’t really comment on how much better of a player he is (I don’t follow Go at all), AlphaGo almost 5-0’d Lee Sedol and I’m sure it’d give Ke Jie a solid run for his money. I expect AlphaGo will continue to make appearances around the world and I’m eager to see if it can still come out on top.
One interesting thing to note is that AlphaGo did receive a little boost in computing power when facing off against Lee Sedol, getting another 700 CPUs and 30 GPUs to handle the additional calculations. However, that extra hardware might not have been strictly required, as the AlphaGo team has said that a single laptop version can beat their distributed one about 30% of the time. Regardless, it seems the AlphaGo team thought Lee Sedol was going to be a much tougher challenge than Fan Hui and gave their AI a little boost just to be sure.
The AlphaGo team won’t be resting on their laurels after this, however, as they’ve got their sights set on bigger challenges, like StarCraft. I’m very much looking forward to seeing them attempt the not-so-traditional games as I think they’re a far more interesting challenge with many more potential applications.
Computers are better than humans at a lot of things but there are numerous problem spaces where they struggle. Anything with complex branching or large numbers of possibilities forces them into costly jumps, negating the benefits of their ability to think in microsecond increments. This is why it took computers so long to go from beating humans at something like tic-tac-toe, a computationally simple game, to beating humans at chess. However, one game has proven elusive to even the most cutting edge AI developers: the seemingly simple game of Go. This is because, unlike chess or other games where an AI can often get by on brute forcing out many possible moves and calculating the best one, Go has an incomprehensibly large number of possible moves, making such an approach near impossible. However, Google’s DeepMind, with its AlphaGo program, has successfully defeated the top European player and will soon face its toughest challenge yet.
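To put a rough number on "incomprehensibly large", here is a small back-of-the-envelope sketch. The figures in it are not from this post: they are the commonly quoted approximations of about 35 legal moves per position over about 80 moves for chess, and about 250 legal moves over about 150 moves for Go. Treat both the numbers and the helper name `tree_size_log10` as assumptions for illustration only.

```python
import math


def tree_size_log10(branching: float, depth: int) -> float:
    """Return log10 of branching**depth, i.e. roughly how many digits
    the number of possible move sequences has."""
    return depth * math.log10(branching)


# (approximate moves available per turn, approximate game length)
# These are rough, commonly quoted estimates, not exact values.
GAMES = {
    "chess": (35, 80),
    "go": (250, 150),
}

for name, (branching, depth) in GAMES.items():
    exponent = tree_size_log10(branching, depth)
    print(f"{name}: roughly 10^{exponent:.0f} move sequences")
```

Under those assumptions the Go tree comes out hundreds of orders of magnitude larger than the chess tree, which is why simply searching harder doesn't get you there.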
Unlike previous game playing AIs, which often relied on calculating board scores of potential moves, AlphaGo is a neural network that’s undergone what’s called supervised learning. Essentially they’ve taken professional level Go games and fed their moves into a neural network. Then it’s told which outcomes lead to success and which ones don’t, allowing the neural network to develop its own pattern recognition for winning moves. This isn’t what let them beat a top Go player, however, as supervised learning is a well established principle in the development of neural networks. Their secret sauce appears to be a combination of an algorithm called Monte Carlo Tree Search (MCTS) and the fact that they pitted the AI against itself in order for it to get better.
MCTS is a very interesting idea, one that’s broadly applicable to games with a finite set of moves or those with set limits on play. Essentially what an MCTS will do is select moves at random and play them out until the game is finished. Then, when the outcome of that playout is determined, the moves made are used to adjust the weightings of how successful those potential moves were. This, in essence, allows you to determine which moves are the most promising by narrowing the problem space down to the best candidates. Of course the tradeoff here is between how long and deep you want the search to run and how long you have to decide on a move.
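For the curious, the loop described above can be sketched in a few dozen lines. This is a minimal illustration and not AlphaGo’s implementation: it uses a toy game I picked for brevity (a pile of stones, each player takes 1 to 3, whoever takes the last stone wins), plain random playouts, and the textbook UCB1 rule for choosing which move to explore next, whereas AlphaGo guides its search with neural networks. All the names here (`Node`, `ucb1`, `rollout`, `mcts`) are made up for the example.

```python
import math
import random


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones    # stones left for the player about to move
        self.parent = parent
        self.move = move        # the move that led from parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0         # wins for the player who made `move`


def ucb1(child, parent_visits, c=1.4):
    """Balance moves that have done well against moves rarely tried."""
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def rollout(stones, rng):
    """Play random moves to the end of the game.
    Returns True if the player to move at the start takes the last stone."""
    turns = 0
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        turns += 1
    return turns % 2 == 1


def mcts(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: walk down through fully expanded nodes.
        while not node.untried and node.children:
            parent = node
            node = max(parent.children, key=lambda ch: ucb1(ch, parent.visits))
        # 2. Expansion: add one child for a move not tried yet.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new position.
        to_move_wins = rollout(node.stones, rng)
        # 4. Backpropagation: update the weightings back up the tree,
        #    flipping the winner's perspective at each level.
        mover_won = not to_move_wins
        while node is not None:
            node.visits += 1
            if mover_won:
                node.wins += 1.0
            mover_won = not mover_won
            node = node.parent
    # The most visited move is the one the search trusts most.
    return max(root.children, key=lambda ch: ch.visits).move
```

With 3 stones on the table, `mcts(3)` should settle on taking all three, since that wins on the spot. The `iterations` parameter is the tradeoff in miniature: more playouts give better estimates but take longer to decide.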
This is where the millions of games that AlphaGo played against itself come into play, as they allowed both the neural networks and the MCTS algorithm to be greatly refined. In their single machine tests it only lost to other Go programs once out of almost 500 games. In the match played against Fan Hui, however, he was up against a veritable army of hardware, some 170 GPUs and 1200 CPUs. That should give you some indication of just how complex Go is and what it’s taken to get to this point.
AlphaGo’s biggest challenge is ahead of it, though, as it prepares to face down the top Go player of the last decade, Lee Sedol. In terms of opponents Lee is a significant step up, being a 9th Dan to Fan’s 2nd Dan. How they structure the matches and the infrastructure to support AlphaGo will be incredibly interesting, but whether or not it will come out victorious is anyone’s guess.