Martin Robbins on 16th March 2016 reports: The NHS is a much bigger challenge for DeepMind than Go. See the previous NHSreality post “gaming the system”.
Training a computer to play Go is an impressive achievement, but AlphaGo may be a long way from being a useful product.
People have a weird obsession with games like Chess and Go. Achievement in them has long been seen as a marker of human intellect, and yet they’re among the least human tests you could devise: they put players in simplified situations where everything is known, every possible course of action is laid out for them, and the test is one of concentration and logic.
We pass far greater tests daily, when we recognise a face in a crowd, when we dynamically balance in motion, when we predict the response our words and expressions will have on another sentient being, or when we do all of the above, effortlessly, at the same time. We don’t think of these as challenging because they’re so innately human, while playing Chess or Go seems far more impressive precisely because they’re more rigid and computational in nature.
There’s an irony in making a board game one of the ‘grand challenges’ of AI, and it surprises me that more people don’t see it. As DeepMind passes Go and sets its sights on other challenges, like transforming the NHS, there are some big open questions to be answered. How meaningful is it for a machine to play Go? How useful is the ‘deep learning’ technology that made it possible, really?
For most people, the answer may be ‘not very’. However impressive this system is (and it really is), there are a truck-load of non-trivial problems to deal with before you could even think about using its offspring in the wild.
One of them is neatly illustrated by an example from Game 2, recounted in Wired. AlphaGo played a move so odd that it left experts perplexed. One of the few who claimed to understand it was the European Go Champion, Fan Hui, who had worked with DeepMind to help test the program. As Wired tell it:
“Like the commentators, he initially didn’t know what to make of the move. But after about ten seconds, he says, he saw how the move connected with what came before—how it dovetailed with the 18 other black stones AlphaGo had already played.
“The average human will never understand this move. But Fan Hui has the benefit of watching AlphaGo up close for the past several months—playing the machine time and again. And the proof of the move’s value lies in the eventual win for AlphaGo.”
You’ve spotted the post hoc fallacy here, right? Just because the algorithm won the game doesn’t make this a great move. There’s a clear bias at play among the human observers, where a particularly novel move by the computer attracts attention, and is therefore given a significance that may not be warranted. There were other occasions when spectators felt the machine had made a mistake, because it had done something they wouldn’t do, and yet it still won.
In truth we don’t know why AlphaGo played that move, or any other. That’s not because the software is in some way ‘magical’ or ‘sentient’ as an infuriating number of writers seem to think, just that the mathematical models at the heart of the algorithm are so abstract and complex that you can’t really interpret what they’re doing in human terms.
That’s not a problem in a game of Go, but now imagine that the same technology were used to, say, diagnose cancer from CAT scan images. You’re not going to go cutting into someone unless you’re fairly sure about your diagnosis, but how can you be sure if you don’t understand the justification? And who’s responsible if you get it wrong?
There are two basic solutions. A human expert could back up the AI, but that raises all kinds of questions about how the two would interact. If any DeepMind staff are reading this, there’s a great experiment to be done on where human-AI teams rank against humans on their own or AIs on their own. Would we get the best of both worlds, the worst, or something in between?
Alternatively, people would have to trust the AI on blind faith. In that case our reliance turns the AI into a sort of medical Magic 8-Ball. That’s fine until something inevitably goes wrong – imagine the consequences if an algorithm caused the next major oil spill or plane crash, and the company involved had literally no explanation for why it happened.
Let’s imagine we can get around all that. The next big problem is training the AI in the first place.
The job of a machine learning algorithm is basically to build a model, or a collection of models, that describe the data you train it on. You take a load of data about your problem, and you feed it into a system. The system chews through that data, extracting the patterns and trends in it to construct a model, a kind of simplified understanding of how the world works. The more complex and detailed the model is, the more high quality data the algorithm needs to construct it.
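To make that last point concrete, here is a toy sketch of my own, not anything DeepMind uses. It assumes a made-up problem (a noisy sine curve) and a deliberately crude "model" that just averages the training data inside each of n_bins slices; the function names fit_bins and held_out_error are invented for this illustration. The number of bins stands in for model complexity.

```python
import math
import random


def fit_bins(xs, ys, n_bins):
    """Build a piecewise-constant model: the mean of y in each bin of [0, 1)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)
    # Bins that saw no training data fall back to the overall mean.
    return [s / c if c else overall for s, c in zip(sums, counts)]


def held_out_error(n_train, n_bins, seed=0):
    """Train on n_train noisy samples, then score against the true curve."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_train)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.3) for x in xs]
    model = fit_bins(xs, ys, n_bins)
    test_xs = [(i + 0.5) / 1000 for i in range(1000)]
    squared_errors = [
        (model[min(int(x * n_bins), n_bins - 1)] - math.sin(2 * math.pi * x)) ** 2
        for x in test_xs
    ]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Compare a simple model (2 bins) with a detailed one (50 bins)
    # on a small and a large training set.
    for n_train in (20, 20000):
        for n_bins in (2, 50):
            print(n_train, n_bins, round(held_out_error(n_train, n_bins), 4))
```

With only 20 training points the 50-bin model leaves most of its bins empty, so it has little to go on; given 20,000 points the same model tracks the curve far more closely than the 2-bin one can. Real deep learning models differ in almost every detail, but the appetite for data grows with complexity in much the same way.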
Deep learning algorithms can produce huge, multi-layered models with millions of variables, which means they need a vast amount of data to train on. That’s why they’ve only really taken off in the last few years. Some of the algorithms have been around for a while – convolutional neural nets are, like me, a product of the 1980s – but it’s only recently that we’ve had both the processing power and the data to train them.
Even now, both of those things are a challenge. Google’s François Chollet argued on Twitter that, “ ‘That Go win is not so impressive because it uses so much compute power’ is not a smart thing to say. It will be in your pocket in 2 years.” DeepMind’s founder, Demis Hassabis, pointed out that “Our neural network training algorithms are more important to #AlphaGo performance than the hardware it uses to play.”
It’s not that what they’re saying isn’t true, but they’re glossing over some big problems. Yes, once you’ve trained AlphaGo it’s pretty efficient to use – you could probably deploy it on a phone in a couple of years. But the computational challenge isn’t using the model, it’s training it in the first place. AlphaGo took three weeks to train across 50 GPUs. Without those resources, courtesy of Google’s deep pockets, last week’s milestone would still be months away. Even with all that power available, Hassabis suggested in a recent interview that training the system completely from scratch could take a ‘few months’.
In the past we might have assumed that computers would keep getting exponentially faster and solve the problem for us, but that’s no longer the case. The days when the latest and greatest achievements in computing are ‘in your pocket in 2 years’ are numbered. We can scale things up by adding more machines, but that gets expensive, and after a certain point you tend to see diminishing returns.
Then you hit the next problem: data. There’s a reason why deep learning has clustered around the likes of Amazon, Netflix, Google and Facebook: few other companies have access to the same volume of clean, high quality data.
Most business problems aren’t as simple as Go. You can’t just generate gigabytes of perfect information by playing the game over and over again – you have to collect messy data from the real world. It’s not uncommon to find major utility companies operating vast networks of equipment packed with sophisticated sensors that produce terabyte after terabyte of total garbage.
Even when companies do have data, it’s often trapped behind layers of bureaucracy, poorly managed, or unvalidated. Training and deploying machine intelligence is a difficult art, and few organisations have the talent needed to work on it. The majority of businesses I’ve worked with over the last ten years were barely in a position to do simple statistics, never mind deep learning.
DeepMind’s new collaboration with the NHS (who aren’t being charged) is a great example of the problem. Over-excited tech journalists – is there any other kind? – have talked about neural nets and artificial doctors and god knows what else, but Hassabis, to his great credit, was careful to be more realistic in a recent interview with The Verge:
“NHS software as I understand it is pretty terrible, so I think the first step is trying to bring that into the 21st century. They’re not mobile, they’re not all the things we take for granted as consumers today. And it’s very frustrating, I think, for doctors and clinicians and nurses and it slows them down. So I think the first stage is to help them with more useful tools, like visualizations and basic stats. We thought we’ll just build that, we’ll see where we are, and then more sophisticated machine learning techniques could then come into play.”
This is a business model that only really works between a tech giant like Google and a monolithic state entity. No conventional start-up could afford to take on such a project, with no revenue and little prospect of quick results. Few would even be allowed to bid for it.
DeepMind’s NHS collaboration is a big test for deep learning as a field. Toys and games and fun image processing applications are all very cool, but if these techniques can’t succeed at a large-scale data-rich enterprise, with Google’s financial and technological might behind it, then investors are entitled to start asking some awkward questions, like when and where exactly can they succeed?
We’ve been here before. In 2011, IBM’s ‘Watson’ was able to play and defeat humans at Jeopardy. The tech giant poured billions of dollars into the technology amid predictions of global technological revolution, only for the effort to fall a little flat. Five years on the hype remains, but results haven’t lived up to expectations.
The two systems aren’t directly comparable – DeepMind set out from the beginning to create a more general-purpose intelligence, whereas IBM’s approach relied more on brute-force engineering to solve a specific problem. The test is the same though – can you take this exotic beast, and turn it into a real, useful product?
That’s the biggest question in deep learning today, and it’s a far greater challenge than playing Go.