DeepMind Is Developing One Algorithm To Rule Them All

DeepMind wants to enable neural networks to emulate algorithms to get the best of both worlds, and it’s using Google Maps as a testbed. DeepMind is trying to combine deep learning and algorithms, creating the one algorithm to rule them all: a deep learning model that can learn how to emulate any algorithm, generating an algorithm-equivalent model that can work with real-world data.

The key thesis is that algorithms possess fundamentally different qualities when compared to deep learning methods – something Blundell and Veličković elaborated upon in their introduction of NAR. This suggests that if deep learning methods were better able to mimic algorithms, then generalization of the sort seen with algorithms would become possible with deep learning.

“The algorithm might give you a great solution, but there’s really no way of saying whether this human-devised heuristic is actually the best way of looking at it, and that’s the first point of attack.

Historically, we found that whenever there’s a lot of human feature engineering and lots of raw data that we need to work with for that human feature engineering, this is where deep learning shines. The initial success stories of deep learning were exactly in replacing handcrafted heuristics for, say, image recognition with deep neural networks.”

The network will map the complexity of the raw data to some ‘abstractified’ graph inputs that can then be used to run the algorithm over.

This research even figured out a way to propagate gradients through the algorithm so you can actually get good neural network optimization in the setting and apply the algorithm.

First of all, the algorithm may not be everything you need to compute the final solution.

“Forcing your representations to rely on the output of this one algorithm might be too limiting. Dijkstra requires one scalar value per every edge in the graph, which means that you’re compressing all that amazing complexity of the real world into just one number for every road segment.”

“That’s potentially problematic because, if you don’t have enough data to estimate that scalar correctly, you’re doomed. The algorithm is not going to give you correct solutions once again, because you just haven’t seen enough raw data to estimate that scalar correctly.”

The key idea in NAR is to replace the algorithm with a neural network, typically a graph neural network. Of course, this high dimensional GNN neural executor needs to actually imitate the algorithm at hand. The algorithm iterates on the value estimates and gradually improves them, until they converge to the ground truth values.

This is the algorithm the NAR team is trying to imitate, using synthetic data and simple graphs to estimate meta-values. Initially, the output might not be that good, but you don’t need that much data to get off the ground because the algorithm can deal with imperfect estimates when you don’t know the environment’s dynamics.

In this case, a shortest path algorithm was of interest. The algorithm can be learned and then it will be plugged in. The idea is that the RL framework should be usable for any algorithm, or combination of algorithms.

The approach Deac and the DeepMind duo worked on is called eXecuted Latent Value Iteration Network, or XLVIN. First, the value iteration algorithm is learned in the abstract space.

Previous theoretical work described a best case scenario, where there are settings of weights for a GNN that’ll make it nicely behave as that dynamic programming algorithm.

It’s a dynamic programming update and it’s closely related to a shortest path algorithm – it’s like an online shortest path algorithm.

That’s potentially one of the deep connections between graph algorithms, reinforcement learning and graph networks.

There’s algorithms ‘left, right and center inside the reinforcement learning pipeline,’ as Veličković put it.

So the question is, can we learn all of them? If you can have a single network that is able to execute any one of the algorithms that you know already, then if you get that network to plug those algorithms together, you start to form really quite complicated processing pipelines, or programs.

“If you really take this to its limit, then you start to really learn algorithms that learn. That becomes very interesting because one of the limitations in deep learning is the algorithms that we have to learn. There hasn’t been that much change in the best optimizers we use or how we update the weights in a neural network during training for quite a long time.”

“Learning algorithms are just algorithms, and maybe what’s missing from them is this whole basis that we have for other algorithms that we’re using. So we need a slightly more universal algorithm executor to use as the basis for better methods for ML.”.

They are doing some transfer learning, chaining a couple of algorithms together and seeing if they can transfer between one algorithm, making it easier to learn a separate related algorithm, she said.

Or in other words, as Veličković framed what everyone seems to view as the holy grail of this research: “One algorithm to rule them all.”