The Building Blocks of Interpretability

Exploring how feature visualization can combine with other interpretability techniques to understand aspects of how networks make decisions.

Image: Google Research

In 2015, Google researchers' early attempts to visualize how neural networks understand images led to psychedelic images. They released the code as DeepDream, and it grew into a small art movement producing all sorts of amazing things. Continuing the original line of research behind DeepDream, they have also been trying to address one of the most exciting questions in deep learning: how do neural networks do what they do?

In an article in the journal Distill, the researchers demonstrated how those same techniques could show what individual neurons in a network do, rather than just what is “interesting to the network” as in DeepDream. This enabled them to see that neurons in the middle of the network are detectors for a wide range of things, such as buttons, patches of cloth, and buildings, and to see how these detectors build up into increasingly sophisticated ideas over the layers of the network.
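The core trick behind showing “what a neuron does” is activation maximization: start from noise and optimize the input until the neuron responds strongly. A minimal sketch of that idea, using a toy one-weight-vector “neuron” in place of a real convnet unit (all names and shapes here are illustrative assumptions, not the researchers' actual code):

```python
import numpy as np

# Toy "neuron": ReLU of a random linear filter standing in for a
# convnet unit. Everything here is an illustrative assumption.
rng = np.random.default_rng(0)
w = rng.normal(size=64)  # weights of the toy neuron

def activation(x):
    # The toy neuron's response to an input "image" x.
    return float(np.maximum(w @ x, 0.0))

def visualize_neuron(steps=200, lr=0.1):
    # Start from noise and repeatedly step in the direction that
    # increases the neuron's activation. For ReLU(w @ x) that
    # direction is w whenever the neuron is active; we step along w
    # unconditionally as a simplification, keeping the input bounded.
    x = rng.normal(scale=0.1, size=64)
    for _ in range(steps):
        x += lr * w                        # gradient ascent step
        x /= max(np.linalg.norm(x), 1e-8)  # keep the "image" bounded
    return x

x_opt = visualize_neuron()
# x_opt now excites the neuron far more than noise does, revealing
# "what the neuron looks for".
```

In a real network the same loop runs through the full model with backpropagation supplying the gradient, plus regularizers that keep the optimized image natural-looking.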

Image: Google Research

Now, they have explored how feature visualization can combine with other interpretability techniques to understand aspects of how networks make decisions. They show that these combinations let us, in a sense, “stand in the middle of a neural network” and see some of the decisions being made at that point, and how those decisions influence the final output.

They also explored techniques for understanding which neurons fire in the network. These techniques make the results meaningful to humans by attaching a visualization to each neuron, so we can see things like “the floppy ear detector fired.” It’s almost a kind of MRI for neural networks.
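Mechanically, “which neurons fired” amounts to running an input through the network, ranking units by activation, and attaching a human-readable label to each one. A hedged toy sketch, where the layer sizes and the label names are invented for illustration rather than taken from GoogLeNet:

```python
import numpy as np

# Toy two-layer network and an assumed label lookup; in a real system
# the labels would come from feature visualizations of a trained net.
rng = np.random.default_rng(42)
layers = [rng.normal(size=(16, 8)), rng.normal(size=(8, 16))]   # toy weights
LABELS = {0: "floppy ear detector", 1: "snout detector"}         # assumed names

def top_firing_units(x, k=3):
    # Forward pass with ReLU nonlinearities, then return the k most
    # active units in the final layer, most active first.
    h = x
    for w in layers:
        h = np.maximum(w @ h, 0.0)
    order = np.argsort(h)[::-1][:k]
    return [(int(i), LABELS.get(int(i), f"unit {i}"), float(h[i]))
            for i in order]

report = top_firing_units(rng.normal(size=8))
# report pairs each strongly firing unit with its label, e.g. a
# "floppy ear detector" entry when that unit tops the ranking.
```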

Visualizations of neurons in GoogLeNet. Neurons in higher layers represent higher level ideas.

In their experiments, they applied the techniques to an image, zooming out to show how the entire image was “perceived” at different layers. This makes it possible to see the transition from the network detecting very simple combinations of edges, to rich textures and 3D structure, to high-level concepts like ears, snouts, heads, and legs.

These insights are exciting on their own, but they become even more exciting when we can relate them to the final decision the network makes. So not only can we see that the network detected a floppy ear, but we can also see how that detection increases the probability of the image being a Labrador retriever.
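For the last layer of a network, that link between a detector and a class is easy to make precise: with a linear output layer, each hidden unit's contribution to a class logit is simply its activation times the corresponding weight, and these contributions sum exactly to the logit. A small sketch under that assumption, with invented unit and class names:

```python
import numpy as np

# Toy linear output layer mapping hidden features to class logits.
# The class list and hidden features are illustrative assumptions.
rng = np.random.default_rng(7)
n_hidden, n_classes = 8, 3
W_out = rng.normal(size=(n_classes, n_hidden))  # hidden -> class logits
CLASSES = ["labrador retriever", "tabby cat", "goldfish"]

def attributions(hidden, class_idx):
    # Per-unit contribution of the hidden activations to one class
    # logit; for a linear layer this decomposition is exact, so a
    # strongly firing "floppy ear" unit raises the labrador logit in
    # proportion to its weight.
    return hidden * W_out[class_idx]

hidden = np.maximum(rng.normal(size=n_hidden), 0.0)  # toy ReLU features
contrib = attributions(hidden, CLASSES.index("labrador retriever"))
logit = float(W_out[0] @ hidden)
# contrib.sum() equals the class logit, so each entry tells us how
# much one detector pushed the decision toward "labrador retriever".
```

For units deeper inside a network the Distill article combines this kind of attribution with feature visualization, so each contribution comes with a picture of what the contributing unit detects.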