In most deep learning networks, every neuron performs one computation (or more, if the network is recurrent) for every input sample (e.g., a digital image). Real brains are different. Distinct informational elements of the input sample are extracted by neurons in “early” parts of the network and differentially routed to spatially distinct “later” parts of the network. The routes taken depend on the input sample, but are also driven by “top-down” influences, such as the animal’s focus of attention, its current task of interest, or its physiological state (alert, resting, sleeping).

As an example, neuroscientists have discovered that when an animal attends to an object in a visual scene, “what” information is preferentially routed along a ventral pathway through the occipital and temporal lobes of the cortex (purple in the image below), while “where” information is routed along a dorsal pathway through the occipital and parietal lobes (green).

In other words, distinct cortical regions of the network have specific processing roles, and, as a whole, the network not only converts sensory inputs into useful information representations (for use by other brain areas or for activation of motor outputs) but also determines where that information should be sent. Regions of the network that are not useful for a given task remain inactive, reducing overall network activity and energy consumption. Facial imagery is routed to brain regions that discriminate among different faces; location imagery (e.g., roads or houses) is routed to brain regions that discriminate among locations. In general, there is no reason to send face information to the location-processing region, and vice versa.

How might such efficient routing be enabled in deep learning networks? Note that routing and gating are two sides of the same coin: for information to be routed, connections between groups of transmitting neurons and groups of receiving neurons must be gated on or off, dynamically implementing the routing.
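As a toy sketch of that equivalence (the group names and sizes here are illustrative, not part of any particular architecture): a binary gate on each source-to-target connection decides which targets receive the source’s output, and a closed gate means the downstream computation can be skipped altogether.

```python
import torch

# Toy sketch: routing expressed as gating. A source group's activations
# reach only those target groups whose gate is open (g = 1).
source = torch.randn(8)                  # activations of one transmitting group
gates = {"faces": 1.0, "places": 0.0}    # hypothetical receiving groups

for name, g in gates.items():
    if g == 0.0:
        print(f"{name}: gate closed, downstream computation skipped")
    else:
        routed = g * source              # an open gate passes activations through
        print(f"{name}: gate open, activations routed (norm {routed.norm().item():.2f})")
```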

Binary Cognition is working with a novel network architecture in which such groups of neurons, referred to as banks, are connected through an intermediate gating structure that is purely feed-forward. This allows such gates to be introduced into traditional feed-forward deep learning networks. Very few computations are needed to operate the gates, and the overall computational savings can be substantial, since many of the gates in the network will be turned off for a given input (and thus no computation is needed in the downstream banks). Binary Cognition continues to develop proprietary training methods for these networks, as the traditional method, back-propagation, is insufficient.

The fundamental gating element of a purely feed-forward network with dynamic routing. A single neuron (circle in the gating structure) receives input from neurons in the source bank. The gating neuron’s unit-step activation function controls whether or not the gate opens to allow activation of neurons in the target bank. There is a unique gate for each pair of source and target banks in two connected layers.
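Below is a minimal PyTorch sketch of one such gating element, assuming the unit-step gate described in the caption; the class name, the layer sizes, and the use of a linear gating neuron are illustrative assumptions, not Binary Cognition’s implementation (whose training method is proprietary).

```python
import torch
import torch.nn as nn

class BankGate(nn.Module):
    """Hypothetical gating element between one source bank and one target bank.

    A single gating neuron reads the source bank's activations; its unit-step
    activation opens (1) or closes (0) the gate to the target bank.
    """
    def __init__(self, source_dim: int, target_dim: int):
        super().__init__()
        self.gate_neuron = nn.Linear(source_dim, 1)          # cheap: one dot product
        self.bank_weights = nn.Linear(source_dim, target_dim)

    def forward(self, source: torch.Tensor) -> torch.Tensor:
        # Unit-step activation: the gate opens iff the gating neuron's net input
        # is positive. Its gradient is zero almost everywhere, which is one
        # reason plain back-propagation does not suffice to train these gates.
        if self.gate_neuron(source).item() <= 0.0:
            # Closed gate: the target bank is never activated, so its (much
            # larger) matrix computation is skipped entirely.
            return torch.zeros(self.bank_weights.out_features)
        return torch.relu(self.bank_weights(source))

gate = BankGate(source_dim=16, target_dim=8)
target_activity = gate(torch.randn(16))    # zeros if the gate is closed
```

Note the asymmetry in cost: operating the gate is a single dot product, while the skipped bank-to-bank connection is a full matrix product, which is where the savings come from.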

As conceptualized in the figure below, information is routed to banks of neurons that can perform the necessary task, and is not routed to other banks. For example, low-level features that indicate the presence of a furry mammal in the image activate the routing of those same features to banks of neurons that can discriminate between dogs, cats, etc. Were the low-level features instead indicative of a human face, they would be routed to banks of neurons specialized for facial discrimination.

In this illustration, three different images are processed by the same network. If all gates were open, the network would be fully connected, with all banks (concentric circles) in layer L connected to all banks in layer L+1, and all banks in the penultimate layer connected to all single neurons (circles) in the final, classifying layer. Due to the gates between layers (not shown), only a subset of banks and neurons are active (colored triangles) for an individual input image. Output classes can be arranged by a similarity score to minimize the total number of banks that are active for individual images, on average. The cat and fox images activate many of the same banks, due to their visual similarity. In contrast, the Oprah Winfrey image activates many different banks, particularly in later layers.
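Here is a sketch of how such gates might compose across one pair of layers, with one gating neuron per (source bank, target bank) pair; the dimensions, random weights, and unit-step threshold are illustrative assumptions.

```python
import torch

torch.manual_seed(0)
n_source, n_target, d = 4, 4, 16                # banks per layer, neurons per bank

gate_w = torch.randn(n_source, n_target, d)     # one gating neuron per bank pair
bank_w = torch.randn(n_source, n_target, d, d)  # bank-to-bank connection weights

source_banks = [torch.randn(d) for _ in range(n_source)]
target_banks = [torch.zeros(d) for _ in range(n_target)]
active = [False] * n_target

for i in range(n_source):
    for j in range(n_target):
        # Cheap unit-step gate: a single dot product per bank pair.
        if gate_w[i, j] @ source_banks[i] > 0:
            # Expensive matrix product, performed only when the gate is open.
            target_banks[j] = target_banks[j] + bank_w[i, j] @ source_banks[i]
            active[j] = True

print(f"{sum(active)} of {n_target} target banks active for this input")
```

Averaged over many inputs, the fraction of closed gates directly determines how many bank-to-bank matrix products are never performed, which is where the computational savings described above would come from.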

Binary Cognition is also working toward gating structures that take feed-back input as well as feed-forward input. This will allow top-down guidance of routing, e.g., directing attention to a specific area of a scene, or limiting processing to what is pertinent for a desired task, such as searching an image for only red objects, regardless of the identity of those objects. In some ways this is similar to capsules, a network element introduced by Geoffrey Hinton and colleagues [1]. A key difference is that capsules accommodate one-to-one routing, whereas our formulation accommodates one-to-many routing when needed (e.g., differentially but simultaneously routing “what” and “where” information about an object).
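One way such a feedback-aware gate might look, extending the hypothetical BankGate sketch above: the gating neuron simply receives a top-down context vector (e.g., encoding the current task) alongside the source bank’s activations, so the same stimulus can be routed differently under different tasks. Everything here is an illustrative assumption, not Binary Cognition’s design.

```python
import torch
import torch.nn as nn

class TopDownGate(nn.Module):
    """Hypothetical gate driven by feed-forward and feed-back input alike.

    The gating neuron sees the source bank's activations concatenated with a
    top-down context vector (e.g., the current task, or an attended location),
    so routing can change with the task even for an identical stimulus.
    """
    def __init__(self, source_dim: int, context_dim: int):
        super().__init__()
        self.gate_neuron = nn.Linear(source_dim + context_dim, 1)

    def forward(self, source: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # Unit-step activation, as in the purely feed-forward gate.
        return (self.gate_neuron(torch.cat([source, context])) > 0).float()

gate = TopDownGate(source_dim=16, context_dim=4)
stimulus = torch.randn(16)
task_red = torch.tensor([1.0, 0.0, 0.0, 0.0])   # e.g., "search for red objects"
task_face = torch.tensor([0.0, 1.0, 0.0, 0.0])  # e.g., "identify faces"
print(gate(stimulus, task_red), gate(stimulus, task_face))  # may differ by task
```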

REFERENCES

  1. Sabour, S., Frosst, N., & Hinton, G. E. (2017). Dynamic Routing Between Capsules. http://arxiv.org/abs/1710.09829.