What Remains of the Mysteries of the Brain?


(Note regarding the tone: this was originally written with the intent of being published as a perspective in an academic field journal, which is why the writing might feel, well, academic.)

A decade ago, intelligence was a mystery. We did not know how to build a system that could reason and speak like a human, generate artistic images, or compose and produce listenable music. I entered the field of computational neuroscience during this period, having studied computer science as an undergraduate, because I thought understanding the brain could help us understand intelligence at a computational level. At the time, the brain, especially the human brain, was the only thing in the world that we considered “truly” intelligent. By unlocking the secrets of how the brain functioned, I thought, we could pave the way toward artificial intelligence.

Now we stand at the other end (or after the beginning, at any rate) of the AI revolution, and the problem of intelligence has been all but solved. Large artificial neural networks with transformer architectures, trained on vast quantities of data, have reached parity with or surpassed human ability on many axes of cognitive achievement once thought to be the sole purview of our species. There is still room for improvement, to be sure, but it does seem that we have figured out the basic ingredients of intelligence: neural units that apply some nonlinearity to the weighted sum of their inputs, a lot of data, a learning algorithm, an objective function, and “moar layers”.
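That recipe is compact enough to sketch directly. The following toy snippet (all names and values are mine, purely for illustration) shows a single neural unit applying a nonlinearity to the weighted sum of its inputs, and a layer as a collection of such units:

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neural unit: a nonlinearity (here, tanh)
    applied to the weighted sum of its inputs."""
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(pre_activation)

def layer(inputs, weight_matrix, biases):
    """A layer is just many such units reading the same inputs;
    a network stacks layers, feeding each layer's output to the next."""
    return [neuron(inputs, w_row, b) for w_row, b in zip(weight_matrix, biases)]
```

Everything beyond this, from a three-layer perceptron to a frontier LLM, is at bottom a composition of this operation, plus a learning algorithm to set the weights.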

While it is still possible that the brain operates on totally different principles than modern artificial intelligence, the zeitgeist in theoretical neuroscience has been steadily moving in the direction of connectionism – the view that the brain basically functions like some sort of artificial neural network. There are several converging lines of evidence for this view. Artificial neurons and neural networks were initially conceived as an explanatory description of how the brain computes, and the biophysics of biological neurons seems to allow for computations at least as powerful as those of artificial neurons. Learning in artificial neural networks, which changes the weights of connections between neural nodes, seems to qualitatively mirror synaptic plasticity in the brain. And the impressive performance of artificial neural network-based AI models in the past few years on cognitive tasks once thought to be achievable only by humans lends additional credence to the view that artificial neural networks may be good models for understanding human cognition.

What questions remain, then, for theoretical neuroscience? If we are indeed on the cusp of being able to replicate all facets of human intelligence on silicon chips, what more do we hope to gain from further studying computation in the brain? (It goes without saying that there are still many advances to be made in treating psychiatric and neurological disorders; my focus here is the computational approach to the study of the healthy brain.)

Broadly speaking, there are two directions the theoretical neuroscientist can take. One avenue is to look at areas where modern AI fails or still does poorly relative to humans. Given the rapid pace of AI advancement, however, things that seem like fundamental failure modes of AI this year might be solved next year, often with no particular need for insights from the brain – AI improvement is often just a question of more layers, more compute, and more data. The other avenue for brain research is to focus on the things the brain is known to do mechanistically differently from AI, treating them as alternative paradigms for neural computation. These alternative paradigms may turn out to resemble modern AI architectures in some fundamental mathematical way, but understanding the nature of these equivalences is itself an essential step toward a full understanding of the brain.

With respect to arenas in which man is still superior to machine: one commonly-made observation is that people are able to learn from much less data than AI agents, possibly indicating that human brains are better learners than current AI systems. Frontier LLMs, however, do seem to be able to engage in “one-shot” adaptation to novel tasks, indicating that fast learning might be an emergent property of having a robust world model rather than a consequence of specific choices of network architecture or learning algorithm. Although it might appear that LLMs require more data than humans to get to this stage, the disparity could simply reflect evolutionary pre-training. Just as an LLM trained to understand one language has a much easier time learning a second language, evolution might have “pre-loaded” information into the synaptic connectivity of the brain, giving us a big head start relative to AI when it comes to learning new things. With recent improvements in the acquisition of electron microscopy images of large brain volumes, neuroscientists can begin to explore ‘nature vs. nurture’ questions about information storage in the brain by comparing connectivity structures across organisms.

Physical problem-solving in unpredictable environments, such as those presented in plumbing or housing construction, also presents unique challenges for artificial intelligence. These kinds of problems require highly multimodal integration of sensory information combined with planning at varying time scales and sensory-motor feedback loops as the agent learns to interact with the environment. This too, however, may just be an issue of procuring sufficient multimodal training data and computing power, not a qualitative limitation of AI relative to brains.

One area where the brain does seem to have unique advantages over AI is the ‘catastrophic forgetting’ concern, or more generally managing the tradeoff between accumulating information about a new task versus maintaining information about previous tasks. It is possible that the brain manages this problem via subcellular mechanisms by keeping some molecular trace of prior memories inside the cell even as synaptic weights change. The brain also seems to explicitly differentiate memory formation and long-term memory storage in the hippocampus and cortex, although plasticity can occur in both. The exact roles of the hippocampus and cortex in neural computation and memory, and the reason for their separation, is a critical direction for theoretical neuroscience research.
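The phenomenon itself is easy to demonstrate in miniature. In the following toy sketch (entirely illustrative – a one-parameter linear model, not a claim about any real network or about the brain), a model trained to mastery on one task loses that competence after unconstrained training on a second task:

```python
def train(w, data, lr=0.1, epochs=200):
    """Fit a one-parameter linear model y = w*x by gradient descent."""
    for _ in range(epochs):
        for x, y in data:
            w -= lr * (w * x - y) * x   # gradient of squared error
    return w

def error(w, data):
    """Mean squared error of the model on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

task_a = [(1.0, 2.0), (2.0, 4.0)]    # task A: y = 2x
task_b = [(1.0, -1.0), (2.0, -2.0)]  # task B: y = -x

w = train(0.0, task_a)
err_a_before = error(w, task_a)      # near zero: task A mastered
w = train(w, task_b)                 # now train only on task B...
err_a_after = error(w, task_a)       # ...and task A is forgotten
```

With no mechanism protecting the old solution, the single shared parameter is simply overwritten; the brain, by whatever combination of subcellular traces and hippocampal–cortical division of labor, appears to avoid this fate far more gracefully.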

With respect to known mechanistic divergence between the brain and current frontier AI models, there are fundamental differences at every level of analysis, from the individual synapse to the macroscopic organization of the brain’s anatomy. Perhaps the most important of these differences is the gap between how plasticity works in the brain and how learning happens in artificial neural networks. Artificial neural networks learn via backpropagation, which requires information about errors made at the network’s output to be piped backwards, layer by layer, all the way to the network’s input. The brain, by contrast, doesn’t seem to have any mechanism for this long-range communication of error signals. The brain instead has to rely on local pairwise mechanisms, like Hebbian plasticity or spike-timing-dependent plasticity (STDP); global signals, such as neuromodulation; or dedicated supervisory circuits, as with the entorhinal inputs thought to instruct behavioral timescale synaptic plasticity (BTSP) in the hippocampus, or the climbing fibers of the cerebellum. There have been some theoretical studies exploring how backpropagation can be replaced with more local learning rules, but these rules need to move beyond the proof-of-concept stage to demonstrate that they are viable for large-scale learning tasks. It is also necessary to validate that the proposed backprop-free learning rules are, in fact, how the brain operates; the experimental validation of proposed learning rules is still in its infancy, especially when it comes to moving beyond simplistic associative learning behavioral paradigms to more complex and realistic tasks.
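The locality constraint is worth making concrete. In this sketch (variable names are mine), a Hebbian weight update uses only quantities physically available at the synapse – presynaptic and postsynaptic activity – whereas a backpropagation update for the same weight would additionally require an error term propagated back from the network’s output:

```python
def hebbian_update(weights, pre, post, lr=0.01):
    """Local rule: each weight weights[j][i] changes using only the activity
    of its own presynaptic neuron (pre[i]) and postsynaptic neuron (post[j]).
    """
    return [[w + lr * post_j * pre_i for pre_i, w in zip(pre, row)]
            for post_j, row in zip(post, weights)]

# A backprop update for the same weight would instead look like
#     weights[j][i] -= lr * delta[j] * pre[i]
# where delta[j] is an error signal computed from layers downstream of this
# one -- exactly the long-range information the brain seems to lack.
```

The open question is whether rules built only from such locally available terms (plus global neuromodulatory signals) can scale to the learning problems that backpropagation handles routinely.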

The overall architecture of the brain is also quite different from what is presently used in state-of-the-art AI models. Large language models, as well as many other modern AI systems, rely heavily on transformer blocks, which have no apparent direct analogue in the brain. The brain may have figured out how to do essentially the same thing as a transformer in a different way, but it is at least just as likely that the brain is doing something completely different.

A hallmark of the brain’s architecture, in contrast to most networks used in AI, is its pervasive recurrence. Whereas most artificial networks are essentially feedforward, in many areas of the brain we observe extensive connections between neurons within the same processing layer, as well as projections to neurons that are not part of a clear linear processing hierarchy. The brain’s recurrence is not just an architectural feature; it turns the brain from a static input-output machine into a constantly-changing dynamical system, with complex activity at every instant, across multiple time scales. This dynamical aspect may be essential to the brain’s operation and potentially indicates a qualitatively different computational paradigm than feedforward networks.
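A minimal illustration of the difference (weights and sizes chosen arbitrarily): a feedforward map returns the same output every time it sees the same input, while a recurrent network’s activity keeps evolving even when the input is held fixed, because each state depends on the previous one:

```python
import math

def recurrent_step(h, x, W_rec, W_in):
    """One step of a recurrent network: the new state depends on the
    previous state h as well as the input x, so a fixed input can still
    drive ever-changing activity."""
    n = len(h)
    return [math.tanh(sum(W_rec[i][j] * h[j] for j in range(n)) + W_in[i] * x)
            for i in range(n)]

h = [0.0, 0.0]
W_rec = [[0.0, 0.9], [-0.9, 0.0]]   # rotational coupling between the two units
W_in = [1.0, 0.0]
trajectory = [h]
for _ in range(5):
    h = recurrent_step(h, 1.0, W_rec, W_in)  # same input every step...
    trajectory.append(h)                     # ...yet the state keeps changing
```

A purely feedforward network, by contrast, is a single fixed function of its input; it has no state for history to live in.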

To the extent that processing hierarchies do exist in the brain, we still don’t understand how many of them function. The cerebral cortex, even with its stereotyped 6-layer architecture, is still fundamentally a mystery. Why 6 layers? What does each layer do? How does information actually flow in the cortex? How do learning and plasticity work in the different layers? Understanding the computational function of the architecture of the cortex may be akin to understanding transformers in LLMs – that is to say, it would reveal the fundamental building blocks of human cognition. Similar explorations to understand the computational meaning of the canonical circuit architecture are necessary for every region of the brain.

Related to the question of the architecture of the cortex is the purpose of the pantheon of cell types that exist in the brain and are localized to particular brain regions, including cortical layers. Why are brain cells segregated into excitatory and inhibitory neurons, as opposed to allowing plasticity to determine the effect of each synapse individually? Do excitatory and inhibitory neurons have equal and opposite roles, or is there a deeper difference between the parts they play in the brain? Why do different neurons have different dendritic morphologies? And what is the computational role of dendrites? The bipartite structure of pyramidal neurons, whether they appear in the cortex, hippocampus, or elsewhere, strongly suggests that their role is to somehow integrate information across layers within a brain region, or between nearby brain regions – but what exactly is the nature of this information integration? Why did biology decide to take cells with this particular property and spread them all over the brain?

Zooming out to the level of brain regions, the brain seems to be structured in a more modular manner than a typical AI system, and it is not clear what each module does, or the principles underlying how different modules are meant to interact with each other. We do, of course, have a vague sense of the behavioral role of each anatomical region, but we still do not fully understand how those roles translate into a connectionist computational implementation. Moreover, we do not know which brain regions compute and learn in essentially the same way, and which modules operate on wildly different principles.

In early sensory systems, such as the visual system of the fly, the neural networks used to perform essential visual computations are engineered in a highly specific manner to accomplish a particular task, and are not necessarily emblematic of how “generic” computations in the brain work. On the other hand, generic learning and computation systems are clearly necessary for human cognition, or we wouldn’t be able to learn to do tasks that have no precedent in our evolutionary history, such as computer programming. But what is the rule and what is the exception? Is the brain a general purpose multimodal learning system with a few custom-designed bits, or is it mostly a collection of hyper-engineered, task-specific modules, one of which happens to be a general purpose learning system? Figuring out how all the pieces of the brain fit together into a cohesive whole is still a major task for neuroscience.

The AI revolution has produced - and will continue to produce - profound changes in our world and how we perceive our place in it. Now that humans are no longer the only entities in the world that can claim possession of ‘general intelligence’, we are forced to grapple anew with what it means to be human. As neuroscientists, we play a unique role in this struggle to define and differentiate ourselves from artificially intelligent systems, as it is our job to understand and explicate the human mind.

I have sometimes been asked to explain ‘on one foot’ how the brain works. I often answer that the brain is basically a deep artificial neural network, and the rest is commentary. Today, though, this is not sufficient. Artificial neural networks give us a vocabulary with which to understand computation in the brain and provide a powerful proof of concept that connectionism works. But we must go beyond this. Neuroscience needs to be able to give a clear answer with respect to both the similarities and the differences between AI and our brains. At least for the moment, the minds of man and machine have not yet converged.
