The complexity of the nervous system is so great, its various association systems and cell masses so numerous, complex, and challenging, that understanding will forever lie beyond our most committed efforts.
—Santiago Ramón y Cajal,
Histology of the Nervous System of Man and Vertebrates
Ramón y Cajal was a great scientist but seemingly not much of an optimist when it came to neuroscience achieving its highest aim: to understand the components of the nervous system and how they together give rise to behavior.
To be fair, he was there in the earliest days of our field, looking out upon the vast, rather complex frontier ahead. Ramón y Cajal and his contemporaries were just beginning to realize how complex the nervous system really is. They spent hours looking through the microscope, cataloguing the seemingly endless variety of neurons across different brain regions. Ramón y Cajal drew horizontal cells in the retina, pyramidal cells in the cortex, and Purkinje cells in the cerebellum. He saw all of these cells and thought: wow, this is incredibly complex. He couldn’t conceive how we’d ever understand it because he couldn’t imagine the tools we’d one day have to do so.
We’ve come a long way, but the more pertinent question for you, as an interested participant in our field, is: Where are we headed? If you ask neuroscientists in the field, a few major themes emerge.
The mind is now open to us in ways that exceed the wildest dreams of poets and philosophers. Why not peer inside?
—Steven Johnson, Mind Wide Open
In 2013, the president of the United States said “neurons” in an official White House speech. I, for one, was thrilled. Barack Obama was giving a speech to launch the BRAIN Initiative, which he motivated by the fact that we were still in the (relatively) early stages of understanding how the brain works. He pledged that the BRAIN Initiative would change that by “giving scientists the tools they need to get a dynamic picture of the brain in action.” The BRAIN Initiative specifically offers funding for projects that would expand our ability to noninvasively measure brain activity or genetically modify the brain. It has funded many big, important projects, but perhaps its most significant aspect is a recognition that neuroscientists need better tools to study the brain, particularly in humans.1
Many other countries have followed suit. Neuroscience research centers in Japan have placed a focus on nonhuman primate research, specifically with marmosets. In 2014, the Japanese Science Ministry launched a project known as Brain/MINDS to advance the use of marmosets for neuroscience research.2 In 2016, China began a large initiative called the China Brain Project.3
The effort to build tools to study the brain depends significantly on our computing power. Each year, computers get faster and devices get smaller. Moore’s Law, the observation that the number of transistors on a chip (and with it, computing power) doubles roughly every two years, benefits scientific research as well. More computing power translates into the ability to collect and analyze ever-larger datasets.
In recent years, due to the BRAIN Initiative and the overall recognition of our need for tools, the technology in our field has advanced significantly. Technology has revolutionized neuroscience in multiple ways:
■ We’re simultaneously recording from more parts of the brain.
■ We are performing more complex experiments that bridge neural circuits and behavior.
■ We’re standardizing methodological approaches, which can mean homing in on a smaller set of model organisms.
■ We’re compiling bigger and bigger datasets.
■ We’re analyzing our data with more sophisticated algorithms.
■ We’re using supercomputers to model neural activity.
Ultimately, these advances bode well for our ability to understand how the brain works. But as with any change, there are growing pains.4 Many researchers worry that we’re too focused on improving our technology when we should instead be thinking about how we ask our questions. There are plenty of questions we can answer with the current technology. But new tools are like new toys, and no one wants to get stuck recording from ten neurons when they could be recording from a hundred.5
Historically, recording from one cell at a time helped us make major leaps in the field. In the mid-twentieth century, David Hubel and Torsten Wiesel recorded from single cells in a cat’s visual cortex to determine that our brain interprets the world in single lines before it builds shapes and objects. Rare recordings from human patients have taught us a remarkable amount, including that someone can have a neuron that really likes Jennifer Aniston’s face.6 Single-cell recordings are still useful for certain experimental questions, for example, when you need detailed information about the changes in one neuron’s membrane potential. However, it is likely that there is more behaviorally relevant information in circuits of neurons than in any single neuron.7
With more technology, and therefore an increase in the depth and complexity of our data, neuroscience is outgrowing its twentieth-century training wheels.
If you want to know how the brain works, you need tools that can record from as much of the brain as possible and with as much detail as possible. As a result, neuroscience, like many other fields of biology, is getting bigger. We’re collecting more and more data at unprecedented rates: terabytes upon terabytes of images and recordings stored on hard drives and servers at labs across the globe. If you happen to be looking for a bunch of hoarders, well, you’ve hit the gold mine with neuroscientists.
We’re already generating a ton of data in separate labs, but many neuroscientists think the pace of our data collection will continue to accelerate until it earns the “big data” label: basically, any dataset too big to fit on a standard hard drive. Why do we have so much data? Largely because we’re getting better at collecting it. The resolution of our microscopes has gone up, the cost of sequencing has gone down, and our ability to record from and manipulate large swaths of neurons is now being fully realized. We’re obtaining bytes of data on every single gene, protein, neuron, and brain area, from both typical and atypical brains. Organizations like the Allen Institute for Brain Science alone generate staggering amounts of data, probably enough to occupy an entire generation of PhD students.
Let’s say you image an entire mouse brain (about the size of the distal phalanx of your thumb) with a technique known as electron microscopy. This technique shoots electrons at the tissue and collects the ones that bounce back, giving us an idea of how dense the tissue is at each location. By scanning the electron beam back and forth, you can build an image of a chunk of brain with nanometer resolution, fine enough to visualize subcellular structures like mitochondria. If you did this for an entire mouse brain, you’d end up with about 500 petabytes of data.8 Not surprisingly, the labs that do this don’t image an entire brain; they image small chunks at a time, and there’s more than enough to keep us busy in just a tiny brain chunk.9
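Where does a number like that come from? Here is a back-of-envelope sketch in Python. The brain volume, voxel dimensions, and bytes per voxel are rough, illustrative assumptions rather than the specs of any particular study, but they land in the same ballpark as the figure above:

```python
# Back-of-envelope estimate of the raw data volume for whole-brain
# electron microscopy. All numbers are rough, illustrative assumptions.

mouse_brain_mm3 = 500        # approximate mouse brain volume, in mm^3
voxel_nm = (4, 4, 40)        # assumed EM voxel: 4 x 4 nm in-plane, 40 nm sections
bytes_per_voxel = 1          # 8-bit grayscale images

nm3_per_mm3 = (1e6) ** 3     # 1 mm = 1e6 nm, so 1 mm^3 = 1e18 nm^3
voxel_volume_nm3 = voxel_nm[0] * voxel_nm[1] * voxel_nm[2]

total_voxels = mouse_brain_mm3 * nm3_per_mm3 / voxel_volume_nm3
petabytes = total_voxels * bytes_per_voxel / 1e15

print(f"~{petabytes:.0f} petabytes of raw data")  # on the order of hundreds of PB
```

Tweak any of these assumptions and the total shifts, but it stays in the hundreds of petabytes: far more than any single lab can store, let alone analyze.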
We’re not only getting a wider look at the brain but also a deeper, more complex, more comprehensive look. In fact, some have argued that the main benefit of our technology revolution will be information-rich datasets that can tell us a lot about one specific brain circuit or behavior.10 For many questions, recording high-quality data from hundreds of neurons simultaneously may be necessary.
For example, the cells in our primary visual cortex that respond to lines also have very specific preferences for the orientation of those lines. Drift the lines in one direction, or change the spacing between them, and you’ll find that cells care about speed and spatial frequency, too. Until 2019, we didn’t know whether they also had specific preferences for color. Finding the orientation, speed, and spatial frequency preferences of a particular cell already requires a huge stimulus set; add different colors of those lines, and it’s even bigger. But if you present all of this while imaging hundreds of neurons at a time, you’re much more likely to find neurons that respond. With this approach, researchers at the Salk Institute discovered that neurons in our primary visual cortex do care about color: textbook information that wouldn’t have been attainable without the ability to record hundreds of neurons simultaneously.11
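To see how quickly such a stimulus set grows, here is a quick tally. The grid sizes below are made up for illustration; real experiments choose their own sampling of each dimension:

```python
# Illustrative tally of a visual stimulus set. The grid sizes below are
# invented; real experiments sample each dimension however they see fit.

orientations = 8       # e.g., 0 to 157.5 degrees in 22.5-degree steps
speeds = 5             # drift speeds
spatial_freqs = 6      # spacings between the lines

conditions = orientations * speeds * spatial_freqs
print(conditions)            # 240 distinct stimuli before color enters the picture

colors = 4                   # add a color dimension and the set balloons
print(conditions * colors)   # 960 stimuli, each needing repeated presentations
```

Multiply by the repeated presentations needed for reliable averages, and showing all of this to a single neuron at a time becomes hopeless; imaging hundreds at once is what makes the experiment feasible.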
The types of data we are collecting are also becoming more diverse. For example, someone might take images of neural activity while also recording the electrical activity of a cell, which yields both images and electrical traces to work with. Other researchers collect neural recordings while animals perform a task, and then relate those recordings to the connectivity or cell types of the neurons involved. Each of these datasets is rich in itself. Neuroscience experiments increasingly come in multiple formats and contain heterogeneous types of data, so one of the challenges on our horizon is standardizing imaging, behavior, and electrophysiology file formats so that we can more readily analyze and share them.
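One real effort along these lines is Neurodata Without Borders (NWB), a standardized file format for neurophysiology data. A minimal sketch using its Python API, pynwb, might look like the following; all the metadata values here are placeholders, not a prescribed schema:

```python
# A minimal sketch of packaging data in the NWB format with pynwb.
# All metadata values below are placeholders.
from datetime import datetime, timezone

from pynwb import NWBFile, NWBHDF5IO, TimeSeries

nwbfile = NWBFile(
    session_description="example recording session",  # placeholder
    identifier="session-001",                         # placeholder
    session_start_time=datetime.now(timezone.utc),
)

# Attach a (toy) voltage trace as an acquisition time series
trace = TimeSeries(
    name="membrane_potential",
    data=[0.10, 0.12, 0.11],  # real traces would be long arrays
    unit="volts",
    rate=30000.0,             # sampling rate, in Hz
)
nwbfile.add_acquisition(trace)

# Write data and metadata together into one shareable file
with NWBHDF5IO("session-001.nwb", "w") as io:
    io.write(nwbfile)
```

The appeal of a standard like this is that the recording, its units, its sampling rate, and the session metadata travel together in one file that any lab’s analysis code can open.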
The fields of astronomy and physics understand what it means to collaborate with big experiments and correspondingly huge datasets, largely because of the nature of their work.
If you want to know about space, you need a huge telescope operated by an entire team of engineers. The Hubble Space Telescope collects about a terabyte of data each year, which is analyzed by numerous labs internationally.12 All told, over 15,000 papers have been published with Hubble data.13
If you want to know how subatomic particles collide, you need a big hadron collider. CERN’s Large Hadron Collider amasses staggering amounts of data: about 30 petabytes per year.14 This data is openly available and analyzed by many physicists across the globe.15 In 2015, an enormous physics collaboration at the collider broke the record for the most authors on a scientific paper, with 5,154.16
For many years, physicists have recognized that big questions require big scientific collaborations, and neuroscientists are beginning to adopt this idea as well. Some argue that the field will inevitably move in this direction, such that fewer people collect data and many more people analyze it. According to Karel Svoboda, “There will be fewer data providers (experimentalists) and more folks doing data analysis, modeling and theory. This change will be dramatic. Ultimately the field will resemble astronomy more than classical biology.”17
There are a few examples of large, collaborative efforts currently underway in neuroscience. One is the Allen Institute for Brain Science, which is taking a large-scale, team-science approach to collecting, analyzing, and sharing various types of data on the development, anatomy, and function of both mouse and human brains.18 There’s also the International Brain Laboratory, an attempt to standardize an approach to studying decision-making in rodents. They’re developing a perceptual decision-making task for mice to perform. Then, they’ll record from a variety of brain regions with a few different techniques and piece it all together to tell us how the brain decides.19
Not only do we have more and more data, but this data is also increasingly available to the public. I may be a bit of an idealist, but I do think that neuroscience is becoming increasingly open, in terms of data sharing, open-access papers, and the distribution of protocols and experimental designs.20 This is largely thanks to the open-access movement, which holds that since taxpayers pay for scientific research, they should have access to the resulting publications and data. For instance, many journals now require that authors make their data publicly available once papers are published.
Big data, small labs, and the structure of science
It’s unclear whether this acceleration of data collection and increasing collaboration means that most of neuroscience is going to get bigger or that there will just be some larger collaborative working groups in addition to many small labs.
Dr. Justin Kiggins, a product manager at the Chan Zuckerberg Initiative with experience producing and distributing large-scale neuroscience data, has pretty strong feelings about this:
Systems neuroscience is going to enter a true “big data” era. There will be major advances in electronics and imaging, increasing the sizes of populations recorded and enabling new multimodal datasets. However, these “bleeding edge” advances won’t simply trickle down to your average lab. While the cost of getting this data will become much cheaper, the operational cost of storing and managing data of such scale will make it very hard for single labs to take advantage of it. I see a growing disparity between glam labs that can afford the overhead costs of data management and everyone else.21
Others feel differently. Dr. Konrad Kording, an investigator at the University of Pennsylvania, thinks that thanks to open data, software, and hardware, the whole way that we do science is going to change from large labs that do everything from data collection to analysis, to specialized groups that collaborate:
A community of small, specialized labs can be far more productive than a smaller number of vertically oriented glam labs. A lab can be very successful by focusing on one aspect, and being really good at that. Take my lab—we barely ever run experiments or develop hardware, but we collaborate with people who do. We are being quite successful as a relatively small group, simply by not acting like a glam lab. We are a crucial piece of collaborations between often relatively small groups.
Lastly, perhaps there will be space for both small labs and bigger labs. This is the opinion of Florian Engert: “I propose that there is equal space and opportunity for both: corporate-style/industrial-size science as well as the individual, small-scale, cottage industry style.”22
Only time will tell how this all plays out—whether neuroscience continues to advance primarily in large, glam labs or in smaller specialized groups that are collaborating on big problems.
As a result of our ever-increasing data fervor, neuroscience needs people who can tackle large datasets. That means that you, even with just a bit of coding experience, can take on your own research projects. It also means that we need people who can move us from data, to information, to knowledge. Folks who can see the forest for the trees, and who can guide data collection and analysis, are going to be increasingly important in a data-rich neuroscience age.
Looking forward, there are a few opportunities that will arise here.
Designing large-scale projects
This is where the hard work is—in formulating precisely the question of what we actually want to know, what an answer would look like, and what kind of insight we can take away from the experiment.
—Florian Engert, “The Big Data Problem”
Big projects can cost a lot of time and money; without the right experimental and conceptual frameworks, they’ll just be big wastes of time. We need people (principal investigators or project managers) who can design experiments and plan data collection to ask pressing biological questions. For example, how do networks of thousands of neurons work together to generate an image of a corgi and drive our hand to go pet it? How do the 3.2 billion base pairs in our genome vary in individuals who have been diagnosed with schizophrenia or autism? Why does a mouse brain have about as many neurons as there are sheep in Australia?23 Theorists who think deeply about conceptual frameworks of brain function are absolutely essential here.
Wrangling big data
Data is getting easier to collect but not necessarily easier to analyze. The second opportunity is that we need folks who can help organize and annotate datasets. These data come in diverse forms, from images to MATLAB arrays to R structures. Neuroscience needs people who understand these different forms and can wrangle them into useful formats that others can access. These people are often called data scientists or data architects. As we’ll see in part 4, the ability to work with large datasets will serve you in many fields beyond neuroscience, and it’s an increasingly valued skill set.24
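To make “wrangling” concrete, here is a minimal, hypothetical sketch using pandas. The directory layout and column names (neuron_id, spike_time_s) are invented purely for illustration:

```python
# A minimal wrangling sketch with pandas. The directory layout and the
# column names (neuron_id, spike_time_s) are hypothetical.
from pathlib import Path

import pandas as pd

frames = []
for path in Path("data/sessions").glob("*.csv"):  # hypothetical per-session files
    df = pd.read_csv(path)                        # columns: neuron_id, spike_time_s
    df["session"] = path.stem                     # tag each row with its session
    frames.append(df)

# One tidy table: one row per spike, with session metadata attached
tidy = pd.concat(frames, ignore_index=True)

# Now anyone can query it uniformly, e.g., spike counts per neuron per session
counts = tidy.groupby(["session", "neuron_id"]).size()
print(counts.head())
```

The point isn’t this particular snippet; it’s the habit of turning a pile of mismatched files into one consistent, documented table that collaborators who never collected the data can still use.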
Applying our theories about the brain to these data
One of the major thrusts has to be analysis of complete nervous systems at the level of anatomy, neural dynamics, and theory. In the next 15 years, we will have a complete accounting of the cell types of the brain; the connections between these cell types; the ability to measure activity in defined cell types at scale…We don’t yet have the conceptual framework on how to think about this data.
—Karel Svoboda
The third opportunity is that we need people who can analyze such datasets once they have been collected. Although thoughtful data collection will be guided by a set of questions, these datasets will very likely have more to offer. Computational and theoretical neuroscientists can also play a significant role in guiding data analysis and modeling possible threads in the data.
Traditionally, if you were a purely theoretical neuroscientist, you would need to ask individual labs to share their data. However, now there are more and more well-curated datasets out there for scientists to analyze without having a direct relationship with individual labs. So with the increasing open sharing of data, there is less of a barrier to entry for computational or theoretical neuroscientists.
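As a toy illustration of what this kind of analysis looks like, here is a sketch that estimates an orientation tuning curve. The simulated responses stand in for a downloaded open dataset; the shapes, names, and numbers are all illustrative:

```python
# A toy analysis sketch: estimating an orientation tuning curve. The
# simulated responses below stand in for a downloaded open dataset.
import numpy as np

rng = np.random.default_rng(0)
orientations = np.arange(0, 180, 22.5)  # 8 stimulus orientations, in degrees

# Simulated trials x orientations spike counts, peaked near 45 degrees
rates = 5 + 10 * np.cos(np.deg2rad(2 * (orientations - 45))) ** 2
responses = rng.poisson(lam=rates, size=(50, len(orientations)))

tuning = responses.mean(axis=0)              # mean response per orientation
preferred = orientations[np.argmax(tuning)]  # estimated preferred orientation
print(f"Preferred orientation: {preferred} degrees")
```

An analyst who never touched the microscope can run exactly this kind of computation on a well-curated public dataset, which is the whole promise of open data.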
Both data wrangling and analysis will require some knowledge of code and programming, and the latter will require a bit more math. Neuroscience, coding, and math have long been acquaintances, but they’re quickly becoming really good friends. These tasks will also require folks who can peruse data that they didn’t collect themselves and assess its potential. On the flip side, researchers who are providing open-source data should be ready to effectively communicate its potential to eager data miners.
It’s a keen idea to consider data science and analytics as viable career paths. In order to get there, it’ll be good to get friendly with coding yourself as early as you can. There’s more on this topic in part 3, as well as the data science section of part 4.