Imagine every feature-length film that’s ever been made, in every language that’s ever been spoken, saved on a single server. You’d have a cinematic “superlibrary” comprising about four petabytes of data.
Now imagine having 150 tiny robots conducting and recording up to 2 million experiments per week in an effort to create a map of human biology and molecular interactions at the cellular level. That’s what Salt Lake City-based clinical-stage biotechnology company Recursion has been doing since its inception in 2013, and the effort has yielded a data set approaching 20 petabytes, roughly five times the size of that cinematic superlibrary. The company has knocked out every gene in the human genome, in multiple human cell types, and profiled nearly a million molecules at multiple doses in a single cell type.
On episode 129 of the Business of Biotech podcast, Recursion cofounder and CEO Chris Gibson, Ph.D., likened the company’s effort to create a map of human biology to Google’s ongoing effort to map every street on the planet. Having broad maps of biology is important, he says, because without them, it’s difficult to understand how everything is related at the cellular level. “I think we only understand 2% or 3% of biology as a species today,” he surmised.
At the end of all of those experiments, Recursion takes the same pictures, with the same camera and the same stains, to ensure the data it records are comparable across experiments. “Essentially, everything we do every week, at the base level of the company, is to build a map, and that map becomes the thing that our scientists navigate to novel relationships,” Gibson said. That “map” is dubbed the Recursion OS.
SO YOU’VE GOT BIG DATA. NOW WHAT?
With such a massive data set at its disposal, Recursion is undertaking large-scale sequencing initiatives, exploring proteomic data sets, and using machine learning to extract what Gibson says are ultra-sensitive and specific readouts of physiology and potential toxicity to guide its predictions about potential therapeutic molecules. “The second we get a picture of a potential drug on a human cell, could we make a prediction about what that drug might do in the context of an animal model? I think, if you get enough data, you’ll be able to make solid predictions in that way across the entire stack.”
Fully acknowledging that no army of human scientists could make a meaningful dent in the analysis of that much data, Recursion employs machine learning (ML) to detect meaningful molecular interactions. “It’s what ML tools are really good at. They don’t make a drug, but they give us insights about the relationships across this huge complex data set that our scientists can key off to say, ‘You know what? There’s a gene in this pathway that the world doesn’t know about yet.’”
The company’s ambitious drug discovery initiative is yielding clinical results. Recursion’s proprietary oncology program is eight candidates deep. It has two more mid-clinical programs in rare disease and one in C. difficile colitis, and it’s partnering on another dozen projects in oncology, neuroscience, inflammation and immunology, and rare disease. A major partnership with Roche’s Genentech, for example, is enabling the might of big biopharma to leverage the Recursion OS data set in exploration of up to 40 programs in neuroscience and oncology.
We talked with Gibson about Recursion’s bleeding-edge approach to ML-enabled understanding of human biology for nearly an hour. Give the episode a listen for a glimpse into the future of drug discovery.