NEW YORK – A newly launched research project aims to build the world's largest database of spatial omics data in hopes of advancing cancer biomarker and drug discovery through bioinformatic analysis. Lead sponsor Owkin also sees it as a means to position itself as a "drug discovery enterprise" rather than a diagnostics-focused company, according to Joseph Léhar, the firm's senior VP of R&D strategy.
To reach that goal, Owkin needs to access "distinct datasets that are different from what anyone else has … because it gives you an opportunity to discover targets that haven't yet emerged," Léhar said.
The program, called Multi Omic Spatial Atlas in Cancer, or MOSAIC, will collect 7,000 tumor samples — far more than any current spatial omics dataset, according to Owkin — then use spatial omics techniques to analyze the samples and discover immuno-oncology disease subtypes to assist in biomarker and drug discovery.
MOSAIC is the centerpiece of a coalition that Paris-based artificial intelligence and bioinformatics firm Owkin recently formed with NanoString Technologies and groups at European and American research institutions to employ spatial omics technologies for advanced cancer research. Owkin also committed $50 million to the effort over the next 10 years.
NanoString brings spatial profiling technologies to the MOSAIC research program, particularly spatial transcriptomics and proteomics.
COO Hubert Chaperon said that R&D is a "core business" of Owkin now, so the $50 million investment is rather small compared to the cost of running clinical trials. Still, the company has not ruled out seeking grants or other investors to support MOSAIC in the future.
"It's actually a reasonable investment to set up for the number of trials that we'd like to undertake," Léhar said.
The firm is certainly well-capitalized. Following a $180 million equity investment by Sanofi in 2021, Owkin claimed "unicorn" status, the term for a privately funded startup valued at more than $1 billion.
Last year, Owkin formed collaborations with Stanford Medicine and ADC Therapeutics to apply AI to molecular and other data to inform drug and biomarker discovery for other cancers. This work so far has not included any spatial omics, so MOSAIC represents new territory for the French firm.
If the program works very well, Owkin might expand the scope of MOSAIC by adding either cancer indications or new technologies to the mix, according to Chaperon.
Owkin and NanoString debuted MOSAIC at the annual American Society of Clinical Oncology (ASCO) meeting in Chicago earlier this month. Participating research organizations are Institut Gustave Roussy in France, the University of Pittsburgh Medical Center's Hillman Cancer Center, Lausanne University Hospital in Switzerland, and Uniklinikum Erlangen/Friedrich-Alexander-Universität Erlangen-Nürnberg and Charité-Universitätsmedizin Berlin in Germany.
The MOSAIC study will focus on what Owkin labeled "difficult-to-treat" cancers, covering seven indications: non-small cell lung cancer, triple-negative breast cancer, diffuse large B-cell lymphoma, ovarian cancer, glioblastoma, mesothelioma, and bladder cancer. The coalition will lean on Owkin's AI-based data science technology as well as NanoString's GeoMx digital spatial profiler and CosMx spatial molecular imager platforms.
Léhar said that the motivation behind MOSAIC was the desire to improve understanding of tumor-immune interactions. With rare exceptions including PD-L1 and CTLA4 inhibitors, there are not many drugs now that target tumor-immune interaction.
"But those were only able to be discovered because they are so ubiquitous across the whole tumor," he explained.
The prevailing method of taking a biopsy and creating a single gene-expression profile of a tumor is "highly informative," according to Léhar, but it still has shortfalls. "If you have different things happening in different parts of the tumor, you'll miss most of the … interesting interactions," he said.
This has led to the growth of spatial technologies that essentially perform gene-expression profiling in many locations across a tumor.
"But the major problem [with spatial omics approaches] is that they're very expensive and they're difficult to work with," Léhar said. MOSAIC aims to sort through some of those difficulties and reduce cost per patient by scaling up the amount of data available.
Owkin and its partners want to generate spatial omics profiles on 7,000 patients in the next two years. Then the focus will turn to research, though Chaperon said that research on the datasets could begin before data generation is complete.
Léhar said that current spatial omics datasets do not have more than a few dozen samples. "When you have got a few dozen cases, you understand how different pockets of the human population respond to the cancer in the tumor-immune sense," he said.
Thousands of cases per indication would open up new research possibilities and lead to new discoveries, according to Léhar.
Goals of the MOSAIC program include uncovering novel insights into tumor-immune systems, potentially discovering subtypes of cancer indications, and, ultimately, developing new therapies. "The endgame is better treatment, particularly in the context of personalized medicine, by identifying subtypes of patients within each indication," Chaperon said.
A joint steering committee will meet quarterly to set the direction for how MOSAIC research evolves, including when and how the founding partners bring in additional participants. Exact specifications of research projects will be in the hands of the academic partners. Owkin, as the sole initial funding source for MOSAIC, will have more committee voting rights than the other partners, according to Chaperon.
He said that Owkin wants to bring in at least one more US-based research institution, and he expects to receive inquiries as word gets out about the MOSAIC study.
A key partner yet to be announced will be responsible for both RNA and exome sequencing. "Our goal is to have [sequencing] as centralized as possible" in an effort to standardize processes and avoid bias from batch effects, Chaperon said.
Wet-lab operations will not be completely centralized, however, as MOSAIC likely will need to have sequencing on both sides of the Atlantic due to the European Union's General Data Protection Regulation (GDPR) and other regulations, according to Chaperon.
Léhar noted that data generation will be a two-step process, with both actual sequencing of biological samples and then spatial omics processing. Each institution will be responsible for collecting samples and running the spatial omics on NanoString instruments.
Each of the five founding academic partners will be responsible for generating at least 1,000 samples, and there may be additional centers brought in later.
Raphael Gottardo, director of the Biomedical Data Science Center at Lausanne University Hospital (known by its French acronym, CHUV), said that the goal of 7,000 total samples "is unheard of" in spatial transcriptomics.
Gottardo, the principal investigator for MOSAIC at CHUV, said that he and his counterparts at the other four research centers have been working with Owkin for months to choose the indications, define the key questions they want answered, and select the technologies they would need to reach their goals.
Gottardo said that CHUV has a center of excellence in spatial transcriptomics that relies on NanoString's GeoMx, though his lab uses 10x Genomics technologies such as Visium as well. Gottardo is interested in benchmarking technologies, so he said that CHUV will be participating in MOSAIC in part to determine if 10x, NanoString, or some other company's technology is best for spatial omics work.
"My guess right now is that there probably will be a combination of different platforms because … depending on the tissue type, depending on the questions, maybe one will be better than the other," he said.
Gottardo, a French national who holds a PhD in statistics from the University of Washington, returned to Europe in 2021 after working as a cancer-focused computational biologist at Fred Hutchinson Cancer Center in Seattle.
He has long worked at the nexus of statistics, machine learning, and high-throughput biology, starting from when microarrays were the state of the art in gene expression research and eventually progressing to single-cell sequencing and spatial technologies. Fred Hutch partnered with NanoString, 10x Genomics, and other genomics instrumentation suppliers.
True to his background, Gottardo said he is particularly interested in the computational aspects of MOSAIC. "But obviously, the big thing that we're after is the biology," he added.
"I think there's been a lot of hype around spatial biology, and I think it's clear that it's going to be super informative," Gottardo said. But it is hard to know for sure without larger datasets what the benefits will be or whether it will be more useful than technologies such as single-cell transcriptomics.
"Do we need to do one technology or do we need to do multiples? Do you need to have single-cell RNA-seq and spatial transcriptomics?" Gottardo wondered. "How much can you learn from just traditional [hematoxylin and eosin] staining?"
CHUV has only used CosMx with public data, not with any data generated in house. That is mostly because NanoString had a manufacturing backlog when the Swiss hospital expressed interest in being a beta tester for the technology, according to Gottardo.
Stéphanie Tissot, general manager of the university's Immune Landscape Laboratory, has a lot of experience with GeoMx and has generated some pilot and benchmarking data for Owkin. "We're really trying to inform them across the board in terms of technologies, benchmarking, testing, and even analysis of the data," said Gottardo, the only PI at any of the MOSAIC centers with a background in computational technology.
Owkin's 10-year commitment is for creating the datasets, maintaining the technology platform, and eventually pushing the datasets into the public domain.
Owkin's Chaperon said there will be a "research embargo" for as long as 10 years, during which time only MOSAIC partners will have access to the datasets, though a portion of the data could be publicly released sooner. That decision also will rest with the joint steering committee, and Chaperon said that public research datasets must comply with GDPR.