NEW YORK – Through a cross-country collaborative effort dubbed the Latin American Genomics Breast Cancer Consortium (LAGENO-BC), researchers are pooling resources to illuminate breast cancer risk variants and molecular features unique to Hispanic and Latina populations.
Though all the studies that fall under LAGENO-BC's aegis have their own aims and protocols, the collaborators are connected by an overarching goal to share data and samples across projects. So far, LAGENO-BC has amassed data from seven countries and 13 different studies involving about 14,000 breast cancer cases and 18,000 controls. Soon, collaborators in four more countries will be joining, adding 15 more studies and thousands more cases and controls. The consortium's current goal is to include about 32,000 cases and 28,000 controls.
There are no hard and fast rules for the type of studies collaborators must be working on to join LAGENO-BC. The only requirement is to share data from a minimum of 200 breast cancer patients with the larger consortium, which in turn enables overdue genetic research in Latin American populations that have been historically left out of breast cancer risk studies.
It's no secret that genome-wide association studies (GWAS), which are the main resource used to generate polygenic risk scores for certain cancers and other inheritable diseases, have been rooted in overwhelmingly white, European populations. In breast cancer, a 2018 analysis showed that more than 130,000 breast cancer cases had been included in GWAS discovery studies, with 89 percent reflecting European ancestry, and only 2 percent reflecting Latin American ancestry.
Genetic researchers, recognizing that European-based findings aren't always transferable to diverse populations across the globe, have accordingly been trying to catalyze GWAS efforts in non-European populations, including in Latin America.
But conducting these studies can be difficult, especially when they involve international coordination, explained Laura Fejerman, an associate professor and director of the Women's Cancer Research and Care Program at the University of California, Davis. Fejerman, the lead coordinator for LAGENO-BC, is herself Argentinian.
"It's obvious that it's important to look at diverse populations to understand breast cancer risk in different parts of the world," Fejerman said during a presentation of the consortium's efforts at the American Association for Cancer Research (AACR) annual meeting earlier this month.
She highlighted how the timing and direction of human migrations hundreds of thousands of years ago paved the way for subpopulations across continents to develop differences in allele frequencies associated with breast cancer and other complex traits. Although these genetic differences between populations were known, Fejerman said, it took seven years after the first breast cancer GWAS in European populations for the field to conduct the first GWAS involving women from Hispanic populations.
That GWAS, conducted in 2014, included about 11,000 people. The size was small relative to European studies, and it therefore offered only a fraction of the statistical power. In 2015, by contrast, a meta-analysis to assess breast cancer risk using GWAS in Europeans included data from 120,000 people. In subsequent years, "we were able to expand it a bit," Fejerman said, sharing that the next Latin American breast cancer GWAS in 2019 included 14,000 cases and controls.
These studies generated interesting, albeit limited, findings. For example, in 2014, Fejerman and colleagues published GWAS findings in Nature Communications describing a risk variant near the estrogen receptor 1 (ESR1) gene. They found that the minor allele of this variant, which is protective against breast cancer, originates from Indigenous Americans. Two SNPs, which Fejerman's team found to be relatively common in Latin American populations, are rare or absent in other groups, including Europeans, suggesting that they are specific to populations with Indigenous American ancestry.
These findings were proof of principle. As Fejerman noted, "we did this with a tenth of the sample size used for studying breast cancer in Europeans, and [still] we discovered a variant with a strong odds ratio." The reason these variants hadn't appeared in prior research, she added, was that those studies included so few non-European subjects.
For researchers to continue to illuminate findings like these, and determine their meaning for breast cancer risk and patient care among Latin American populations, much larger efforts are needed. To detect variants similar to what's been detected in Europeans, she said, "we need alleles to be at least 20 percent allele frequency," and such variants cannot be found without large numbers of patients from specific ancestries.
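The link between sample size, allele frequency, and statistical power can be illustrated with a rough calculation. The sketch below is not the consortium's method. It uses a standard normal approximation for a per-allele (additive) case-control test and assumes an even split of cases and controls, a hypothetical odds ratio of 1.1, and the conventional genome-wide significance threshold of 5e-8. The function name `gwas_power` and all of its parameters are illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def gwas_power(odds_ratio, allele_freq, n_cases, n_controls, alpha=5e-8):
    """Approximate power of a two-sided per-allele association test.

    Normal approximation under an additive model; intended only to show
    how power scales with sample size and allele frequency.
    """
    n_total = n_cases + n_controls
    case_fraction = n_cases / n_total
    beta = log(odds_ratio)  # per-allele log odds ratio

    # Expected test statistic: effect size divided by its approximate
    # standard error, which shrinks as sample size and allele
    # frequency variance 2p(1-p) grow.
    information = (
        n_total * case_fraction * (1 - case_fraction)
        * 2 * allele_freq * (1 - allele_freq)
    )
    expected_z = abs(beta) * sqrt(information)

    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the observed statistic clears the threshold
    # in either tail.
    return std_normal.cdf(expected_z - z_crit) + std_normal.cdf(-expected_z - z_crit)


# Hypothetical comparison using the study sizes mentioned above,
# assuming (for illustration only) equal numbers of cases and controls.
small_study = gwas_power(1.1, 0.20, 5_500, 5_500)    # about 11,000 people
large_study = gwas_power(1.1, 0.20, 60_000, 60_000)  # about 120,000 people
rare_allele = gwas_power(1.1, 0.05, 60_000, 60_000)  # same size, rarer allele
print(f"small: {small_study:.4f}  large: {large_study:.4f}  rare: {rare_allele:.4f}")
```

Under these assumptions, the smaller study has very little chance of detecting a modest-effect variant even at 20 percent allele frequency, while the larger one detects it almost certainly. Lowering the allele frequency reduces power again even at the larger sample size.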
"We're doing GWAS in Latina populations and not finding a lot right now," she said. "[But] it's not because [these findings] aren't there … It's because we are underpowered to find them."
Joining forces across Latin America
Although Fejerman's lab is based in California, the LAGENO-BC consortium draws its strength from collaborations across Latin America.
After the initial GWAS that revealed the protective variant associated with breast cancer, Fejerman shared that her team looked at data from the 1000 Genomes Project and found the highest frequency of this allele in the Peruvian population. This was the impetus for Fejerman to approach the Instituto Nacional de Enfermedades Neoplasicas (INEN) in Peru.
"We started collaborating and putting together a breast cancer cohort," Fejerman recounted. "By now, we have recruited more than 2,000 participants, [and] we have genome-wide genotype data for almost all of them."
Through this collaboration, the teams began organizing a GWAS to further explore this protective variant and analyze how it differentiates risk across different breast cancer subtypes. They soon realized they would also need genetic information from a control group, a cohort of women without breast cancer. While the INEN did not have this dataset, Fejerman found a collaborator at Harvard Medical School who was studying the genetics of pregnancy outcomes in a Peruvian population.
"We started collaborating so that we could look at genetic risk comparing the cases that we have from INEN and the controls that we had [from Harvard]," she said, adding the caveat that the control group included genetic information from women who were significantly younger, but otherwise the demographics were quite similar.
Using these data, the researchers tested out a 313-SNP breast cancer polygenic risk score. The predictive power of that score, which was based on discoveries in Europeans, improved greatly once she and her team added in the protective, ancestry-specific variants, Fejerman told Precision Oncology News.
One of the biggest lessons from these findings, she said, has been the importance of adding ancestry- and population-specific variants to improve the applicability of breast cancer PRS. "All our findings around PRS suggest that … identifying population-specific variants, such as the protective SNP near the estrogen receptor 1 gene, is important to further increase the predictive power of the PRS in Hispanics/Latinas."
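In its simplest form, a polygenic risk score is a weighted sum: each variant's per-allele log odds ratio multiplied by the number of effect alleles a person carries. The sketch below shows how adding one population-specific protective variant changes such a score. It is a toy illustration, not the 313-SNP score or the team's analysis. The SNP names, odds ratios, and the helper `polygenic_risk_score` are all hypothetical.

```python
from math import log


def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele counts (0, 1, or 2 per variant).

    Variants missing from `dosages` are treated as 0 copies, which is a
    simplification made for this sketch.
    """
    return sum(weight * dosages.get(snp, 0) for snp, weight in weights.items())


# Hypothetical per-allele weights (log odds ratios) standing in for a
# score built from European discoveries.
base_weights = {
    "snp_a": log(1.10),
    "snp_b": log(0.95),
    "snp_c": log(1.05),
}

# The same score extended with one hypothetical ancestry-specific
# protective variant (odds ratio below 1, so a negative weight).
extended_weights = {**base_weights, "protective_snp": log(0.60)}

# Hypothetical genotype: one copy of the protective allele.
person = {"snp_a": 1, "snp_b": 2, "snp_c": 0, "protective_snp": 1}

base_score = polygenic_risk_score(person, base_weights)
extended_score = polygenic_risk_score(person, extended_weights)
print(f"base: {base_score:.3f}  extended: {extended_score:.3f}")
```

For a carrier of the protective allele, the base score overstates risk because it has no term for that variant. The extended score is lower by exactly the variant's weight times the number of copies carried.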
Latin America is far from homogeneous: populations in some countries have very little Indigenous American ancestry, while those in others have a great deal. For this reason, it's important to have research cohorts that reflect that diversity and include patients from countries across Latin America.
"There's potential for improving the PRS by doing more fine mapping in the Latin-American population," Fejerman said during her AACR presentation. "We definitely need to find the Indigenous American-specific variants."
Due to variations in cancer registries across Latin American countries, as well as significant resource gaps, collaboration is the best way to find these patients, she reflected. The LAGENO-BC consortium includes researchers from various countries in Latin America as well as those in the US recruiting populations that self-identify as Latina or Hispanic.
Some of the studies may begin in the US and incorporate foreign populations, as was the case with the Harvard study of Peruvian pregnancies. Others, like the Cáncer de Mama (CAMA) study, a population-based case-control study of breast cancer risk that began in Mexico, later incorporated US-based Latin American cohorts.
The specific aims of the studies under the LAGENO-BC umbrella range from analyses of breast cancer molecular subtypes across countries to more focused, country- and ancestry-specific analyses of breast cancer genetic heterogeneity. For instance, in the Latin American Cancer Research Network-Molecular Profile of Breast Cancer Study (LACRN-MPBCS), researchers are trying to identify characteristics associated with breast cancer prognosis by analyzing molecular, genetic, and clinical data from patients in Mexico, Chile, Brazil, Argentina, and Uruguay. In the Puerto Rico Breast Cancer Genetics and Genomics Study (PUR-BCGG), meanwhile, investigators are mapping the genetic heterogeneity in Puerto Rican breast cancer patients according to ancestry.
"This is a population that has a unique history of migration," Julie Dutil of Ponce Health Sciences University in Puerto Rico explained in an overview of the PUR-BCGG study at AACR. "It's a three-way admixed population that originated from the admixture of the Spanish, African, and Indigenous Americans, and we really want to see how that history can influence the risk of breast cancer."
As LAGENO-BC grows, the hope is that investigators will finally have large enough research cohorts to discover new breast cancer risk variants specific to Latin American women, home in on underlying disease mechanisms, and improve polygenic risk scores in Latinas, including ancestry-specific and country-specific scores.
LAGENO-BC, of note, is partially supported through a larger breast cancer genetic research effort called Confluence, through which the NIH is working to assemble some 300,000 cases and 300,000 controls from all ancestries to study breast cancer risk in a large and representative population. "Luckily, [the NIH] want to make sure that there is diversity, so they are giving us some support to try to get this [LAGENO-BC] consortium going," Fejerman said during AACR. She later explained to Precision Oncology News that the Confluence program specifically supports the salary of a dedicated LAGENO-BC data manager as well as the cost of shipping samples from labs in the US and Latin America to an NCI lab conducting genome-wide genotyping on the same Illumina platform. That genotyping project will continue until the end of this year.
"Any other cost, such as the program manager for LAGENO-BC, I am paying for using my own startup funds at UC Davis," she said, adding that she'd love to apply for R01 funding from the NCI for specific aims based on the LAGENO-BC data going forward. "Ideally, we would want to be able to apply to a mechanism focused on consortium infrastructure, but this type of grant does not currently exist," Fejerman added.
Challenges, future directions
As the consortium continues to grow, the researchers will have to grapple with several challenges, including finding a way to secure this infrastructure funding and ensure sustainability. Especially among the Latin American investigators in countries with fewer resources, funds are already inadequate to support efforts to gather additional patient data to make their datasets richer.
"Resources are very limited in Latin America," Fejerman said during AACR. "It's never been good, and it's even harder after COVID. Many countries are struggling … We need to try to support investigators in Latin America so we can actually work together."
Logistics are also tough to manage across countries. Coordinating meetings and communicating effectively require infrastructure that the consortium is still working to build. And because the studies in the consortium have disparate aims and protocols, it is challenging to study gene-environment interactions when, for example, some groups lack data on environmental exposure or other risk factors.
Ultimately though, the benefits of pooling resources to build a large sample size with diverse representation outweigh all of the logistical barriers, Fejerman said.
"There are pros and cons … but I think it's worth it in the end because we couldn't do it alone," she said. "The sustainability and growth of an effort such as this require funding and support from institutions that are interested in diversifying data … so we can learn about everyone and provide precision medicine for everyone independent of their ancestry."