Research

Research in the Genetic Epidemiology and Risk Assessment Program focuses on gaining a better understanding of the causes and risk factors behind cancer development and cancer outcomes.

These scientific findings are translated into improvements across the entire continuum of human cancer, including primary prevention, better risk stratification, screening and diagnosis, clinical management and treatment, quality of life, and long-term survivorship.

Investigators in the Genetic Epidemiology and Risk Assessment Program conduct research in several areas:

  • Studies of families to identify genes that cause familial and hereditary forms of cancer to help clinicians provide appropriate genetic counseling to patients.
  • Studies to identify how genes, environment and lifestyle factors act alone and in combination to increase or decrease cancer risk.
  • Studies that explore the epidemiology of conditions that lead to cancer or conditions that are high-risk markers for cancer development. This information helps researchers better understand how cancer arises, find new ways to identify people at increased risk and help prevent these conditions from leading to cancer.
  • Using molecular epidemiology to better define subtypes of cancer based on molecular characteristics of a tumor, and relating these differences back to genetic and environmental risk factors.
  • Studies that help researchers understand the role of genetic variation in cancer prognosis. Study results are used to better individualize cancer treatment — increasing treatment efficacy and decreasing toxicity — and to identify new treatment approaches.
  • Incorporating tumor biomarkers into studies to better understand their significance to underlying cancer biology and to determine how they might be used as targets for new drug therapies and as predictors of response to therapy.
  • Leveraging large patient cohorts to study predictors of medical outcomes related to cancer survivorship, such as second cancers, cardiovascular disease, bone health and cognitive function; health-related quality of life, sexual health, physical and psychosocial functioning; and health behaviors over the long term.

Research goals

The Genetic Epidemiology and Risk Assessment Program has three main goals:

  • Understand genetic factors, environmental factors and gene-environment interactions in the causes of cancer in human populations
  • Understand the molecular epidemiology of cancer outcomes and survivorship
  • Develop and apply novel statistical and informatics methods for the design and analysis of genetic and molecular epidemiology studies

Understanding genetic factors, environmental factors and gene-environment interactions in the causes of cancer in human populations

Research advances in this area include:

  • Reporting that probands with familial pancreatic cancer carry more mutations in BRCA1, BRCA2, PALB2 and CDKN2A than do probands with nonfamilial pancreatic cancer. Mutations in these genes were present regardless of familial pancreatic cancer status. These genes are now included on genetic testing panels available to physicians.
  • Co-leading the non-Hodgkin's lymphoma genome-wide association study initiative that identified multiple novel loci for chronic lymphocytic leukemia, follicular lymphoma, diffuse large B-cell lymphoma and marginal zone lymphoma.
  • Demonstrating that there is no association of rs61764370, an inherited variant residing in a KRAS 3′ UTR microRNA binding site, with either risk or clinical outcome in breast cancer or ovarian cancer. This finding led to a recommendation not to use this test in clinical practice.
  • Linking germline variants encoding miRNA to colorectal cancer subtypes and determining that telomere length is associated with age at onset of colorectal cancer.
  • Defining a molecular subclass of rectal cancer based on the correlation of chromosomal instability, telomere length and telomere maintenance in microsatellite-stable rectal cancer.
  • Providing evidence that the biological mechanisms of lung cancers that develop in ever-cigarette smokers versus never-cigarette smokers differ at the genomic and cellular levels.
  • Identifying disparities in breast density awareness and knowledge by race/ethnicity, educational level and income, supporting targeted efforts to improve breast density awareness and knowledge.
  • Delineating the epidemiology, genetics and biology of monoclonal gammopathy of undetermined significance (MGUS) and its progression to multiple myeloma.
  • Finding that 52 percent of monoclonal B-cell lymphocytosis cases had nonsynonymous driver mutations, with some mutations detected 41 months prior to progression to chronic lymphocytic leukemia.
  • Demonstrating that a THBS2 and CA19-9 blood marker panel measured with a conventional ELISA can be used to improve the detection of patients at high risk of pancreatic ductal adenocarcinoma.
  • Demonstrating that the established risk factors for pancreatic cancer, including smoking, diabetes, family history of pancreatic cancer and obesity, also applied to early-onset pancreatic cancer (before age 60), whereas alcohol use was age-dependent and most strongly associated with very early-onset pancreatic cancer (before age 45).
  • Demonstrating the need for more-sensitive lung cancer screening criteria by reporting that the proportion of patients with lung cancer in Olmsted County in Minnesota meeting the U.S. Preventive Services Task Force screening criteria decreased significantly from 1984 to 2011, with only 37 percent of female patients and 50 percent of male patients guideline-eligible for screening by CT during the recent interval.

Understanding the molecular epidemiology of cancer outcomes and survivorship

Research advances in this area include:

  • Classifying gliomas into five subgroups based on three tumor-based biomarkers (TERT promoter mutation, IDH mutation and 1p/19q codeletion). The five subgroups have different ages of onset, overall survival and associations with germline variants, implying that they have distinct mechanisms of pathogenesis. The new World Health Organization guidelines for the classification of gliomas now incorporate these three molecular markers.
  • Identifying loci at 5q23.2 and 6q21 as novel markers associated with poor outcome for patients treated with immunochemotherapy.
  • Reporting for the first time that patients with early-stage breast cancer with one HLA-DRB1*07:01 allele are at greater risk of lapatinib-induced liver injury, providing evidence of a potential genetic biomarker to predict liver injury and suggesting an underlying immune pathology.
  • Observing that genetic variation in MAPK8IP1 and SOCS3 is associated with better survival in patients undergoing potentially curative resection of pancreatic cancer.
  • Finding that ratios of CD8+ T cells to CD4+CD25+ FOXP3+ and FOXP3- T cells are associated with overall survival among patients with ovarian cancer.
  • Identifying cytokines and free light chains as independent prognostic biomarkers in Hodgkin's lymphoma, diffuse large B-cell lymphoma, follicular lymphoma, mantle cell lymphoma and peripheral T-cell lymphoma.
  • Finding that pretreatment peripheral blood markers, reflecting host immune conditions, predicted post-treatment survival outcomes in both non-small cell lung cancer and small cell lung cancer.
  • Identifying and validating higher topoisomerase IIα expression as independently associated with higher risk of cancer-specific death among patients undergoing surgery for clear cell renal cell carcinoma. This resulted in the launch of a clinical-grade immunohistochemistry test for topoisomerase IIα to help guide post-operative surveillance of patients with renal cell carcinoma.
  • Identifying four de novo transcriptional subtypes for high-grade serous ovarian cancer, the most common ovarian cancer subtype, and independently validating each as having prognostic relevance and implications to guide therapeutic decisions.
  • Reporting for the first time that tumor hypomethylation at 6p21.3 is associated with longer time to disease recurrence, suggesting that an epigenetically mediated immune response impacts recurrence or treatment response.
  • Developing a diagnostic lineage test based on genomic rearrangements from mate-pair sequencing that demonstrates promise for distinguishing independent primary from metastatic disease in lung cancer.
  • Reporting that, despite a low actual risk level, 75 percent of women undergoing axillary node dissection and 50 percent of women undergoing sentinel node biopsy during breast surgery had persistent worry about lymphedema at one-year follow-up, and that this worry was associated with a variety of psychosocial impacts, including reduced sexual satisfaction. This work has led to new referral service lines to provide targeted counseling to women who are concerned about lymphedema.
  • Finding clinically meaningful higher quality of life in multiple domains for lung cancer survivors who reported being physically active compared to those who were not physically active.

Developing and applying novel statistical and informatics methods for the design and analysis of genetic and molecular epidemiology studies

Research advances in this area include:

  • Showing, in a survival study of pancreatic ductal adenocarcinoma, that a time-varying Cox model found no benefit of metformin use after diagnosis, whereas a conventional Cox model suggested a protective effect. This finding highlighted the importance of patient selection and appropriate statistical analysis when studying medications and cancer survival.
  • Finding that DNA extraction method influences measured telomere length, which can introduce bias and has implications for the design and interpretation of epidemiology studies.
  • Finding that patients with diffuse large B-cell lymphoma who are event free (no relapse or re-treatment) at 24 months after diagnosis have subsequent overall survival equivalent to that of the age-matched and sex-matched general population, whereas those with early failures have a very poor outcome. This finding has implications for patient management and clinical trial design.
  • Developing a new likelihood-ratio test for genetic pleiotropy that provides a formal testing framework to determine the number of traits associated with a genetic variant while accounting for correlations among the traits.
  • Developing PatternCNV, a novel method to detect DNA copy number variations from whole-exome sequencing data, and reporting that this method has a higher sensitivity and specificity than do existing algorithms.
  • Developing RVBoost to obtain higher-quality reads. RVBoost uses a boosting method to train a model of good-quality variants using common variants from HapMap and then prioritizes and calls the RNA variants based on the trained model.
  • Developing Cepip, a joint-likelihood method for estimating a genetic variant's regulatory probability in a cell/tissue context-dependent manner.
  • Developing MutD, a text mining system to more effectively extract associations between protein mutations and disease.
  • Developing an open-source medication extraction and normalization tool for clinical text, called MedXN, that extracts comprehensive medication information with high accuracy and good normalization.
  • Developing a natural language processing negation algorithm called DEEPEN, which reduces the number of incorrect negation assignments for patients with positive findings, thereby improving the identification of patients with specific findings in electronic health records.
  • Developing a data element repository approach that provides a standards-based semantic infrastructure to enable machine-readable quality data model data element services to support electronic health record-driven phenotype algorithm authoring and execution.
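
The metformin result in the list above depends on how exposure time is counted. The sketch below is only an illustration of that general point and is not the program's analysis: it uses simple death rates per person-month instead of a Cox model, and the four-patient cohort, the Patient type and the function names are all hypothetical. It compares an "ever user" classification, which credits all follow-up of eventual users to the drug, with a time-varying classification, which counts the months before a patient starts the drug as unexposed.

```python
import math
from typing import NamedTuple, Optional


class Patient(NamedTuple):
    follow_up: float              # months from diagnosis to death or censoring
    died: bool                    # True if follow-up ended in death
    drug_start: Optional[float]   # months from diagnosis to first use; None if never


def _rate_ratio(deaths, months):
    """Exposed death rate divided by unexposed death rate (per person-month)."""
    exposed = deaths["exposed"] / months["exposed"]
    unexposed = deaths["unexposed"] / months["unexposed"]
    return exposed / unexposed


def ever_user_rate_ratio(patients):
    """Fixed classification: all follow-up of an eventual user counts as exposed."""
    months = {"exposed": 0.0, "unexposed": 0.0}
    deaths = {"exposed": 0, "unexposed": 0}
    for p in patients:
        group = "exposed" if p.drug_start is not None else "unexposed"
        months[group] += p.follow_up
        deaths[group] += int(p.died)
    return _rate_ratio(deaths, months)


def time_varying_rate_ratio(patients):
    """Time-varying classification: months before first use count as unexposed."""
    months = {"exposed": 0.0, "unexposed": 0.0}
    deaths = {"exposed": 0, "unexposed": 0}
    for p in patients:
        if p.drug_start is None or p.drug_start >= p.follow_up:
            months["unexposed"] += p.follow_up
            deaths["unexposed"] += int(p.died)
        else:
            months["unexposed"] += p.drug_start
            months["exposed"] += p.follow_up - p.drug_start
            deaths["exposed"] += int(p.died)
    return _rate_ratio(deaths, months)


# Hypothetical cohort, invented for illustration only.
COHORT = [
    Patient(follow_up=6, died=True, drug_start=None),
    Patient(follow_up=6, died=True, drug_start=None),
    Patient(follow_up=30, died=True, drug_start=12),
    Patient(follow_up=30, died=True, drug_start=12),
]

if __name__ == "__main__":
    print("ever-user rate ratio:   ", round(ever_user_rate_ratio(COHORT), 3))
    print("time-varying rate ratio:", round(time_varying_rate_ratio(COHORT), 3))
```

In this invented cohort the ever-user comparison gives a rate ratio of about 0.2, which looks protective, while the time-varying comparison gives 1.0, which shows no difference. The gap arises because users had to survive 12 months before starting the drug, and the fixed classification counts those months as drug-exposed survival time.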