Statistical Concepts

A broad range of statistics-related modules are available online.

Mayo Clinic employees: Enroll now

  1. Find CCaTS online modules in My Learning.
  2. Click "Learning" and search by course title or course code.

General concept modules

'Study Designs Commonly Used in Clinical Research' (502E00CMS110020)

This module is designed for the investigator needing assistance in determining the design of a research study. Participants learn the fundamental concepts of a variety of study designs used in research studies, as well as the strengths and weaknesses of the designs. Participants explore the different randomization techniques and when it's best to use each.

'Avoiding Statistical Pitfalls' (502E00CMS090016)

Statistical tests are often abused and misused, which can result in study findings that are incorrect and sometimes misleading. This module reviews common pitfalls and explains how to avoid them.

'Basics of Statistics' (502E00CMS090017)

Many research studies involve the use of quantitative statistical tests. Researchers who understand how statistical tests work — including what they can and cannot do — are better poised to critically review medical literature and design and execute studies.

'Clinical Data Management' (502E00CM110025)

This module explores the dynamic relationships among people, processes and technology in clinical data management. It is helpful for anyone with an interest in improving data quality in research studies, particularly anyone involved in the conduct of a study under an investigational new drug application or investigational device exemption.

Participants develop techniques to better ensure quality data while gaining an appreciation for how federal regulations may affect decisions made with respect to the chosen data management plan.

'Data Basics: Understanding and Illustrating Research Data' (502E00CMS110019)

This module introduces various data types commonly seen in research studies. Proper identification of the data type is essential to summarize and analyze the data appropriately. The target audience is learners with minimal training in statistics who are new to working with data.

The module discusses summary measures for qualitative and quantitative data, including measures of central tendency ("averages") and variation (the "spread" of the data). The module also illustrates ways that data can be graphically displayed to uncover important attributes or limitations of the data.
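
As a rough illustration of the summary measures mentioned above, the following Python sketch computes two measures of central tendency and one measure of variation using only the standard library. The data values are hypothetical and are not taken from the module; they were chosen so that one extreme value shows how the mean and median can disagree.

```python
from statistics import mean, median, stdev

# Hypothetical lengths of stay in days; the last value is an outlier.
lengths_of_stay = [2, 4, 4, 5, 7, 9, 40]

center_mean = mean(lengths_of_stay)      # sensitive to the outlier
center_median = median(lengths_of_stay)  # resistant to the outlier
spread_sd = stdev(lengths_of_stay)       # sample standard deviation

print(f"mean = {center_mean:.2f}")
print(f"median = {center_median}")
print(f"standard deviation = {spread_sd:.2f}")
```

When a single extreme value pulls the mean well above the median, as it does here, the median is often the more representative "average" to report.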

'Working With a Statistician' (502E00CMS090018)

Most researchers rely on statisticians to help them analyze their results. What information should you take to your statistical consultation? You and your statistician will both be glad you learned the answer in this module.

'Improve Statistical Reporting by Avoiding Common Mistakes' (502E00CMS130050)

While researchers strive to have their research presented in the literature clearly and without errors, there are many common mistakes that occur when reporting study results. With the widespread use of computers in research, these statistical mistakes are often not a result of computational errors, but rather mistakes in statistical reporting that are more subtle in nature.

This module provides examples from the literature and experiential knowledge regarding common mistakes.

'Designing Effective Pilot Studies' (502E00CMS130052)

Pilot studies are initial investigations that form the basis for further research, and the aims of these pilot studies must be appropriately focused on estimating variability as well as demonstrating feasibility. Designing such pilot studies can be a challenge due to the limited amount of data available. In this module, simple approaches are proposed to help formulate pilot study designs.

'Research Protocols: Guide to Success' (502E00CMS120037)

Scientific advances are based on reproducible science. At the heart of reproducible science is the research protocol. This module discusses guidelines for protocol preparation consistent with regulatory requirements and best practices. Participants also learn to differentiate a research protocol from a grant application.

The module emphasizes the importance of specific and measurable outcome measures in the context of mandatory reporting requirements (for example, results reporting). Additionally, it highlights institutional resources that can assist investigators with protocol preparation.

'Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm' (502E01CATS15093)

Data presentation is an essential skill for scientists. Figures are critically important because they often show the data supporting key findings. However, a visually appealing figure is of little value if it is not appropriate for the type of data being presented. This module teaches participants how to select the right type of figures when presenting continuous data in small sample size studies.

The module's systematic review of research articles published in top physiology journals shows that most papers presented continuous data in bar and line graphs. This is a problem, as many different data distributions can lead to the same bar or line graph. The full data may suggest different conclusions from the summary statistics. Papers rarely include scatterplots, box plots and histograms that allow readers to critically evaluate continuous data.

The module directs learners to several resources that allow them to quickly make univariate scatterplots for small sample size studies, including Excel templates and instructions for GraphPad Prism. It also provides links to blog posts from other investigators that show how to make univariate scatterplots and box plots in R. Finally, the module discusses steps that students and investigators can take to improve the quality of data presentation in published papers.
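
The point that different distributions can produce the same bar graph can be checked with a small sketch. The two samples below are hypothetical (they do not come from the module) and were constructed to share the same mean and standard deviation, so a bar showing mean and error bars would look identical for both, even though one sample is symmetric and the other contains an outlier.

```python
from statistics import mean, median, stdev

# Hypothetical small samples with identical mean and standard deviation.
group_a = [3, 4, 4, 6, 6, 7]  # roughly symmetric
group_b = [4, 4, 4, 5, 5, 8]  # clustered low, with one high value

for name, values in (("A", group_a), ("B", group_b)):
    print(
        f"group {name}: mean = {mean(values):.2f}, "
        f"sd = {stdev(values):.3f}, median = {median(values)}"
    )
```

Only a display of the individual points, such as a univariate scatterplot, would reveal that the two groups differ in shape.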

Hypothesis testing, power and sample size modules

'Sample Size and Power Considerations: Precision and Hypothesis Testing' (502E00CMS110028)

This module outlines basic principles of sample size estimation for research studies. It combines concepts of statistical precision with statistical assumptions to yield a foundation that can be generalized to more-complex research settings.

'T-Tests and ANOVA Models' (502E00CMS110021)

T-tests and ANOVA models are statistical methods commonly used in research. As is detailed in this module, they are routinely used to compare mean values between two or more groups.

The module introduces each test by providing the rationale for the test and the proper interpretation of the results. Participants should already be familiar with basic calculations such as means, standard deviations and proportions.

'Common Statistics to Compare Two Proportions' (502E00CM110026)

This module is intended for researchers who seek a greater understanding of the statistics that are used to summarize the difference in two proportions. These statistics, which are collectively known as measures of association, are readily calculated from summary data, provided the data are tabulated as described in the module.

Relative risk and odds ratio — two of the most common measures of association for independent groups — are the focus of this module. The proper interpretation of each measure is provided along with examples illustrating the necessary calculations.
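
A minimal Python sketch of the two calculations is shown below. The 2x2 layout (rows for exposed and unexposed groups, columns for event and no event), the function name and the counts are assumptions made for illustration; the module may tabulate the data differently.

```python
def measures_of_association(a, b, c, d):
    """Relative risk and odds ratio from an assumed 2x2 layout.

                 event   no event
    exposed        a        b
    unexposed      c        d
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    relative_risk = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)
    return relative_risk, odds_ratio


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the event.
relative_risk, odds_ratio = measures_of_association(20, 80, 10, 90)
print(f"relative risk = {relative_risk:.2f}")  # 2.00
print(f"odds ratio = {odds_ratio:.2f}")        # 2.25
```

In this example the odds ratio is larger than the relative risk; the two measures are close to each other only when the event is rare in both groups.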

'Role of Nonparametric Statistics in Medical Research' (502E00CMS110030)

Nonparametric statistics is a branch of statistics that addresses the need for statistical methodology that does not depend on strong assumptions about the underlying distribution from which sample data are drawn. In the context of medical literature, the ubiquitous "bell-shaped curve" of the normal distribution may not match the data on hand.

Special nonparametric tests have been developed to allow for such deviations and alleviate the need for complicated transformations of the data. This module develops and discusses several commonly reported nonparametric tests as they pertain to actual data.

The module also introduces the concept of "exact" statistical tests and describes how they can be used to support the analysis of research studies, particularly those with small sample sizes. Participants should already have a basic understanding of hypothesis testing, p values and common statistical tests (such as the t-test).

'Estimating Sample Size for T-Tests and One-Way ANOVA Models' (502E00CMS130051)

This module presents the basic principles of sample size estimation for research studies that use either a t-test or an ANOVA model for the primary analysis. It combines the concepts of statistical precision with statistical errors (type I and type II errors) to yield a foundation that can be generalized to more-complex research settings.
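
To make the link between type I error, type II error and sample size concrete, here is a sketch of a common normal-approximation formula for comparing two group means. It is not necessarily the method taught in the module; the function name and the inputs (a difference of 5 units and a standard deviation of 10) are hypothetical, and an exact t-based calculation would give a slightly larger answer.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) * sigma / delta)^2,
    where alpha is the type I error rate and 1 - power is the type II error rate.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)


# Hypothetical planning values: detect a 5-unit difference when the SD is 10.
print(n_per_group(delta=5, sigma=10))  # 63
```

Halving the detectable difference roughly quadruples the required sample size, which is why the choice of a clinically meaningful difference matters so much at the planning stage.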

Multivariable method modules

'Correlations and Partial Correlations' (502E00CMS110024)

This module presents the concept of "statistical adjustment" in the context of correlations and partial correlations. A partial correlation, as is detailed in the module, is a measure of the linear association of two variables after statistically controlling (adjusting) for the effects of one or more additional variables.

These partial correlations serve as the basis for interpreting the importance of predictors in a regression model and are related to a useful measure of effect in regression analyses — namely, the partial coefficient of determination. This module develops these concepts using simple graphs and motivating examples. Learners should already have a working understanding of regression techniques and hypothesis testing.
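
The adjustment described above can be sketched for the simplest case of a single control variable, using the standard first-order partial correlation formula. The helper names and the data are hypothetical; here two measurements both rise with age, and the partial correlation asks how strongly they are related once age is controlled for.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after adjusting both for a single variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Hypothetical data: both measurements track age closely.
age = [1, 2, 3, 4, 5, 6, 7, 8]
measure_1 = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.2]
measure_2 = [0.8, 2.3, 2.9, 4.4, 4.7, 6.1, 7.3, 7.7]

print(f"unadjusted correlation = {pearson(measure_1, measure_2):.3f}")
print(f"partial correlation = {partial_correlation(measure_1, measure_2, age):.3f}")
```

With more than one control variable, the same quantity is usually obtained by correlating the residuals from regressing each variable on the controls.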

Advanced method modules

'Assessing Diagnostic Accuracy' (502E00CMS110022)

The ability to assess and interpret diagnostic accuracy is essential in both research and clinical settings. This module examines a variety of statistical measures found in the literature pertaining to screening and diagnostic tests. The module emphasizes correct interpretation of the statistical measures that summarize the diagnostic accuracy of a screening test. It also discusses the relationship between sensitivity and specificity as it relates to interpreting these results.
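
The most common of these measures can be computed directly from the counts of true and false positives and negatives. The sketch below uses hypothetical counts and a hypothetical function name; it is meant only to show the definitions, not the module's own examples.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table of test result vs. disease status."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased people who test positive
        "specificity": tn / (tn + fp),  # disease-free people who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true positives
        "npv": tn / (tn + fn),          # negative tests that are true negatives
    }


# Hypothetical screening results: 100 diseased and 300 disease-free people.
results = diagnostic_accuracy(tp=90, fp=30, fn=10, tn=270)
for name, value in results.items():
    print(f"{name} = {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the population tested.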

'Mechanics of Statistical Monitoring' (502E00CMS110029)

This module emphasizes the potential pitfalls associated with analyzing clinical trial data as it accrues during the course of the study. The concepts of "multiplicity" and the potential for an increased probability of a type I error (false positive result) serve as the foundation for the module.

The module defines statistically oriented terminology that is associated with interim monitoring (such as information fraction, stopping boundaries and alpha spending functions) and illustrates the terminology through examples. The module concludes with specific recommendations for determining the number and frequency of interim analyses in the course of a study, guidance on selecting the stopping boundary, and unique considerations for interim monitoring of safety data.
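
The multiplicity problem can be illustrated with a simplified calculation. The sketch below assumes that each look at the data is an independent test at the same significance level, which is not true of interim analyses: looks at accumulating data are positively correlated, so this calculation overstates the inflation. It is included only to show the direction of the effect, and the function name is hypothetical.

```python
def familywise_error(alpha, looks):
    """Chance of at least one false positive across independent tests at level alpha."""
    return 1 - (1 - alpha) ** looks


for looks in (1, 2, 5, 10):
    print(f"{looks:2d} look(s): {familywise_error(0.05, looks):.3f}")
```

Stopping boundaries and alpha spending functions exist to keep the overall type I error at the planned level despite repeated looks.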

'Using Propensity Scores for the Analysis of Observational Studies' (502E00CMS110023)

Propensity scores are an important statistical advancement in the analysis of observational studies. The rationale for propensity scoring stems from the ideas of a counterfactual experiment. This module explores the motivation for — and estimation and use of — propensity scores.

The module contrasts the propensity score approach with other approaches that can be considered for the analysis of observational data. Learners should already have an understanding of basic study designs, use of logistic regression and common measures of association (such as the odds ratio).

'Longitudinal Summary Statistics' (502E00CMS120036)

Many research studies gather repeated observations on participants. These repeated observations are naturally correlated within a person, and this dependency must be accounted for to ensure appropriate statistical inference. This module presents an approach called longitudinal summary statistics, which summarizes the repeated observations into a single response value.

Once this summarization process is implemented, standard statistical methods can be applied to the data. The module makes recommendations for summary statistics in accordance with a study's aim.
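
One possible summary statistic is each participant's rate of change over time. The sketch below reduces every participant's repeated measurements to a least-squares slope, leaving one value per person that a standard two-group method could then compare. The choice of slope, the function name and the data are assumptions for illustration; the module may recommend other summaries, such as a mean or an area under the curve, depending on the study aim.

```python
from statistics import mean


def slope(times, values):
    """Least-squares slope of one participant's values over time."""
    t_bar, v_bar = mean(times), mean(values)
    numerator = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    denominator = sum((t - t_bar) ** 2 for t in times)
    return numerator / denominator


# Hypothetical data: four visits per participant, two participants per group.
visits = [0, 1, 2, 3]
treated = [[10, 9, 8, 7], [12, 11, 11, 9]]
control = [[10, 10, 11, 10], [11, 12, 11, 12]]

treated_slopes = [slope(visits, values) for values in treated]
control_slopes = [slope(visits, values) for values in control]

# Each participant now contributes a single, independent number.
print("treated slopes:", treated_slopes)
print("control slopes:", control_slopes)
print("difference in mean slope:", mean(treated_slopes) - mean(control_slopes))
```

Because the within-person correlation has been absorbed into one value per participant, a t-test or a nonparametric alternative can be applied to the slopes as if they were ordinary observations.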

  • Complexity: Moderate to advanced. The successful student will have had some exposure to regression modeling and general parametric and nonparametric approaches to data analysis.
  • Presenter: Rickey E. Carter, Ph.D.