Statistics and Informatics Software

The Center for Clinical and Translational Science (CCaTS), the Division of Biomedical Statistics and Informatics, and the CCaTS Biostatistics, Epidemiology and Research Design (BERD) Resource at Mayo Clinic collaborate to offer a series of online professional development modules on statistics and informatics software.

CME: Mayo Clinic College of Medicine is accredited by the Accreditation Council for Continuing Medical Education to provide continuing medical education for physicians.

Mayo Clinic College of Medicine designates this enduring material for a maximum of 1 AMA PRA Category 1 Credit(s)™. Physicians should claim only the credit commensurate with the extent of their participation in the activity.

About JMP software

JMP, a statistical software package from SAS, is designed for dynamic data visualization. It allows study teams to obtain descriptive statistics and perform simple data analysis.

For those who need personalized assistance, the CCaTS Service Center also offers one-on-one statistical and epidemiological consultations.

All modules in this series are presented by Ross A. Dierkhising, a master's-level biostatistician who also consults through the CCaTS BERD Resource.

JMP I: Introduction to JMP for Research

"JMP Dataset Creation"

  • At the completion of this module, learners will be able to create a new dataset by entering data into a JMP table, define column properties for a variable, import a dataset from another file (such as Excel) and export a JMP dataset to another file type (such as Excel). Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Creating New Variables in JMP Datasets Using Formulas"

  • At the completion of this module, learners will be able to describe the functions of the formula editor, calculate the difference between two numeric variables, calculate body mass index from weight and height, use if/then statements, and calculate a time interval. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now
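
The calculations named in this module can be illustrated outside JMP. The following is a minimal Python sketch, not JMP formula editor syntax. The function names, the metric units and the 18.5 / 25 / 30 category cut points are assumptions made for illustration and are not taken from the module.

```python
from datetime import date


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """If/then logic: label a BMI value (cut points are an assumption)."""
    if value < 18.5:
        return "underweight"
    elif value < 25:
        return "normal"
    elif value < 30:
        return "overweight"
    return "obese"


def difference(after: float, before: float) -> float:
    """Difference between two numeric variables (for example, two visits)."""
    return after - before


def days_between(start: date, end: date) -> int:
    """Time interval between two dates, in whole days."""
    return (end - start).days
```

For example, `bmi_category(bmi(70, 1.75))` labels a person who weighs 70 kg and is 1.75 m tall, and `days_between(date(2014, 1, 1), date(2014, 1, 31))` gives the length of a follow-up interval in days.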

"JMP Dataset Manipulations"

  • At the completion of this module, learners will be able to subset rows from a dataset, sort data by one or more variables, concatenate two datasets, check for duplicate subjects in a dataset, and join two datasets. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now
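
The operations listed for this module (subset, sort, concatenate, duplicate check, join) are general data-handling steps. The sketch below shows one way to express them in plain Python. It does not reproduce JMP menus or scripting, and the tiny tables and column names (`id`, `age`, `hgb`) are invented for illustration.

```python
from collections import Counter

# Hypothetical example tables; each row is a dict keyed by column name.
visits_a = [{"id": 3, "age": 61}, {"id": 1, "age": 45}]
visits_b = [{"id": 2, "age": 52}, {"id": 1, "age": 45}]
labs = [{"id": 1, "hgb": 13.2}, {"id": 2, "hgb": 11.8}]

# Concatenate: stack the rows of two datasets that share the same columns.
combined = visits_a + visits_b

# Sort by one or more variables (here: id, then age).
combined_sorted = sorted(combined, key=lambda row: (row["id"], row["age"]))

# Subset rows that meet a condition.
over_50 = [row for row in combined_sorted if row["age"] > 50]

# Check for duplicate subjects: any id that appears more than once.
id_counts = Counter(row["id"] for row in combined)
duplicate_ids = sorted(i for i, n in id_counts.items() if n > 1)

# Join: keep one row per subject, then match lab rows on id (inner join).
unique_rows = {row["id"]: row for row in combined_sorted}
labs_by_id = {row["id"]: row for row in labs}
joined = [
    {**row, **labs_by_id[subject_id]}
    for subject_id, row in unique_rows.items()
    if subject_id in labs_by_id
]
```

Checking for duplicates before the join matters: joining a table that still contains repeated subjects would repeat their lab values as well.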

"Computation of Descriptive Statistics and How to Save Results in JMP"

  • At the completion of this module, learners will be able to describe how variable modeling types determine which statistics are computed, identify where to find specific descriptive statistics in the output, use a "by" variable to obtain descriptive statistics within groups, save output in various formats and journal an output to collate results. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

JMP II: Using JMP to Apply Statistical Methods Commonly Used in Research

"Analysis of Means Using JMP"

  • At the completion of this module, learners will be able to conduct a one-sample test of a mean, conduct a two-sample test of means, compare means from more than two independent groups and compare two dependent means. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Analysis of Proportions Using JMP"

  • At the completion of this module, learners will be able to conduct a one-sample test of a proportion, conduct a two-sample test of proportions, compare proportions from more than two independent groups, conduct a one-sample test for a multinomial distribution, compare multinomial distributions from independent groups and compare two dependent proportions. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Linear Regression and Correlation Using JMP"

  • At the completion of this module, learners will be able to fit a simple linear regression model with a continuous or categorical predictor, estimate correlation coefficients and fit a multiple linear regression model. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Logistic Regression and ROC Curves Using JMP"

  • At the completion of this module, learners will be able to fit a simple logistic regression model with a continuous or categorical predictor, construct an ROC curve from a model with one continuous predictor and assess cut-off values, fit a multiple logistic regression model, and construct an ROC curve from a model with multiple predictors and assess cut-off values. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Survival (Time to Event) Analysis Using JMP"

  • At the completion of this module, learners will be able to estimate Kaplan-Meier survival and failure curves, properly estimate the median time to event, compare Kaplan-Meier curves between groups, fit a univariate Cox proportional hazards model with a continuous or categorical predictor, fit a multivariable Cox proportional hazards model, and recognize predictors that JMP cannot handle (time-dependent covariates). Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now
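
To make the Kaplan-Meier idea concrete, here is a minimal Python sketch of the product-limit estimate and a median time to event read from it. It is an illustration only, not how JMP computes or reports these quantities. The function names are hypothetical, and the median rule used here (the first event time at which estimated survival is 0.5 or lower) is one common convention.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times  -- follow-up time for each subject
    events -- True if the event was observed, False if the subject was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events at time t
        leaving = 0    # events plus censored subjects at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                observed += 1
            leaving += 1
            i += 1
        if observed:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_time(curve):
    """First time at which survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Censored subjects do not lower the curve, but they do leave the risk set, which is why simply taking the median of the observed times would generally give a different answer from the one read off the curve.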

Bioinformatics software

These modules are intended for researchers who want to learn about the technologies available in the Medical Genome Facility (formerly the Advanced Genomics Technology Center) at Mayo Clinic and receive training on commercial and public bioinformatics software, public bioinformatics databases and genome browsers.

Effective use of bioinformatics software enables researchers to study — on a genome-wide scale — gene expression, exon composition of transcripts, protein binding sites, genotypes, gene copy number variations, DNA methylation and other molecular events.

All modules in this series are developed by Alexey A. Leontovich, Ph.D., of Mayo Clinic in Rochester, Minn., who also oversees bioinformatics consulting in the CCaTS.

To view the online modules that might best fit your needs, see the Online Module Cross-Reference Guide.

"Overview of Bioinformatics Tools"

  • During a laboratory experiment, we may obtain RNA or DNA samples, which are then processed and analyzed. This module provides an overview of the technology that translates RNA or DNA information into a digital form — that is, computer files. It focuses on methods and software that are used to analyze these files and obtain biological interpretation of the results. Originally released April 1, 2014; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Essentials of Microarray Technology"

  • High-throughput microarray technologies enable researchers to study gene expression, exon composition of transcripts, protein binding sites, SNPs, gene copy number variations and other molecular events on the genome-wide scale. There are numerous platforms, but each follows a similar basic concept that must be understood to allow researchers to design quality experiments. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Essentials of Microarray Technology: Affymetrix and Illumina Platforms"

  • High-throughput microarray technologies enable researchers to study gene expression, exon composition of transcripts, protein binding sites, SNPs, gene copy number variations and other molecular events on the genome-wide scale. Although there are multiple platforms implementing this technology, there are a number of key principles that are critical for understanding its potential as well as its limits.

    Affymetrix has additional arrays that can be utilized to ask specific questions of a sample. Illumina microarray technology has proven to be one of the best platforms for gene expression profiling, microRNA and DNA methylation profiling, and SNP detection.

    This module explains key principles and features of Affymetrix microarray technology, the background on exon arrays and tiling arrays that use Affymetrix technology, and key principles and features of Illumina microarray technology. It also provides background information on how to analyze Illumina data. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Obtaining Data and Gene Expression Profiles from Gene Expression Omnibus Microarray Database and File Decompression"

  • Gene Expression Omnibus (GEO) is a free database maintained by the National Center for Biotechnology Information (NCBI) that holds thousands of datasets from published expression data. Public microarray databases are online repositories of microarray data of different types (gene expression, exon arrays and SNPs). They are often supplied with some data analysis and/or visualization tools.

    These databases contain data generated with different microarray platforms, including spotted arrays, Affymetrix, Illumina and Agilent. Researchers can use GEO to gather preliminary data on a gene or genes of interest. This module explains how to obtain experimental data from public databases, search for genes of interest and download the corresponding data. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Using Partek Genomics Suite for Microarray Data Analysis: The Basics"

  • Partek GS is an excellent software application for the analysis of gene expression, exon composition of transcripts, copy number variation, gene annotation and more. This is an introductory-level module that covers basic functionalities of the software. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Using Ingenuity Pathway Analysis Software for Gene Pathways Analysis"

  • Ingenuity Pathway Analysis is one of the main software applications for analyzing molecular pathways and biological networks of genes and proteins, mining biological annotations, visualizing data, and reporting results. This is an introductory-level module that explains how to use the basic functionalities of the software. Originally released July 1, 2012; renewed May 6, 2013; credit expires Dec. 31, 2016.
  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Introduction to Cytoscape Software"

  • Cytoscape is a free software application for analysis and integration of gene network and gene interaction data. It is powerful software for integration and visualization of complex biological data of various types, such as complex gene networks, gene expression data (microarray, PCR or next-generation sequencing), methylation data, gene copy number variations and more.

    Cytoscape accepts various formats of gene/protein/metabolite interaction files, directly uploads data from a large number of databases, and has a large set of tools for gene/protein/metabolite functional analysis and annotation, including tools for Gene Ontology analysis.

    At the completion of this module, you will be able to install Cytoscape on your computer, load data into the software, learn main controls and tools, perform a simple analysis, and visually represent the results. Originally released Jan. 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"An Introduction to the Sequence Read Archive and Conversion of SRA Format to FASTQ Format" (New!)

  • Massively parallel sequencing technologies (next-generation sequencing, or NGS) are increasingly used to quantify gene expression, gradually replacing microarray technologies. Data generated by NGS platforms demand storage devices, data transfer methods and hardware that can efficiently handle very large volumes of data. This requirement also extends to the data analysis software.

    In this module, we will show how NGS data obtained in gene expression experiments (RNA-seq) are archived in the National Institutes of Health database. We will also explain how this data can be retrieved from the archive and used for data analysis. Originally released July 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Introduction to Integrative Genomics Viewer (IGV)" (New!)

  • Genome browsers are software for biological interpretation and visualization of data. With the advent of next-generation sequencing (NGS) technologies, genome browsers became one of the key components of analytical workflows. They can also be used to mine genomics data and visualize results obtained with other types of technologies.

    One of the leading genome browsers is Integrative Genomics Viewer (IGV), which was developed at the Broad Institute. In this module, we will show how to use IGV for visualization and analysis of NGS data. Originally released July 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Introduction to Galaxy Software"

  • Massively parallel sequencing, also called next-generation sequencing, generates massive amounts of data. Storage and analysis of these data require specialized software and hardware. Galaxy is the major free system (software and hardware) that meets those requirements.

    Galaxy has software tools for the analysis of ChIP-seq, RNA-seq and DNA-seq data (including methylation analysis), transcription factor binding analysis, genotyping analysis, copy number variation, gene expression, and gene/DNA variant detection, as well as the EMBOSS package of tools.

    In this module, you will learn how to set up a free account with Galaxy, learn main controls and learn how to import data from the UCSC Genome Browser, which is well-integrated with Galaxy. You will also learn how to do a simple analysis using Galaxy. Originally released Jan. 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Loading Data into Galaxy Software"

  • Data files generated in experiments using next-generation sequencing technology are very large, up to hundreds of gigabytes (HiSeq). Smaller files (less than 2 GB) can be uploaded into the software directly from your computer, but to upload larger files (up to 50 GB), you need to use FTP client software.

    This module shows how to upload small and large files into Galaxy. Originally released Jan. 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now
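
The size limits mentioned for this module can be written as a simple decision rule. The Python sketch below is illustrative only: it is not part of Galaxy, the function name and return labels are hypothetical, and it assumes 1 GB means 1024³ bytes. The 2 GB and 50 GB thresholds are the ones stated in the module description.

```python
GB = 1024 ** 3  # assumption: binary gigabytes

DIRECT_LIMIT = 2 * GB   # below this, upload directly from your computer
FTP_LIMIT = 50 * GB     # up to this, upload with FTP client software


def upload_method(size_bytes: int) -> str:
    """Suggest how to upload a file of the given size (labels are hypothetical)."""
    if size_bytes < 0:
        raise ValueError("file size cannot be negative")
    if size_bytes < DIRECT_LIMIT:
        return "direct"
    if size_bytes <= FTP_LIMIT:
        return "ftp"
    return "exceeds stated limit"
```

For a file on disk, the size can be obtained with `os.path.getsize(path)` and passed to `upload_method`.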

"ChIP-seq Analysis with Galaxy Software (Part 1)"

  • This module shows how to analyze changes in the methylation status of DNA using data generated with the ChIP-seq method, specifically methylated DNA immunoprecipitation (MeDIP) followed by sequencing on the Illumina Genome Analyzer IIx. We use FASTQ files as input data, so most techniques used in this analysis are applicable to other types of ChIP-seq data.

    The whole analysis involves many steps, so to make it easier to understand and learn, we divided it into three parts, each of which explains a particular analytical process. This first part shows how to identify DNA regions that have a different level of methylation in different samples. Originally released Jan. 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Analysis of Genome-Wide Methylation Pattern Using Galaxy Software (Part 2)"

  • Various experimental treatments or biological states (such as embryonic development) may affect the genome-wide methylation pattern, that is, the number of hypermethylated sites in introns, exons, promoter regions and CpG islands.

    Once differentially methylated sites (hyper- and hypomethylated) are identified, the next step in the analysis is to characterize the methylation pattern and find differences in it, that is, in the distribution of frequencies of differentially methylated sites in specified genomic regions. In this module, we explain how to do this type of analysis using Galaxy software. Originally released Jan. 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Methylation Analysis of Promoter Regions Using Galaxy Software (Part 3)"

  • Currently, it is widely accepted in the literature that methylation of promoter regions causes suppression of gene expression. Specific locations of methylated sites may affect binding of transcription factors to the promoter. This is the reason why detailed analysis of methylation of gene promoters is especially important in the context of the whole study.

    In this module, we will explain how to analyze differential methylation of promoter regions on a genome-wide scale using Galaxy software. Originally released Jan. 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now

"Preparation of FASTQ Files for RNA-seq Analysis Using Galaxy Software" (New!)

  • Data are commonly delivered as BAM or FASTQ files. It is preferable to start your analysis with FASTQ files so that you can check the quality of sequence reads and remove low-quality fragments. FASTQ files come in different flavors depending on the specifics of the sequencing platform that was used to generate them (for example, single read versus paired-end read).

    There are software tools developed specifically to convert BAM files into high-quality FASTQ files. In this module, we demonstrate how to install these software tools on your computer and process BAM files. Originally released Oct. 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now
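
As a rough illustration of what checking read quality in a FASTQ file involves, the Python sketch below reads four-line FASTQ records and keeps reads whose mean quality score meets a threshold. It is not one of the tools used in the module. The function names and the threshold of 20 are hypothetical, and the code assumes Phred+33 quality encoding, which does not hold for every sequencing platform or file.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # separator line beginning with '+'
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header[1:], sequence, quality


def mean_quality(quality, offset=33):
    """Mean Phred score of a quality string (offset 33 is an assumption)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(handle, min_mean_quality=20.0):
    """Keep non-empty reads whose mean quality is at least the threshold."""
    return [
        record
        for record in read_fastq(handle)
        if record[2] and mean_quality(record[2]) >= min_mean_quality
    ]


# Two made-up reads: 'I' encodes Phred 40 and '!' encodes Phred 0 under Phred+33.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!!!\n")
kept = filter_reads(example)
```

In this example only `read1` passes the filter. Paired-end data would need the two files filtered together so that mates stay in step, which this sketch does not attempt.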

"RNA-seq Analysis Using Galaxy Software" (New!)

  • Typically, the goal of RNA-seq analysis is to identify genes that are differentially expressed between the groups of samples being compared. The analysis can also help find alternatively spliced transcripts and alternative transcription start sites in those groups.

    Starting with FASTQ files, it takes numerous analytical steps to obtain the final result. There are multiple parameters for each algorithm used at each step of the analysis, and these parameters need to be set correctly. In this module, we walk you through the major steps of the analytical workflow and explain how to correctly set the parameters. Originally released Oct. 1, 2014; credit expires Dec. 31, 2016.

  • Mayo Clinic employees: Enroll now
  • Non-Mayo participants: Enroll now
  • Oct 21, 2014