The GeneSetScan software offers a general approach to scan genome-wide SNP data for gene-set association analyses.

The test statistic for a gene set is based on score statistics for generalized linear models and takes advantage of the directed acyclic graph structure of the gene ontology to create gene sets. The method can use other gene-set structures, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG), or even user-defined sets.

The approach of the Statistical Genetics and Genetic Epidemiology Laboratory combines SNPs into genes, and genes into gene sets, while ensuring that positive and negative effects on a trait do not cancel. To control for multiple testing of many gene sets, the lab uses an efficient computational strategy that accounts for linkage disequilibrium and for correlations among genes and gene sets, and provides accurate step-down adjusted p-values for each gene set.

See: Schaid DJ, et al. Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies. Genetic Epidemiology. 2012;36:3.


This R package performs gene-level kernel and burden tests of association between genetic variants and disease status or continuous traits, for pedigree data and for unrelated subjects.

See: Schaid DJ, et al. Multiple genetic variant association testing by collapsing and kernel methods with pedigree or population structured data. Genetic Epidemiology. 2013;37:409.


Computes Armitage trend chi-square statistics to evaluate the association of a trait with SNP genotype predictors, given a dose vector of length 3. Distributed for S-PLUS and R, with special installation steps given in the README. [01/2008]
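
As an illustration of the kind of statistic described here, the following Python sketch computes a Cochran-Armitage trend chi-square for a single SNP. It assumes a binary case/control trait and a default additive dose vector of (0, 1, 2); the function name and arguments are hypothetical and are not taken from the distributed package.

```python
import math


def armitage_trend(cases, controls, dose=(0.0, 1.0, 2.0)):
    """Cochran-Armitage trend chi-square (1 df) for a 2 x 3 genotype table.

    cases, controls: counts for the three genotypes, in the same order as dose.
    dose: scores assigned to the three genotypes (assumed additive by default).
    Returns (chi-square statistic, asymptotic p-value).
    """
    if not (len(cases) == len(controls) == len(dose) == 3):
        raise ValueError("cases, controls and dose must each have length 3")

    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)          # all subjects
    r = sum(cases)           # all cases

    sum_rx = sum(x * c for x, c in zip(dose, cases))
    sum_nx = sum(x * t for x, t in zip(dose, totals))
    sum_nx2 = sum(x * x * t for x, t in zip(dose, totals))

    denom = r * (n - r) * (n * sum_nx2 - sum_nx ** 2)
    if denom <= 0:
        raise ValueError("trend statistic is undefined for this table")

    chi2 = n * (n * sum_rx - r * sum_nx) ** 2 / denom
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Changing the dose vector, for example to (0, 1, 1) or (0, 0, 1), would correspond to dominant or recessive coding under the same formula.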


Simultaneously estimate a trait-locus position and its genetic effects for affected relative pairs (ARPs) by one of two methods: either allow a different trait-locus effect for each ARP type, or constrain the trait-locus effects according to the marginal effect of a single susceptibility locus. A goodness-of-fit statistic is included for the constrained model. README and package sources are provided for S-PLUS and R. [04/2005]


A method to compute composite measures of linkage disequilibrium, their variances and covariances, and statistical tests, for all pairs of alleles from two loci when linkage phase is unknown. It extends Weir and Cockerham (1989) to multi-allelic loci. README and package sources are provided for S-PLUS and R. [12/2006]
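
To make the idea of a composite measure concrete, here is a minimal Python sketch for the simplest case of two diallelic loci. It relies on the fact that the composite disequilibrium coefficient for a pair of alleles equals half the covariance of the per-subject allele counts, so no phase information is needed. The function name is hypothetical, and the sketch does not cover the variances, covariances, tests, or multi-allelic extension that the package provides.

```python
def composite_ld(count_a, count_b):
    """Composite LD coefficient for one allele at each of two loci.

    count_a, count_b: per-subject counts (0, 1 or 2) of the chosen allele at
    locus 1 and locus 2, taken from unphased genotypes.
    Returns half the (population) covariance of the two allele counts.
    """
    n = len(count_a)
    if n == 0 or n != len(count_b):
        raise ValueError("need two non-empty count vectors of equal length")

    mean_a = sum(count_a) / n
    mean_b = sum(count_b) / n
    cov = sum((x - mean_a) * (y - mean_b)
              for x, y in zip(count_a, count_b)) / n
    return cov / 2.0
```

A value near zero suggests no composite disequilibrium between the two alleles; for multi-allelic loci the same calculation could be repeated for each pair of alleles.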

GASSOC Software

Statistical methods for genetic associations using cases and their parents. [06/2001]


This software offers a suite of R routines for the analysis of indirectly measured haplotypes.

The statistical methods assume that all subjects are unrelated and that haplotypes are ambiguous (because the linkage phase of the genetic markers is unknown). The genetic markers are assumed to be codominant (that is, there is a one-to-one correspondence between their genotypes and their phenotypes). [updated December 2013]


Test the fit of genotype frequencies to Hardy-Weinberg Equilibrium proportions for autosomes and the X chromosome. Different statistical tests are provided, as well as an option to evaluate statistical significance by either exact methods or simulations. README and package sources are provided for S-PLUS and R. [02/2011]
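
One common test of this kind is the chi-square goodness-of-fit test for an autosomal diallelic marker. The Python sketch below is an illustration under that assumption; the source does not say which tests the package implements, and the function name is hypothetical.

```python
import math


def hwe_chisq(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test of HWE for an autosomal diallelic marker.

    n_aa, n_ab, n_bb: observed genotype counts.
    Returns (chi-square statistic with 1 df, asymptotic p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")

    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        raise ValueError("marker is monomorphic; the test is undefined")

    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

The asymptotic p-value can be unreliable when expected counts are small, which is one reason exact or simulation-based significance is offered as an option.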


Calculates an exact stratified test for HWE for diallelic markers, such as single nucleotide polymorphisms (SNPs), exact tests for HWE within each stratum, and an exact test for homogeneity of Hardy-Weinberg disequilibrium. An update for version 1.0 verifies whether the exact test for homogeneity can be computed; if not, the program calculates the p-value using an asymptotic test. Written in the C programming language and available as an executable for Linux x86_64 and Solaris, in addition to the source code. [05/2011]
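
For a single stratum, an exact HWE test can be sketched as follows in Python. The sketch conditions on the observed allele counts, enumerates every possible number of heterozygotes, and sums the probabilities of outcomes no more likely than the one observed. It is an illustration only: the function names are hypothetical, and the stratified and homogeneity tests of the C program are not reproduced here.

```python
import math


def _log_prob(n_ab, n_a, n):
    """Log probability of n_ab heterozygotes given n subjects and n_a copies of A."""
    n_b = 2 * n - n_a
    n_aa = (n_a - n_ab) // 2
    n_bb = (n_b - n_ab) // 2
    return (math.lgamma(n + 1)
            - math.lgamma(n_aa + 1) - math.lgamma(n_ab + 1) - math.lgamma(n_bb + 1)
            + n_ab * math.log(2.0)
            + math.lgamma(n_a + 1) + math.lgamma(n_b + 1)
            - math.lgamma(2 * n + 1))


def hwe_exact(n_aa, n_ab, n_bb):
    """Exact HWE p-value for one stratum of a diallelic marker."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")

    n_a = 2 * n_aa + n_ab
    n_b = 2 * n - n_a
    observed = math.exp(_log_prob(n_ab, n_a, n))

    p_value = 0.0
    # The heterozygote count must have the same parity as the allele count.
    for het in range(n_a % 2, min(n_a, n_b) + 1, 2):
        prob = math.exp(_log_prob(het, n_a, n))
        if prob <= observed * (1.0 + 1e-9):   # small tolerance for ties
            p_value += prob
    return min(1.0, p_value)
```

Log-gamma terms are used instead of factorials so that the calculation stays stable for large samples.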


A package in S-PLUS and R to test genetic linkage with covariates by regression methods, using IBD sharing among relative pairs as the response. It accounts for correlations of IBD statistics and covariates for relative pairs within the same pedigree. README and package sources are provided. [12/2006]


This software offers routines to handle family data with a pedigree object.

The initial purpose was to create correlation structures that describe family relationships, such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. This also includes a tool for pedigree drawing, which is focused on producing compact layouts without intervention.
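
To show what a kinship correlation structure looks like, the Python sketch below computes pairwise kinship coefficients with the standard recursive rule: parents are processed before their children, and a child's kinship with anyone processed earlier is the average of its parents' kinships with that person. The pedigree representation and function name are assumptions made for illustration and do not reflect the package's interface.

```python
def kinship(pedigree):
    """Pairwise kinship coefficients for an acyclic pedigree.

    pedigree: dict mapping id -> (father_id, mother_id); founders use
    (None, None). Both parents must be given, or neither.
    Returns a dict mapping (id_i, id_j) -> kinship coefficient.
    """
    order, seen = [], set()

    def visit(person):
        # Depth-first ordering so that parents always precede their children.
        if person in seen:
            return
        father, mother = pedigree[person]
        if (father is None) != (mother is None):
            raise ValueError("give both parents or neither for %r" % (person,))
        for parent in (father, mother):
            if parent is not None:
                visit(parent)
        seen.add(person)
        order.append(person)

    for person in pedigree:
        visit(person)

    phi = {}
    for idx, i in enumerate(order):
        father, mother = pedigree[i]
        for j in order[:idx]:
            if father is None:
                value = 0.0   # founders are assumed unrelated to everyone else
            else:
                value = 0.5 * (phi[(father, j)] + phi[(mother, j)])
            phi[(i, j)] = phi[(j, i)] = value
        # Self-kinship is 0.5, increased when the parents are related.
        phi[(i, i)] = 0.5 if father is None else 0.5 * (1.0 + phi[(father, mother)])
    return phi
```

For a nuclear family this gives 0.25 for parent-child and sibling pairs and 0 between the two founders; twice the kinship matrix is the usual relationship matrix supplied to a mixed effects model.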

Recent additions include utilities to trim the pedigree object with various criteria. Authors: Terry Therneau, Beth Atkinson, Dan Schaid, Jason Sinnwell, Shannon McDonnell and Martha Matsumoto. [updated December 2013]


Checks pedigrees for Mendelian errors and, when errors are found, systematically jackknifes every typed pedigree member to determine whether eliminating that member removes all Mendelian errors from the pedigree. [02/2011]
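
The leave-one-out idea can be sketched in Python as follows. This is a deliberately simplified illustration: it only checks each typed child against its typed parents at a single marker, whereas a full pedigree check would also reason through untyped relatives. The data layout and all function names are hypothetical.

```python
def trio_consistent(child, father=None, mother=None):
    """True if the child's two alleles can come one from each typed parent.

    Genotypes are 2-tuples of alleles; an untyped parent is passed as None
    and is treated as able to transmit any allele.
    """
    a, b = child

    def can_give(parent, allele):
        return parent is None or allele in parent

    return ((can_give(father, a) and can_give(mother, b)) or
            (can_give(father, b) and can_give(mother, a)))


def mendelian_errors(pedigree, genotypes):
    """Ids of typed members whose genotype conflicts with their typed parents.

    pedigree: dict id -> (father_id, mother_id), founders as (None, None).
    genotypes: dict id -> (allele, allele) for typed members only.
    """
    errors = []
    for child in sorted(genotypes):
        father, mother = pedigree.get(child, (None, None))
        if not trio_consistent(genotypes[child],
                               genotypes.get(father),
                               genotypes.get(mother)):
            errors.append(child)
    return errors


def jackknife_members(pedigree, genotypes):
    """Typed members whose removal alone clears every detected error."""
    if not mendelian_errors(pedigree, genotypes):
        return []
    resolving = []
    for member in sorted(genotypes):
        reduced = {k: v for k, v in genotypes.items() if k != member}
        if not mendelian_errors(pedigree, reduced):
            resolving.append(member)
    return resolving
```

Members returned by the jackknife step are candidates for a genotyping or sample error, since dropping any one of them leaves the rest of the pedigree consistent under this simplified check.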


Package that calculates a truncated exact test for two-stage case-control studies of rare genetic variants. The first stage screens for rare variants in cases only. If the number of case carriers of any rare variant exceeds a user-specified threshold, then additional cases and controls are genotyped for the detected variants, and the carrier status of these variants is compared for all cases and controls in the second stage. Distributed as an R package and as a stand-alone program in C. The R package contains an additional function to calculate an optimal two-stage design. [04/2010]
