Association_tests

Summary Table

NAME CATEGORY CITATION
CC-GWAS Case-case GWAS Peyrot, W. J., & Price, A. L. (2021). Identifying loci with different allele frequencies among cases of eight psychiatric disorders using CC-GWAS. Nature genetics, 53(4), 445-454.
TrajGWAS GWAS of longitudinal trajectories Ko, S., German, C. A., Jensen, A., Shen, J., Wang, A., Mehrotra, D. V., ... & Zhou, J. J. (2022). GWAS of longitudinal trajectories at biobank scale. The American Journal of Human Genetics, 109(3), 433-445.
GWAX GWAS using family history Liu, J. Z., Erlich, Y., & Pickrell, J. K. (2017). Case–control association mapping by proxy using family history of disease. Nature genetics, 49(3), 325-331.
LT-FH GWAS using family history Hujoel, M. L., Gazal, S., Loh, P. R., Patterson, N., & Price, A. L. (2020). Liability threshold modeling of case–control status and family history of disease increases association power. Nature genetics, 52(5), 541-547.
SiblingGWAS GWAS using family history Howe, L. J., Nivard, M. G., Morris, T. T., Hansen, A. F., Rasheed, H., Cho, Y., ... & Davies, N. M. (2022). Within-sibship genome-wide association analyses decrease bias in estimates of direct genetic effects. Nature genetics, 54(5), 581-592.
snipar GWAS using family history Young, A. I., Nehzati, S. M., Benonisdottir, S., Okbay, A., Jayashankar, H., Lee, C., ... & Kong, A. (2022). Mendelian imputation of parental genotypes improves estimates of direct genetic effects. Nature genetics, 54(6), 897-905. Guan, J., Nehzati, S. M., Benjamin, D. J., & Young, A. I. (2022). Novel estimators for family-based genome-wide association studies increase power and robustness. bioRxiv, 2022-10.
REGENIE Gene-based analysis (rare variant) Mbatchou, Joelle, et al. "Computationally efficient whole-genome regression for quantitative and binary traits." Nature genetics 53.7 (2021): 1097-1103.
SAIGE-GENE+ Gene-based analysis (rare variant) Zhou, Wei, et al. "SAIGE-GENE+ improves the efficiency and accuracy of set-based rare variant association tests." Nature Genetics (2022): 1-4.
SAIGE-GENE Gene-based analysis (rare variant) Zhou, Wei, et al. "Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts." Nature genetics 52.6 (2020): 634-639.
SKAT-O Gene-based analysis (rare variant) Lee, Seunggeun, Michael C. Wu, and Xihong Lin. "Optimal tests for rare variant effects in sequencing association studies." Biostatistics 13.4 (2012): 762-775.
SKAT Gene-based analysis (rare variant) Wu, Michael C., et al. "Rare-variant association testing for sequencing data with the sequence kernel association test." The American Journal of Human Genetics 89.1 (2011): 82-93.
STAAR Gene-based analysis (rare variant) Li, Xihao, et al. "Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale." Nature genetics 52.9 (2020): 969-983.
PGS-adjusted GWAS PGS-adjusted GWAS Campos, A. I., Namba, S., Lin, S. C., Nam, K., Sidorenko, J., Wang, H., ... & Yengo, L. (2023). Boosting the power of genome-wide association studies within and across ancestries by using polygenic scores. Nature Genetics, 1-8.
PGS-adjusted RVATs PGS-adjusted GWAS Jurgens, S. J., Pirruccello, J. P., Choi, S. H., Morrill, V. N., Chaffin, M., Lubitz, S. A., ... & Ellinor, P. T. (2023). Adjusting for common variant polygenic scores improves yield in rare variant association analyses. Nature Genetics, 55(4), 544-548.
Review-Povysil Review Povysil, Gundula, et al. "Rare-variant collapsing analyses for complex traits: guidelines and applications." Nature Reviews Genetics 20.12 (2019): 747-759.
BOLT-LMM Single variant association tests Loh, Po-Ru, et al. "Efficient Bayesian mixed-model analysis increases association power in large cohorts." Nature genetics 47.3 (2015): 284-290.
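EMMAX Single variant association tests Kang, Hyun Min, et al. "Variance component model to account for sample structure in genome-wide association studies." Nature genetics 42.4 (2010): 348-354.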
GEMMA Single variant association tests Zhou, Xiang, and Matthew Stephens. "Genome-wide efficient mixed-model analysis for association studies." Nature genetics 44.7 (2012): 821-824.
PLINK2 Single variant association tests Chang, Christopher C., et al. "Second-generation PLINK: rising to the challenge of larger and richer datasets." Gigascience 4.1 (2015): s13742-015.
PLINK Single variant association tests Purcell, Shaun, et al. "PLINK: a tool set for whole-genome association and population-based linkage analyses." The American journal of human genetics 81.3 (2007): 559-575.
POLMM Single variant association tests Bi, W., Zhou, W., Dey, R., Mukherjee, B., Sampson, J. N., & Lee, S. (2021). Efficient mixed model approach for large-scale genome-wide association studies of ordinal categorical phenotypes. The American Journal of Human Genetics, 108(5), 825-839.
REGENIE Single variant association tests Mbatchou, Joelle, et al. "Computationally efficient whole-genome regression for quantitative and binary traits." Nature genetics 53.7 (2021): 1097-1103.
SAIGE Single variant association tests Zhou, Wei, et al. "Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies." Nature genetics 50.9 (2018): 1335-1341.
fastGWA-GLMM Single variant association tests Jiang, Longda, et al. "A generalized linear mixed model association tool for biobank-scale data." Nature genetics 53.11 (2021): 1616-1621.
fastGWA Single variant association tests Jiang, Longda, et al. "A resource-efficient tool for mixed model association analysis of large-scale data." Nature genetics 51.12 (2019): 1749-1755.

Case-case GWAS

CC-GWAS

  • NAME : CC-GWAS
  • FULL NAME : case–case genome-wide association study
  • SHORT NAME : CC-GWAS
  • DESCRIPTION : The CCGWAS R package provides a tool for case-case association testing of two different disorders based on their respective case-control GWAS results
  • URL : https://github.com/wouterpeyrot/CCGWAS
  • CITATION : Peyrot, W. J., & Price, A. L. (2021). Identifying loci with different allele frequencies among cases of eight psychiatric disorders using CC-GWAS. Nature genetics, 53(4), 445-454.
  • CATEGORY : Case-case GWAS

GWAS of longitudinal trajectories

TrajGWAS

  • NAME : TrajGWAS
  • FULL NAME : GWAS of longitudinal trajectories
  • SHORT NAME : TrajGWAS
  • DESCRIPTION : TrajGWAS.jl is a Julia package for performing genome-wide association studies (GWAS) for continuous longitudinal phenotypes using a modified linear mixed effects model. It builds upon the within-subject variance estimation by robust regression (WiSER) method and can be used to identify variants associated with changes in the mean and within-subject variability of the longitudinal trait.
  • URL : https://github.com/OpenMendel/TrajGWAS.jl
  • CITATION : Ko, S., German, C. A., Jensen, A., Shen, J., Wang, A., Mehrotra, D. V., ... & Zhou, J. J. (2022). GWAS of longitudinal trajectories at biobank scale. The American Journal of Human Genetics, 109(3), 433-445.
  • CATEGORY : GWAS of longitudinal trajectories
  • KEYWORDS : biomarker trajectories, mean, within-subject (WS) variability, linear mixed effect model, within-subject variance estimation by robust regression (WiSER) method
  • YEAR : 2022

GWAS using family history

GWAX

  • NAME : GWAX
  • FULL NAME : genome-wide association by proxy
  • SHORT NAME : GWAX
  • DESCRIPTION : In randomly ascertained cohorts, replacing cases with their first-degree relatives enables studies of diseases that are absent (or nearly absent) in the cohort.
  • CITATION : Liu, J. Z., Erlich, Y., & Pickrell, J. K. (2017). Case–control association mapping by proxy using family history of disease. Nature genetics, 49(3), 325-331.
  • CATEGORY : GWAS using family history
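The proxy idea described above can be illustrated with a minimal sketch. It assumes one common phenotype coding in which true cases and proxy-cases (unaffected participants with an affected first-degree relative) are both coded 1 and everyone else 0. The `Participant` class, its field names, and `gwax_phenotype` are hypothetical names chosen for illustration and do not come from any published GWAX software.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    has_disease: bool            # participant's own diagnosis
    relative_has_disease: bool   # any first-degree relative reported as affected


def gwax_phenotype(p: Participant) -> int:
    """Return 1 for cases and proxy-cases, 0 for controls (assumed coding)."""
    return int(p.has_disease or p.relative_has_disease)


cohort = [
    Participant(has_disease=False, relative_has_disease=False),  # control
    Participant(has_disease=False, relative_has_disease=True),   # proxy-case
    Participant(has_disease=True, relative_has_disease=False),   # true case
]
phenotypes = [gwax_phenotype(p) for p in cohort]
print(phenotypes)
```

The resulting 0/1 vector would then be passed to an ordinary case-control association test. Whether true cases are pooled with proxy-cases or analysed separately is a study design choice that this sketch does not settle.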

LT-FH

  • NAME : LT-FH
  • FULL NAME : liability threshold model, conditional on case–control status and family history
  • SHORT NAME : LT-FH
  • DESCRIPTION : An association method based on posterior mean genetic liabilities under a liability threshold model, conditional on case-control status and family history.
  • URL : https://alkesgroup.broadinstitute.org/UKBB/LTFH/
  • CITATION : Hujoel, M. L., Gazal, S., Loh, P. R., Patterson, N., & Price, A. L. (2020). Liability threshold modeling of case–control status and family history of disease increases association power. Nature genetics, 52(5), 541-547.
  • CATEGORY : GWAS using family history

SiblingGWAS

  • NAME : SiblingGWAS
  • FULL NAME : Within-sibship genome-wide association analyses
  • SHORT NAME : SiblingGWAS
  • DESCRIPTION : Scripts for running GWAS using siblings to estimate Within-Family (WF) and Between-Family (BF) effects of genetic variants on continuous traits. Allows the inclusion of more than two siblings from one family.
  • URL : https://github.com/LaurenceHowe/SiblingGWAS
  • CITATION : Howe, L. J., Nivard, M. G., Morris, T. T., Hansen, A. F., Rasheed, H., Cho, Y., ... & Davies, N. M. (2022). Within-sibship genome-wide association analyses decrease bias in estimates of direct genetic effects. Nature genetics, 54(5), 581-592.
  • CATEGORY : GWAS using family history
  • YEAR : 2022
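As a rough illustration of the within-family / between-family split mentioned in the description, the sketch below decomposes each sibling's genotype into the family mean (the between-family component) and the deviation from that mean (the within-family component). This is a simplified reading of the approach, not the SiblingGWAS scripts themselves. The function name `split_genotypes` and the toy data are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def split_genotypes(family_ids, genotypes):
    """Split genotypes into between-family (family mean) and
    within-family (deviation from the family mean) components."""
    by_family = defaultdict(list)
    for fid, g in zip(family_ids, genotypes):
        by_family[fid].append(g)
    family_mean = {fid: mean(gs) for fid, gs in by_family.items()}
    between = [family_mean[fid] for fid in family_ids]
    within = [g - family_mean[fid] for fid, g in zip(family_ids, genotypes)]
    return between, within


# Two sibships: one with two siblings, one with three (sibships of any size work).
family_ids = ["F1", "F1", "F2", "F2", "F2"]
genotypes = [0, 2, 1, 1, 1]
between, within = split_genotypes(family_ids, genotypes)
print(between, within)
```

In a regression of the phenotype on both components, the coefficient on the within-family term would be the WF estimate and the coefficient on the family mean the BF estimate. The regression step is omitted here.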

snipar

  • NAME : snipar
  • FULL NAME : single nucleotide imputation of parents
  • SHORT NAME : snipar
  • DESCRIPTION : snipar (single nucleotide imputation of parents) is a Python package for inferring identity-by-descent (IBD) segments shared between siblings, imputing missing parental genotypes, and for performing family based genome-wide association and polygenic score analyses using observed and/or imputed parental genotypes.
  • URL : https://github.com/AlexTISYoung/snipar
  • CITATION : Young, A. I., Nehzati, S. M., Benonisdottir, S., Okbay, A., Jayashankar, H., Lee, C., ... & Kong, A. (2022). Mendelian imputation of parental genotypes improves estimates of direct genetic effects. Nature genetics, 54(6), 897-905. Guan, J., Nehzati, S. M., Benjamin, D. J., & Young, A. I. (2022). Novel estimators for family-based genome-wide association studies increase power and robustness. bioRxiv, 2022-10.
  • CATEGORY : GWAS using family history
  • YEAR : 2022

Gene-based analysis (rare variant)

REGENIE

  • NAME : REGENIE
  • FULL NAME : REGENIE
  • SHORT NAME : REGENIE
  • DESCRIPTION : regenie is a C++ program for whole genome regression modelling of large genome-wide association studies. It is developed and supported by a team of scientists at the Regeneron Genetics Center.
  • URL : https://github.com/rgcgithub/regenie
  • CITATION : Mbatchou, Joelle, et al. "Computationally efficient whole-genome regression for quantitative and binary traits." Nature genetics 53.7 (2021): 1097-1103.
  • CATEGORY : Gene-based analysis (rare variant)
  • KEYWORDS : whole genome regression

SAIGE-GENE

  • NAME : SAIGE-GENE
  • FULL NAME : SAIGE-GENE
  • SHORT NAME : SAIGE-GENE
  • URL : https://github.com/weizhouUMICH/SAIGE
  • CITATION : Zhou, Wei, et al. "Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts." Nature genetics 52.6 (2020): 634-639.
  • CATEGORY : Gene-based analysis (rare variant)

SAIGE-GENE+

  • NAME : SAIGE-GENE+
  • FULL NAME : SAIGE-GENE+
  • SHORT NAME : SAIGE-GENE+
  • URL : https://github.com/weizhouUMICH/SAIGE
  • CITATION : Zhou, Wei, et al. "SAIGE-GENE+ improves the efficiency and accuracy of set-based rare variant association tests." Nature Genetics (2022): 1-4.
  • CATEGORY : Gene-based analysis (rare variant)

SKAT

  • NAME : SKAT
  • FULL NAME : sequence kernel association test
  • SHORT NAME : SKAT
  • DESCRIPTION : SKAT is a SNP-set (e.g., a gene or a region) level test for association between a set of rare (or common) variants and dichotomous or quantitative phenotypes. SKAT aggregates individual score test statistics of SNPs in a SNP set and efficiently computes SNP-set level p-values (e.g., a gene or a region level p-value) while adjusting for covariates, such as principal components to account for population stratification. SKAT also allows for power and sample size calculations when designing sequence association studies.
  • URL : https://www.hsph.harvard.edu/skat/
  • CITATION : Wu, Michael C., et al. "Rare-variant association testing for sequencing data with the sequence kernel association test." The American Journal of Human Genetics 89.1 (2011): 82-93.
  • CATEGORY : Gene-based analysis (rare variant)
  • KEYWORDS :
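The aggregation of per-variant score statistics described above can be sketched as follows. This is a simplified illustration rather than the SKAT package's implementation: it assumes an intercept-only null model (no covariates), takes the weights as given, and omits the p-value step, which requires the null distribution of the statistic. The function names `score_statistics` and `skat_q` are hypothetical.

```python
def score_statistics(genotypes, phenotype):
    """Per-variant score S_j = sum_i G_ij * (y_i - mean(y)),
    assuming an intercept-only null model."""
    n = len(phenotype)
    mu = sum(phenotype) / n
    residuals = [y - mu for y in phenotype]
    n_variants = len(genotypes[0])
    return [
        sum(genotypes[i][j] * residuals[i] for i in range(n))
        for j in range(n_variants)
    ]


def skat_q(scores, weights):
    """SKAT-style statistic Q = sum_j (w_j * S_j)^2."""
    return sum((w * s) ** 2 for w, s in zip(weights, scores))


# Rows are samples, columns are variants in the set (toy data).
genotypes = [[1, 0], [0, 1], [0, 0], [1, 1]]
phenotype = [1.0, 0.0, 0.0, 1.0]
scores = score_statistics(genotypes, phenotype)
q = skat_q(scores, weights=[1.0, 1.0])
print(scores, q)
```

Because each score is squared before summing, variants with opposite effect directions do not cancel each other out, which is the main contrast with a burden test.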

SKAT-O

  • NAME : SKAT-O
  • FULL NAME : sequence kernel association test - optimal test
  • SHORT NAME : SKAT-O
  • DESCRIPTION : SKAT-O estimates the correlation parameter in the kernel matrix so as to maximize power. This parameter corresponds to the weight in the linear combination of the burden test and SKAT test statistics.
  • URL : https://www.hsph.harvard.edu/skat/
  • CITATION : Lee, Seunggeun, Michael C. Wu, and Xihong Lin. "Optimal tests for rare variant effects in sequencing association studies." Biostatistics 13.4 (2012): 762-775.
  • CATEGORY : Gene-based analysis (rare variant)
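The linear combination of burden and SKAT statistics can be written as Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden. The sketch below evaluates this over a small grid of rho values. It is an illustration under stated assumptions: the per-variant scores and weights are taken as given, the grid is arbitrary, and the step that selects rho (the actual method works with p-values and corrects for the search over rho) is omitted. The name `skat_o_statistics` is hypothetical.

```python
def skat_o_statistics(scores, weights, rhos=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return {rho: Q_rho} with Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden."""
    weighted = [w * s for w, s in zip(weights, scores)]
    q_skat = sum(ws ** 2 for ws in weighted)   # sum of squared weighted scores
    q_burden = sum(weighted) ** 2              # square of summed weighted scores
    return {rho: (1 - rho) * q_skat + rho * q_burden for rho in rhos}


# Two variants with opposite effect directions (toy scores).
stats = skat_o_statistics(scores=[1.0, -1.0], weights=[1.0, 1.0])
print(stats)
```

With opposite-sign scores the burden component is zero while the SKAT component is not, so the combined statistic shrinks as rho moves toward 1. This is the situation in which a small rho would be expected to be favoured.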

STAAR

  • NAME : STAAR
  • FULL NAME : variant-set test for association using annotation information
  • SHORT NAME : STAAR
  • DESCRIPTION : STAAR is an R package for performing variant-Set Test for Association using Annotation infoRmation (STAAR) procedure in whole-genome sequencing (WGS) studies. STAAR is a general framework that incorporates both qualitative functional categories and quantitative complementary functional annotations using an omnibus multi-dimensional weighting scheme. STAAR accounts for population structure and relatedness, and is scalable for analyzing large WGS studies of continuous and dichotomous traits.
  • URL : https://github.com/xihaoli/STAAR
  • CITATION : Li, Xihao, et al. "Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale." Nature genetics 52.9 (2020): 969-983.
  • CATEGORY : Gene-based analysis (rare variant)
  • KEYWORDS : functional annotations

PGS-adjusted GWAS

PGS-adjusted GWAS

  • NAME : PGS-adjusted GWAS
  • FULL NAME : PGS-adjusted GWAS
  • SHORT NAME : PGS-adjusted GWAS
  • DESCRIPTION : Adjusting GWAS analyses for polygenic scores (PGSs) increases the statistical power for discovery across all ancestries.
  • CITATION : Campos, A. I., Namba, S., Lin, S. C., Nam, K., Sidorenko, J., Wang, H., ... & Yengo, L. (2023). Boosting the power of genome-wide association studies within and across ancestries by using polygenic scores. Nature Genetics, 1-8.
  • CATEGORY : PGS-adjusted GWAS
  • KEYWORDS : LOCO-PGSs, two-stage meta-analysis strategy
  • YEAR : 2023

PGS-adjusted RVATs

  • NAME : PGS-adjusted RVATs
  • FULL NAME : PGS-adjusted rare variant association tests
  • SHORT NAME : PGS-adjusted RVATs
  • DESCRIPTION : Adjusting for common variant polygenic scores improves yield in gene-based rare variant association tests.
  • CITATION : Jurgens, S. J., Pirruccello, J. P., Choi, S. H., Morrill, V. N., Chaffin, M., Lubitz, S. A., ... & Ellinor, P. T. (2023). Adjusting for common variant polygenic scores improves yield in rare variant association analyses. Nature Genetics, 55(4), 544-548.
  • CATEGORY : PGS-adjusted GWAS
  • KEYWORDS : PGS, Rare variants
  • YEAR : 2023

Review

Review-Povysil

  • NAME : Review-Povysil
  • CITATION : Povysil, Gundula, et al. "Rare-variant collapsing analyses for complex traits: guidelines and applications." Nature Reviews Genetics 20.12 (2019): 747-759.
  • CATEGORY : Review

Single variant association tests

BOLT-LMM

  • NAME : BOLT-LMM
  • FULL NAME : BOLT-LMM
  • SHORT NAME : BOLT-LMM
  • DESCRIPTION : The BOLT-LMM software package currently consists of two main algorithms, the BOLT-LMM algorithm for mixed model association testing, and the BOLT-REML algorithm for variance components analysis (i.e., partitioning of SNP-heritability and estimation of genetic correlations).
  • URL : https://alkesgroup.broadinstitute.org/BOLT-LMM/BOLT-LMM_manual.html
  • CITATION : Loh, Po-Ru, et al. "Efficient Bayesian mixed-model analysis increases association power in large cohorts." Nature genetics 47.3 (2015): 284-290.
  • CATEGORY : Single variant association tests
  • KEYWORDS : non-infinitesimal model, mixture of two Gaussian distributions

EMMAX

  • NAME : EMMAX
  • FULL NAME : efficient mixed-model association eXpedited
  • SHORT NAME : EMMAX
  • DESCRIPTION : EMMAX is a statistical test for large-scale human or model organism association mapping that accounts for sample structure. In addition to the computational efficiency obtained by the EMMA algorithm, EMMAX takes advantage of the fact that each locus explains only a small fraction of the variance of a complex trait, which makes it possible to avoid repeating the variance component estimation for every marker and substantially reduces the computational time of mixed-model association mapping.
  • URL : https://genome.sph.umich.edu/wiki/EMMAX
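  • CITATION : Kang, Hyun Min, et al. "Variance component model to account for sample structure in genome-wide association studies." Nature genetics 42.4 (2010): 348-354.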
  • CATEGORY : Single variant association tests

GEMMA

  • NAME : GEMMA
  • FULL NAME : genome-wide efficient mixed-model association
  • SHORT NAME : GEMMA
  • DESCRIPTION : GEMMA is the software implementing the Genome-wide Efficient Mixed Model Association algorithm for a standard linear mixed model and some of its close relatives for genome-wide association studies (GWAS). It fits a standard linear mixed model (LMM) to account for population stratification and sample structure for single marker association tests. It fits a Bayesian sparse linear mixed model (BSLMM) using Markov chain Monte Carlo (MCMC) for estimating the proportion of variance in phenotypes explained (PVE) by typed genotypes (i.e. chip heritability), predicting phenotypes, and identifying associated markers by jointly modeling all markers while controlling for population structure. It is computationally efficient for large scale GWAS and uses freely available open-source numerical libraries.
  • URL : http://stephenslab.uchicago.edu/software.html#gemma
  • CITATION : Zhou, Xiang, and Matthew Stephens. "Genome-wide efficient mixed-model analysis for association studies." Nature genetics 44.7 (2012): 821-824.
  • CATEGORY : Single variant association tests

PLINK

  • NAME : PLINK
  • FULL NAME : PLINK
  • SHORT NAME : PLINK
  • DESCRIPTION : A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype or CNV calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.
  • URL : https://www.cog-genomics.org/plink/
  • CITATION : Purcell, Shaun, et al. "PLINK: a tool set for whole-genome association and population-based linkage analyses." The American journal of human genetics 81.3 (2007): 559-575.
  • CATEGORY : Single variant association tests

PLINK2

  • NAME : PLINK2
  • FULL NAME : PLINK2
  • SHORT NAME : PLINK2
  • URL : https://www.cog-genomics.org/plink/2.0/
  • CITATION : Chang, Christopher C., et al. "Second-generation PLINK: rising to the challenge of larger and richer datasets." Gigascience 4.1 (2015): s13742-015.
  • CATEGORY : Single variant association tests

POLMM

  • NAME : POLMM
  • FULL NAME : proportional odds logistic mixed model (POLMM)
  • SHORT NAME : POLMM
  • DESCRIPTION : Proportional Odds Logistic Mixed Model (POLMM) for ordinal categorical data analysis
  • URL : https://github.com/WenjianBI/POLMM
  • CITATION : Bi, W., Zhou, W., Dey, R., Mukherjee, B., Sampson, J. N., & Lee, S. (2021). Efficient mixed model approach for large-scale genome-wide association studies of ordinal categorical phenotypes. The American Journal of Human Genetics, 108(5), 825-839.
  • CATEGORY : Single variant association tests
  • KEYWORDS : ordinal categorical phenotypes

REGENIE

  • NAME : REGENIE
  • FULL NAME : REGENIE
  • SHORT NAME : REGENIE
  • DESCRIPTION : regenie is a C++ program for whole genome regression modelling of large genome-wide association studies. It is developed and supported by a team of scientists at the Regeneron Genetics Center.
  • URL : https://github.com/rgcgithub/regenie
  • CITATION : Mbatchou, Joelle, et al. "Computationally efficient whole-genome regression for quantitative and binary traits." Nature genetics 53.7 (2021): 1097-1103.
  • CATEGORY : Single variant association tests
  • KEYWORDS : whole genome regression

SAIGE

  • NAME : SAIGE
  • FULL NAME : Scalable and Accurate Implementation of GEneralized mixed model
  • SHORT NAME : SAIGE
  • DESCRIPTION : SAIGE is an R package with a Scalable and Accurate Implementation of Generalized mixed model (Chen, H. et al., 2016). It accounts for sample relatedness and is feasible for genetic association tests in large cohorts and biobanks (N > 400,000). SAIGE performs single-variant association tests for binary traits and quantitative traits. For binary traits, SAIGE uses the saddlepoint approximation (SPA) (Imhof, J. P., 1961; Kuonen, D., 1999; Dey, R. et al., 2017) to account for case-control imbalance.
  • URL : https://github.com/weizhouUMICH/SAIGE
  • CITATION : Zhou, Wei, et al. "Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies." Nature genetics 50.9 (2018): 1335-1341.
  • CATEGORY : Single variant association tests
  • KEYWORDS : case-control imbalance, saddlepoint approximation (SPA)

fastGWA

  • NAME : fastGWA
  • FULL NAME : fastGWA
  • SHORT NAME : fastGWA
  • URL : https://yanglab.westlake.edu.cn/software/gcta/#fastGWA
  • CITATION : Jiang, Longda, et al. "A resource-efficient tool for mixed model association analysis of large-scale data." Nature genetics 51.12 (2019): 1749-1755.
  • CATEGORY : Single variant association tests
  • KEYWORDS : grid-search-based REML algorithm

fastGWA-GLMM

  • NAME : fastGWA-GLMM
  • FULL NAME : fastGWA-GLMM
  • SHORT NAME : fastGWA-GLMM
  • URL : https://yanglab.westlake.edu.cn/software/gcta/#fastGWA
  • CITATION : Jiang, Longda, et al. "A generalized linear mixed model association tool for biobank-scale data." Nature genetics 53.11 (2021): 1616-1621.
  • CATEGORY : Single variant association tests