A survey of statistical methods for gene-gene interaction in case-control genome-wide association studies


  • Mathieu Emily


Over the last few years, case-control genome-wide association studies (GWAS) have proven to be a successful tool for identifying genomic regions associated with complex diseases. Nevertheless, current GWAS still rely heavily on a single-marker strategy, in which each biological marker (or SNP, for single nucleotide polymorphism) is tested individually for association with the disease. It is widely acknowledged, however, that this approach oversimplifies the complexity of the underlying biological mechanisms and that gene-gene interaction must be considered. Unfortunately, detecting gene-gene interaction raises complex statistical challenges, arising from the high dimensionality and the complex architecture of the data as well as the size of the space of interaction models. The purpose of this survey is to provide a critical overview of the numerous statistical methods proposed to detect gene-gene interaction in GWAS. These methods have been developed to detect interaction at various scales of the data, and we organize our survey into three main classes: SNP-SNP interaction methods, gene-gene interaction methods and large-scale methods. For each class of methods, we identify relative strengths and weaknesses in terms of statistical power and provide perspectives on the future of statistical strategies in gene-gene interaction analysis.