Genome-Wide Association Studies

Over the past several years, newly developed high-density single nucleotide polymorphism (SNP) microarray technology has brought within reach the promise of genome-wide association studies (GWAS), which aim to identify genomic mutations associated with a wide range of diseases. It is believed that complex diseases, e.g., diabetes, hypertension, Alzheimer’s disease and age-related macular degeneration, are caused by the interaction of multiple genes and environmental factors.

The number of mathematical operations required to assess the association between multiple interacting genomic loci and disease grows exponentially with the number of interacting SNPs, because every possible combination of loci has to be tested. Simple arithmetic shows that, even with the most powerful supercomputers available today, it is computationally infeasible to perform a comprehensive test of association for four or more interacting SNPs when analyzing data sets of several hundred individuals and several hundred thousand SNPs.
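
The arithmetic behind this claim can be sketched in a few lines of Python. The number of distinct sets of k SNPs that can be drawn from n SNPs is the binomial coefficient C(n, k), which is roughly n^k / k! when n is much larger than k. The sketch below counts those sets and converts the count into a rough running time. All names and numeric values in it are illustrative assumptions, not figures from any particular study or machine: 500,000 SNPs stands in for "several hundred thousand", 1,000 operations is a guessed cost for one association test, and 1e15 operations per second is a guessed sustained throughput.

```python
from math import comb

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def count_snp_combinations(num_snps: int, order: int) -> int:
    """Number of distinct sets of `order` SNPs drawn from `num_snps` SNPs."""
    return comb(num_snps, order)


def estimated_years(num_snps: int, order: int,
                    ops_per_test: float, ops_per_second: float) -> float:
    """Rough wall-clock years needed to test every SNP set of the given order."""
    total_ops = count_snp_combinations(num_snps, order) * ops_per_test
    return total_ops / ops_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    NUM_SNPS = 500_000       # assumption: "several hundred thousand" SNPs
    OPS_PER_TEST = 1_000.0   # assumption: cost of one association test
    OPS_PER_SECOND = 1e15    # assumption: sustained machine throughput

    # Print the number of SNP sets and the estimated time for orders 1 to 4.
    for order in range(1, 5):
        combos = count_snp_combinations(NUM_SNPS, order)
        years = estimated_years(NUM_SNPS, order, OPS_PER_TEST, OPS_PER_SECOND)
        print(f"order {order}: {combos:.3e} SNP sets, about {years:.3e} years")
```

Whatever values are assumed for the test cost and machine speed, the growth factor between orders does not depend on them: moving from three to four interacting SNPs multiplies the number of sets by (n - 3) / 4, which is about 125,000 for n = 500,000.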

As a result, a variety of statistical, computational, heuristic and knowledge-based approaches (often combined) must be developed to compensate for the inability to perform all potentially desirable computations. This area is becoming increasingly important as it becomes apparent that the main challenge in identifying key factors involved in various biological processes is not the limited amount of genomic data, but the ability to mine, integrate and analyze vast amounts of information generated by high-throughput technologies.