Poster Title:  Identification of Genetic Interactions in Multi-Phenotype Studies
Poster Abstract: 

As large-scale genome-wide association studies (GWAS) and meta-analyses of multiple phenotypes become increasingly common, there is a growing need for models and computationally efficient algorithms for the joint analysis of multi-SNP and multi-phenotype data. In genomics, however, genotype data can cover up to one million individuals and millions of genetic variants (as features), which makes computing epistasis extremely expensive. For example, testing all possible two-way combinations of two million SNPs (as in a typical post-quality-control imputed dataset that used HapMap Phase II or 1000 Genomes reference samples for imputation) yields approximately 2 × 10^12 SNP pairs.
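The pair count quoted above follows from the binomial coefficient n(n - 1)/2. The short Python sketch below reproduces that arithmetic; the throughput figure of one million tests per second is a purely hypothetical assumption used for illustration, not a measurement from this work.

```python
from math import comb


def n_snp_pairs(n_snps: int) -> int:
    """Return the number of unordered SNP pairs, n * (n - 1) / 2."""
    return comb(n_snps, 2)


# Two million SNPs, as in the example above.
n_pairs = n_snp_pairs(2_000_000)
print(f"{n_pairs:.3e}")  # prints 2.000e+12

# Hypothetical throughput (an assumption, not a benchmark):
# one million pairwise tests per second on a single device.
tests_per_second = 1_000_000
days = n_pairs / tests_per_second / 86_400  # 86,400 seconds per day
print(f"{days:.1f} days")  # prints 23.1 days
```

Under that assumed rate, a single exhaustive pairwise scan for one phenotype would take roughly three weeks, and the cost grows linearly with the number of phenotypes analysed.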

In the past decade, several large-scale bioinformatics projects have benefited from parallelism techniques on HPC infrastructures such as clusters, grids, graphics processing units (GPUs), and clouds. We have developed GPU-based statistical tools and provide evidence that HPC has emerged as a viable and interesting solution for biological big data. Meanwhile, because GWAS have identified several disease-associated polymorphisms and a growing number of phenotypes covering more information are available, we focus on studying epistasis in multi-trait studies and hope to link functional interactions among multiple traits, diseases, and genetic factors (pairs of SNPs).




Poster ID:  D-6
Poster File:  PDF document HPC2018_D-6.pdf
Poster Image: 
Poster URL: