Poster Presentation Australian Epigenetics Alliance Conference 2022

Improving analyses of summary statistics from association studies by detecting data heterogeneity and errors (#121)

Wenhan Chen 1
  1. Garvan Institute of Medical Research, Darlinghurst, NSW, Australia

Summary data from association studies, such as genome-wide association studies (GWAS) or expression quantitative trait locus (eQTL) studies, have facilitated the development of various summary-data-based methods. However, analyses using these methods can be biased by low data quality and, when multiple data sets are combined, by heterogeneity between them. Here we propose a quality control (QC) method that leverages linkage disequilibrium (LD) among SNPs to detect and eliminate errors in summary data and heterogeneity between data sets, thereby improving summary-data-based analyses. We show by simulation that our method substantially reduces the false positive rate in detecting secondary association signals in summary-data-based conditional and joint association (COJO) analysis, enabling the application of COJO to rare variants. We further show that our method, used as a QC step, can improve heritability estimation and the discovery of putative causal genes. The method is implemented in a freely available software tool, DENTIST.
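
To illustrate the general idea of using linkage disequilibrium (LD) to check summary statistics, the following is a minimal sketch and not the DENTIST implementation. It assumes that, for error-free data, the z-scores of nearby SNPs are approximately multivariate normal with correlation equal to the LD matrix, so each SNP's z-score can be predicted from the others and compared with its observed value. The function name ld_consistency_test, the ridge parameter, and the toy LD matrix are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def ld_consistency_test(z, R, ridge=0.0):
    """Leave-one-out check of z-scores against an LD correlation matrix.

    For each SNP i, predict z[i] from all other SNPs via the conditional
    expectation of a multivariate normal, then form a 1-df chi-square
    statistic from the squared, variance-scaled prediction error.
    This is a generic illustration, not the published DENTIST algorithm.
    """
    z = np.asarray(z, dtype=float)
    R = np.asarray(R, dtype=float)
    m = z.shape[0]
    stats = np.empty(m)
    pvals = np.empty(m)
    for i in range(m):
        others = np.delete(np.arange(m), i)
        # LD among the predictor SNPs; an optional ridge term (assumption)
        # can stabilise the solve when the LD matrix is ill-conditioned.
        R_ss = R[np.ix_(others, others)] + ridge * np.eye(m - 1)
        r_is = R[i, others]
        w = np.linalg.solve(R_ss, r_is)       # weights R_ss^{-1} r_si
        pred = w @ z[others]                  # predicted z-score for SNP i
        var = max(1.0 - w @ r_is, 1e-12)      # conditional variance
        stats[i] = (z[i] - pred) ** 2 / var
        # Upper-tail p-value of a chi-square with 1 degree of freedom.
        pvals[i] = math.erfc(math.sqrt(stats[i] / 2.0))
    return stats, pvals


# Toy example: 10 SNPs whose LD decays with distance (made-up values).
idx = np.arange(10)
R = 0.9 ** np.abs(idx[:, None] - idx[None, :])
z = 5.0 * R[:, 2]      # noiseless z-scores implied by one signal at SNP 2
z[7] = -z[7]           # simulate an allele-coding error at SNP 7
stats, pvals = ld_consistency_test(z, R)
print(int(np.argmax(stats)))  # index of the SNP least consistent with LD
```

In this toy setting the sign-flipped SNP produces the largest statistic, and its immediate neighbours are also inflated because the erroneous value enters their predictions. A practical procedure would therefore need to iterate or otherwise guard against errors among the predictor SNPs, and would need an LD reference that matches the study population.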