• ParaDIME: Genome-wide differential DNA methylation analyses using next generation sequencing

      Pabla, Sarabjot; Institute of Molecular Medicine and Genetics (12/27/2016)
      Epigenetic modifications are key players in the regulation of many cellular and physiological processes. DNA methylation is one of the most widely studied epigenetic modifications. Abnormalities in genomic DNA methylation have been implicated in various complex diseases, including cancer and autoimmunity. With the advent of next-generation sequencing, investigating DNA methylation patterns at genome-wide scale has become increasingly feasible. However, the development of appropriate statistical methods for analyzing large-scale DNA methylation data has lagged behind, owing to both statistical and computational challenges faced by current methods. To overcome these shortcomings, we developed ParaDIME, a web application for differential DNA methylation analysis. ParaDIME tests CpG dinucleotide sites or pre-defined regions of CpG sites for differential DNA methylation using the Rao-Scott chi-squared test. ParaDIME not only uses a nonparametric test that accounts for differential sequencing coverage but also uses permutation testing to compute exact p values. To overcome the computational challenge posed by the large number of permutations, we use parallel computing to share the workload and significantly decrease execution time. To test ParaDIME in silico, we first simulated bisulfite sequencing data and compared ParaDIME against the two most widely used methods, MethylSig and MethylKit. ParaDIME performed as well as or better than both methods at accurately detecting differentially methylated regions. In particular, at small but important differences in percent methylation, ParaDIME performed better than existing tools. To test ParaDIME’s ability to detect biologically relevant differentially methylated regions (DMRs), we then applied it to publicly available methylation data from chronic lymphocytic leukemia (CLL) patients.
Our method detected previously known and experimentally verified DMRs in CLL, notably DMRs located in the Nfatc1 and FOXA2 genes. It also detected other DMRs in genes involved in cancer-related pathways. Given ParaDIME’s ability to detect biologically relevant DMRs, we employed it in an integrative analysis to identify epigenetically regulated genes in a mouse model of Sjogren’s syndrome, B6.NOD aec1/aec2. We performed reduced representation bisulfite sequencing and RNA sequencing on salivary glands of four- and eighteen-week-old B6.NOD aec1/aec2 mice and of age- and gender-matched C57BL/6 mice. After removing age and mouse-model effects, we discovered 89 genes that were both differentially expressed and differentially methylated. Spearman rank-order correlation analysis found a significant correlation between DNA methylation and gene expression. The autoimmunity-related genes Klf9 and Nfkbid showed significant negative correlation, whereas other genes such as Fgf12 and Coll11a2 showed significant positive correlation. Subnetwork enrichment using MATISSE identified three jointly active connected subnetworks that were highly enriched in immune system-related pathways, especially T cell and B cell activation, along with cytokine signaling and endocrine system development. This report presents a novel and robust differential DNA methylation analysis method that detects disease-relevant DMRs with high accuracy. ParaDIME is a user-friendly and scalable web application with an appropriate test statistic for analyzing large-scale DNA methylation studies.
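
The idea of splitting a permutation test across parallel workers, as described in the abstract, can be illustrated with a minimal sketch. This is not ParaDIME's code. It assumes a single CpG site, per-sample (methylated reads, total reads) counts with nonzero coverage, and two non-empty groups. For simplicity it swaps in a coverage-weighted difference in methylation fraction as the test statistic rather than the Rao-Scott chi-squared statistic, and it estimates the p value from random permutations rather than enumerating all of them. The names `pooled_diff`, `count_extreme`, and `permutation_p_value` are hypothetical.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def pooled_diff(samples, labels):
    """Coverage-weighted methylation fraction of group 1 minus group 0.

    samples: list of (methylated_reads, total_reads); labels: list of 0/1.
    Stand-in statistic for illustration, not the Rao-Scott statistic.
    """
    meth = [0, 0]
    cov = [0, 0]
    for (m, n), g in zip(samples, labels):
        meth[g] += m
        cov[g] += n
    return meth[1] / cov[1] - meth[0] / cov[0]


def count_extreme(task):
    """Worker: count permutations at least as extreme as the observed value."""
    samples, labels, observed, n_perm, seed = task
    rng = random.Random(seed)  # independent, reproducible stream per worker
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group labels across samples
        if abs(pooled_diff(samples, shuffled)) >= abs(observed) - 1e-12:
            hits += 1
    return hits


def permutation_p_value(samples, labels, n_perm=10000, workers=4, seed=0):
    """Two-sided Monte Carlo permutation p value, split across workers."""
    observed = pooled_diff(samples, labels)
    # Divide the permutations as evenly as possible among the workers.
    sizes = [n_perm // workers + (1 if i < n_perm % workers else 0)
             for i in range(workers)]
    tasks = [(samples, labels, observed, size, seed + i)
             for i, size in enumerate(sizes)]
    if workers == 1:
        hits = count_extreme(tasks[0])  # serial path, no process pool
    else:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            hits = sum(pool.map(count_extreme, tasks))
    # The +1 terms count the observed labeling itself, so p is never zero.
    return (hits + 1) / (n_perm + 1)
```

Because each worker only returns a count, the partial results combine by simple addition, which is what makes permutation testing straightforward to parallelize. On platforms that start worker processes by spawning, a call with `workers` greater than 1 should sit under an `if __name__ == "__main__":` guard.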