Show simple item record

dc.contributor.authorHu, Fengjiao
dc.date.accessioned2016-08-10T01:32:37Z
dc.date.available2016-08-10T01:32:37Z
dc.date.issued2016-07-07
dc.identifier.urihttp://hdl.handle.net/10675.2/618126
dc.description.abstractResearchers in genomics are increasingly interested in epigenetic factors such as DNA methylation because they play an important role in regulating gene expression without changes in the sequence of DNA. Abnormal DNA methylation is associated with many human diseases, including various types of cancer. We propose three different approaches to test for differentially methylated regions (DMRs) associated with complex traits, while accounting for correlations within and among CpG sites in the DMRs. One approach is a nonparametric method using a kernel distance statistic and the second one is a likelihood-based method using a binomial spatial scan statistic. Both of these approaches detect differential methylation regions between cases and controls along the genome. The kernel distance method uses the kernel function, while the binomial scan statistic approach uses a mixed effect model to incorporate correlations among CpG sites. Extensive simulations show that both approaches have excellent control of type I error, and both have reasonable statistical power. The binomial scan statistic approach appears to have higher power, while the kernel distance method is computationally faster. We also propose a third method under the Bayesian framework for comparing methylation rates when disease status is classified into ordinal multinomial categories (e.g., stages of cancer). The DMRs are detected using moving windows along the genome. Within each window, the Bayes factor is calculated to compare the two models corresponding to constant vs. monotonic methylation rates among the groups. As in the case of the scan statistic approach, the correlations between the sites are incorporated using a mixed effect model. Results from extensive simulation indicate that the Bayesian method is statistically valid and reasonably powerful to detect DMRs associated with disease severity. The proposed methods are demonstrated using data from a chronic lymphocytic leukemia (CLL) study.
dc.language.isoen_USen
dc.subjectDMRsen
dc.subjectbionomial scan statisticen
dc.subjectBayes factoren
dc.subjectdistance statisticen
dc.titleStatistical Methods to Detect Deferentially Methyleated Regions with Next-Generation Sequencing Dataen_US
dc.typeDissertationen
dc.description.advisorGeorge, Vargheseen
dc.description.degreeDoctor of Philosophy with a Major in Biostatisticsen
refterms.dateFOA2020-02-08T00:00:00Z
html.description.abstractResearchers in genomics are increasingly interested in epigenetic factors such as DNA methylation because they play an important role in regulating gene expression without changes in the sequence of DNA. Abnormal DNA methylation is associated with many human diseases, including various types of cancer. We propose three different approaches to test for differentially methylated regions (DMRs) associated with complex traits, while accounting for correlations within and among CpG sites in the DMRs. One approach is a nonparametric method using a kernel distance statistic and the second one is a likelihood-based method using a binomial spatial scan statistic. Both of these approaches detect differential methylation regions between cases and controls along the genome. The kernel distance method uses the kernel function, while the binomial scan statistic approach uses a mixed effect model to incorporate correlations among CpG sites. Extensive simulations show that both approaches have excellent control of type I error, and both have reasonable statistical power. The binomial scan statistic approach appears to have higher power, while the kernel distance method is computationally faster. We also propose a third method under the Bayesian framework for comparing methylation rates when disease status is classified into ordinal multinomial categories (e.g., stages of cancer). The DMRs are detected using moving windows along the genome. Within each window, the Bayes factor is calculated to compare the two models corresponding to constant vs. monotonic methylation rates among the groups. As in the case of the scan statistic approach, the correlations between the sites are incorporated using a mixed effect model. Results from extensive simulation indicate that the Bayesian method is statistically valid and reasonably powerful to detect DMRs associated with disease severity. The proposed methods are demonstrated using data from a chronic lymphocytic leukemia (CLL) study.


Files in this item

Thumbnail
Name:
Hu_Fengjiao_PhD_2016.pdf
Size:
797.5Kb
Format:
PDF

This item appears in the following Collection(s)

Show simple item record