Bayesian Functional Clustering and VMR Identification in Methylation Microarray Data

Hdl Handle:
http://hdl.handle.net/10675.2/581419
Title:
Bayesian Functional Clustering and VMR Identification in Methylation Microarray Data
Authors:
Campbell, Jeff
Abstract:
Considerable time, energy, and money have been invested over the years in studying the relation between DNA and health and disease. As scientific knowledge has accumulated, it has become clear that this relation is not just a function of the sequence of nucleotide bases, but also depends on permanent modifications of DNA that affect DNA transcription and thus have a macroscopic effect on an individual. The study of modifications to DNA is known as epigenetics. Epigenetic changes have been shown to play a role in certain diseases, including cancer (Novak 2004). Finding locations of differential methylation between two groups of cells is an ongoing area of research in both science and bioinformatics. The number of statistical methods developed for establishing differential DNA methylation between two groups is limited (Bock 2012). Many existing methods were developed for next-generation sequencing data and may not work for microarray data, and vice versa. Bisulfite sequencing, the next-generation sequencing technique for obtaining methylation data, often comes with limited sample sizes, and considerations must be made for low and variable coverage and for smoothing the methylation values. The analysis of next-generation sequencing data also involves small sample sizes. In addition, these methods can be sensitive to how individual CpG regions are grouped together as a region for analysis. If the differentially methylated regions (DMRs) are small relative to the sizes of the established regions, then a method may not detect a region as having differential methylation. Robust methods for clustering microarray data have also been an ongoing area of research. A method that could be applied to microarray data could increase the sample size and mitigate the previous problems, provided it is robust to missing values, outliers, and microarray data noise. Functional clustering has been shown to be effective when properly conducted on gene expression data. It can be used when the data have temporal measurements to identify genes that are possibly co-expressed. Clustering of methylation data has also been shown to identify epigenetic subgroups that can potentially be very useful (Wang 2011). [introduction]
Affiliation:
Department of Biostatistics and Epidemiology
Issue Date:
Jul-2015
URI:
http://hdl.handle.net/10675.2/581419
Type:
Dissertation
Appears in Collections:
Theses and Dissertations

Full metadata record

DC Field | Value | Language
dc.contributor.author | Campbell, Jeff | en
dc.date.accessioned | 2015-10-29T13:28:24Z | en
dc.date.available | 2015-10-29T13:28:24Z | en
dc.date.issued | 2015-07 | en
dc.identifier.uri | http://hdl.handle.net/10675.2/581419 | en
dc.rights | Copyright protected. Unauthorized reproduction or use beyond the exceptions granted by the Fair Use clause of U.S. Copyright law may violate federal law. | en
dc.subject | Cluster Analysis | en
dc.subject | DNA Methylation | en
dc.subject | Epigenesis, Genetic | en
dc.subject | Computational Biology | en
dc.title | Bayesian Functional Clustering and VMR Identification in Methylation Microarray Data | en
dc.type | Dissertation | en
dc.contributor.department | Department of Biostatistics and Epidemiology | en
dc.description.advisor | Varghese, George | en
dc.description.committee | Ryu, Duchwan; Xu, Hongyan; Kim, Jaejik; Shi, Huidong | en
dc.description.degree | Doctor of Philosophy with a Major in Biostatistics | en