Hdl Handle:
http://hdl.handle.net/10675.2/91
Title:
Ranking analysis of F-statistics for microarray data.
Authors:
Tan, Yuan-De; Fornage, Myriam; Xu, Hongyan
Abstract:
BACKGROUND: Microarray technology provides an efficient means for globally exploring physiological processes governed by the coordinated expression of multiple genes. However, identification of genes differentially expressed in microarray experiments is challenging because of the potentially high type I error rate. Methods for large-scale statistical analyses have been developed, but most of them are applicable to two-sample or two-condition data. RESULTS: We developed a large-scale multiple-group F-test-based method, named ranking analysis of F-statistics (RAF), which is an extension of ranking analysis of microarray data (RAM) for the two-sample t-test. In this method, we proposed a novel random splitting approach to generate the null distribution instead of using permutation, which may not be appropriate for microarray data. We also implemented a two-simulation strategy to estimate the false discovery rate. Simulation results suggested that it has higher efficiency in finding differentially expressed genes among multiple classes at a lower false discovery rate than some commonly used methods. By applying our method to the experimental data, we found 107 genes with significantly differential expression among 4 treatments at <0.7% FDR, of which 31 are expressed sequence tags (ESTs) and 76 are unique genes that have known functions in the brain or central nervous system and belong to six major functional groups. CONCLUSION: Our method is suitable for identifying differentially expressed genes among multiple groups, in particular when the sample size is small.
Citation:
BMC Bioinformatics. 2008 Mar 6; 9:142
Issue Date:
15-Apr-2008
URI:
http://hdl.handle.net/10675.2/91
DOI:
10.1186/1471-2105-9-142
PubMed ID:
18325100
PubMed Central ID:
PMC2323973
Type:
Journal Article; Research Support, N.I.H., Extramural
ISSN:
1471-2105
Appears in Collections:
Department of Biostatistics and Epidemiology: Faculty Research and Publications
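
Illustrative sketch (not part of the original record): the abstract describes ranking genes by a multiple-group F-statistic. The Python below shows only that first step, a gene-wise one-way ANOVA F-statistic followed by ranking. It is not the authors' RAF procedure: the random splitting null distribution and the two-simulation FDR estimate are not specified in this record and are not reproduced here. The function names `f_statistic` and `rank_genes` and the toy expression values are hypothetical.

```python
from statistics import fmean


def f_statistic(groups):
    """One-way ANOVA F-statistic for a single gene.

    `groups` is a list of lists; each inner list holds the expression
    values of that gene in one treatment group.
    Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


def rank_genes(data):
    """Return gene names sorted by decreasing F, plus the F-scores.

    `data` maps a gene name to its list of per-group value lists.
    """
    scores = {gene: f_statistic(groups) for gene, groups in data.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Toy data (hypothetical): 2 genes, 4 treatments, 2 replicates each.
toy = {
    "gene_a": [[1.0, 1.2], [1.1, 0.9], [1.0, 1.1], [0.9, 1.0]],
    "gene_b": [[1.0, 1.2], [2.0, 2.2], [3.0, 3.2], [4.0, 4.2]],
}
ranking, scores = rank_genes(toy)
print(ranking)
```

With only two replicates per group, the within-group variance estimate is unstable, which is the small-sample setting the abstract says the method targets; a plain F-ranking like this one gives no control over the false discovery rate on its own.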

Full metadata record

DC Field | Value | Language
dc.contributor.author | Tan, Yuan-De | en_US
dc.contributor.author | Fornage, Myriam | en_US
dc.contributor.author | Xu, Hongyan | en_US
dc.date.accessioned | 2010-09-24T21:59:00Z | en
dc.date.available | 2010-09-24T21:59:00Z | en
dc.date.issued | 2008-04-15 | en_US
dc.identifier.citation | BMC Bioinformatics. 2008 Mar 6; 9:142 | en_US
dc.identifier.issn | 1471-2105 | en_US
dc.identifier.pmid | 18325100 | en_US
dc.identifier.doi | 10.1186/1471-2105-9-142 | en_US
dc.identifier.uri | http://hdl.handle.net/10675.2/91 | en
dc.description.abstract | BACKGROUND: Microarray technology provides an efficient means for globally exploring physiological processes governed by the coordinated expression of multiple genes. However, identification of genes differentially expressed in microarray experiments is challenging because of the potentially high type I error rate. Methods for large-scale statistical analyses have been developed, but most of them are applicable to two-sample or two-condition data. RESULTS: We developed a large-scale multiple-group F-test-based method, named ranking analysis of F-statistics (RAF), which is an extension of ranking analysis of microarray data (RAM) for the two-sample t-test. In this method, we proposed a novel random splitting approach to generate the null distribution instead of using permutation, which may not be appropriate for microarray data. We also implemented a two-simulation strategy to estimate the false discovery rate. Simulation results suggested that it has higher efficiency in finding differentially expressed genes among multiple classes at a lower false discovery rate than some commonly used methods. By applying our method to the experimental data, we found 107 genes with significantly differential expression among 4 treatments at <0.7% FDR, of which 31 are expressed sequence tags (ESTs) and 76 are unique genes that have known functions in the brain or central nervous system and belong to six major functional groups. CONCLUSION: Our method is suitable for identifying differentially expressed genes among multiple groups, in particular when the sample size is small. | en_US
dc.rights | The PMC Open Access Subset is a relatively small part of the total collection of articles in PMC. Articles in the PMC Open Access Subset are still protected by copyright, but are made available under a Creative Commons or similar license that generally allows more liberal redistribution and reuse than a traditional copyrighted work. Please refer to the license statement in each article for specific terms of use. The license terms are not identical for all articles in this subset. | en_US
dc.subject.mesh | Algorithms | en_US
dc.subject.mesh | Computer Simulation | en_US
dc.subject.mesh | Data Interpretation, Statistical | en_US
dc.subject.mesh | Gene Expression Profiling / methods | en_US
dc.subject.mesh | Models, Genetic | en_US
dc.subject.mesh | Models, Statistical | en_US
dc.subject.mesh | Multigene Family / physiology | en_US
dc.subject.mesh | Oligonucleotide Array Sequence Analysis / methods | en_US
dc.subject.mesh | Proteome / metabolism | en_US
dc.title | Ranking analysis of F-statistics for microarray data. | en_US
dc.type | Journal Article | en_US
dc.type | Research Support, N.I.H., Extramural | en_US
dc.identifier.pmcid | PMC2323973 | en_US
dc.contributor.corporatename | Department of Biostatistics and Epidemiology | en_US