    A modified hyperplane clustering algorithm allows for efficient and accurate clustering of extremely large datasets.

    Authors
    Sharma, Ashok
    Podolsky, Robert H.
    Zhao, Jieping
    McIndoe, Richard A
    Issue Date
    2009-04-24
    URI
    http://hdl.handle.net/10675.2/108
    
    Abstract
    MOTIVATION: As the number of publicly available microarray experiments increases, the ability to analyze extremely large datasets across multiple experiments becomes critical. Algorithms are needed that are fast and can cluster extremely large datasets without affecting cluster quality. Clustering is an unsupervised exploratory technique applied to microarray data to find similar data structures or expression patterns. Because of the high input/output costs involved and the large distance matrices calculated, most agglomerative clustering algorithms fail on large datasets (30,000+ genes/200+ arrays). In this article, we propose a new two-stage algorithm which partitions the high-dimensional space associated with microarray data using hyperplanes. The first stage is based on the Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) algorithm, and the second stage is a conventional k-means clustering technique. This algorithm has been implemented in a software tool (HPCluster) designed to cluster gene expression data. We compared the clustering results of the two-stage hyperplane algorithm with those of the conventional k-means algorithm from other available programs. Because the first stage traverses the data in a single scan, performance and speed increase substantially. The data reduction accomplished in the first stage of the algorithm reduces the memory requirements, allowing us to cluster 44,460 genes without failure, and significantly decreases the time to complete when compared with popular k-means programs. The software was written in C# (.NET 1.1). AVAILABILITY: The program is freely available and can be downloaded from http://www.amdcc.org/bioinformatics/bioinformatics.aspx. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
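The two-stage idea in the abstract (a single-scan data reduction followed by k-means on the reduced data) can be illustrated with a minimal Python sketch. This is not the HPCluster implementation: the flat list of (count, linear sum) summaries is a simplified stand-in for a BIRCH clustering-feature tree, no hyperplane partitioning is attempted, and the function names `reduce_single_scan` and `weighted_kmeans`, the `threshold` parameter, and the random seeding are all assumptions made for illustration.

```python
import math
import random


def reduce_single_scan(points, threshold):
    """Stage 1 (simplified): one pass over the data. Each point joins the
    nearest existing summary if its centroid is within `threshold`,
    otherwise it starts a new summary. Summaries store (count, linear sum),
    in the spirit of BIRCH clustering features; a flat list replaces the tree."""
    entries = []  # each entry is [count, linear_sum_vector]
    for p in points:
        best, best_dist = None, threshold
        for e in entries:
            centroid = [s / e[0] for s in e[1]]
            d = math.dist(p, centroid)
            if d <= best_dist:
                best, best_dist = e, d
        if best is None:
            entries.append([1, list(p)])
        else:
            best[0] += 1
            best[1] = [s + x for s, x in zip(best[1], p)]
    # Return (centroid, count) pairs: the reduced dataset.
    return [([s / n for s in ls], n) for n, ls in entries]


def assign(summaries, centers):
    """Label each summary with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(c, centers[j]))
        for c, _ in summaries
    ]


def weighted_kmeans(summaries, k, iterations=50, seed=0):
    """Stage 2: conventional k-means run on the summary centroids,
    weighting each centroid by the number of points it represents.
    Requires k <= len(summaries)."""
    rng = random.Random(seed)
    centers = [list(c) for c, _ in rng.sample(summaries, k)]
    dim = len(centers[0])
    for _ in range(iterations):
        labels = assign(summaries, centers)
        new_centers = []
        for j in range(k):
            members = [(c, w) for (c, w), lab in zip(summaries, labels) if lab == j]
            if not members:
                new_centers.append(centers[j])  # keep the center of an empty cluster
                continue
            total = sum(w for _, w in members)
            new_centers.append(
                [sum(c[i] * w for c, w in members) / total for i in range(dim)]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(summaries, centers)
```

In this sketch, k-means only ever sees the stage-one summaries, so its cost depends on the number of summaries rather than the number of original points; that is the kind of memory and time saving the abstract attributes to the data-reduction stage.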
    Citation
    Bioinformatics. 2009 May 1; 25(9):1152-1157
    DOI
    10.1093/bioinformatics/btp123
    Collections
    Department of Pathology: Faculty Research and Presentations


    Related articles

    • FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data.
    • Authors: Fu L, Medico E
    • Issue date: 2007 Jan 4
    • Discovering biclusters in gene expression data based on high-dimensional linear geometries.
    • Authors: Gan X, Liew AW, Yan H
    • Issue date: 2008 Apr 23
    • Detecting clusters of different geometrical shapes in microarray gene expression data.
    • Authors: Kim DW, Lee KH, Lee D
    • Issue date: 2005 May 1
    • Divisive Correlation Clustering Algorithm (DCCA) for grouping of genes: detecting varying patterns in expression profiles.
    • Authors: Bhattacharya A, De RK
    • Issue date: 2008 Jun 1
    • Towards clustering of incomplete microarray data without the use of imputation.
    • Authors: Kim DW, Lee KY, Lee KH, Lee D
    • Issue date: 2007 Jan 1