    ParaKMeans: Implementation of a parallelized K-means algorithm suitable for general laboratory use.

    Authors
    Kraj, Piotr
    Sharma, Ashok
    Garge, Nikhil
    Podolsky, Robert H.
    McIndoe, Richard A.
    Issue Date
    2008-04-28
    URI
    http://hdl.handle.net/10675.2/15
    
    Abstract
    BACKGROUND: During the last decade, the use of microarrays to assess the transcriptome of many biological systems has generated an enormous amount of data. A common technique used to organize and analyze microarray data is cluster analysis. While many clustering algorithms have been developed, they all suffer a significant decrease in computational performance as the dataset being analyzed becomes very large. For example, clustering 10,000 genes from an experiment containing 200 microarrays can be quite time-consuming and challenging on a desktop PC. One solution to the scalability problem of clustering algorithms is to distribute or parallelize the algorithm across multiple computers.

    RESULTS: The software described in this paper is a high-performance multithreaded application that implements a parallelized version of the K-means clustering algorithm. Most parallel processing applications are not accessible to the general public and require specialized software libraries (e.g. MPI) and specialized hardware configurations. The parallel nature of the application comes from the use of a web service to perform the distance calculations and cluster assignments. Here we show that our parallel implementation provides significant performance gains over a wide range of datasets using as few as seven nodes. The software was written in C# and was designed in a modular fashion to provide flexibility in both deployment and the user interface.

    CONCLUSION: ParaKMeans was designed to provide the general scientific community with an easy and manageable client-server application that can be installed on a wide variety of Windows operating systems.
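The abstract attributes the parallelism to offloading the distance calculations and cluster assignments to separate nodes. The general idea can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' C# implementation: a local thread pool stands in for the web-service nodes, Euclidean distance is assumed, and the names `assign_chunk` and `parallel_kmeans` are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def assign_chunk(chunk, centroids):
    """Worker step (stand-in for one node): assign each point in the chunk
    to its nearest centroid and return labels plus per-cluster partial sums."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    labels = []
    for point in chunk:
        # Assumption: Euclidean distance; the paper may support other metrics.
        best = min(range(k), key=lambda j: math.dist(point, centroids[j]))
        labels.append(best)
        counts[best] += 1
        for d in range(dim):
            sums[best][d] += point[d]
    return labels, sums, counts


def parallel_kmeans(points, init_centroids, n_workers=4, max_iter=100):
    """Coordinator step: split the data, farm out assignments, merge the
    partial sums into new centroids, and repeat until labels stop changing."""
    centroids = [list(c) for c in init_centroids]
    k, dim = len(centroids), len(centroids[0])
    size = max(1, math.ceil(len(points) / n_workers))
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    labels = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(max_iter):
            results = list(pool.map(assign_chunk, chunks,
                                    [centroids] * len(chunks)))
            new_labels = [lab for part, _, _ in results for lab in part]
            new_centroids = []
            for j in range(k):
                count = sum(r[2][j] for r in results)
                if count == 0:
                    # Empty cluster: keep its previous centroid unchanged.
                    new_centroids.append(centroids[j])
                else:
                    new_centroids.append(
                        [sum(r[1][j][d] for r in results) / count
                         for d in range(dim)])
            converged = new_labels == labels
            labels, centroids = new_labels, new_centroids
            if converged:
                break
    return labels, centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(parallel_kmeans(data, [(0, 0), (10, 10)], n_workers=3))
```

Each worker returns only per-cluster sums and counts, so the coordinator can update the centroids without collecting the raw points again. In a distributed setting, that keeps the data exchanged per iteration small relative to the dataset.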
    Citation
    BMC Bioinformatics. 2008 Apr 16; 9:200
    DOI
    10.1186/1471-2105-9-200
    Collections
    Center for Biotechnology and Genomic Medicine: Faculty Research and Presentations


    Related articles

    • TimeClust: a clustering tool for gene expression time series.
    • Authors: Magni P, Ferrazzi F, Sacchi L, Bellazzi R
    • Issue date: 2008 Feb 1
    • ParaSAM: a parallelized version of the significance analysis of microarrays algorithm.
    • Authors: Sharma A, Zhao J, Podolsky R, McIndoe RA
    • Issue date: 2010 Jun 1
    • Divisive Correlation Clustering Algorithm (DCCA) for grouping of genes: detecting varying patterns in expression profiles.
    • Authors: Bhattacharya A, De RK
    • Issue date: 2008 Jun 1
    • An improved algorithm for clustering gene expression data.
    • Authors: Bandyopadhyay S, Mukhopadhyay A, Maulik U
    • Issue date: 2007 Nov 1
    • Maximum significance clustering of oligonucleotide microarrays.
    • Authors: de Ridder D, Staal FJ, van Dongen JJ, Reinders MJ
    • Issue date: 2006 Feb 1