
     Research Journal of Applied Sciences, Engineering and Technology


Hybrid Algorithm for Clustering Gene Expression Data

S. Jacophine Susmi¹, H. Khanna Nehemiah¹, A. Kannan² and G. Saranya¹
¹Ramanujan Computing Centre, Anna University, Chennai 600025, India
²Department of Information Science and Technology, Anna University, Chennai 600025, India
Research Journal of Applied Sciences, Engineering and Technology  2015  7:692-700
http://dx.doi.org/10.19026/rjaset.11.2032  |  © The Author(s) 2015
Received: February 26, 2015  |  Accepted: March 14, 2015  |  Published: November 05, 2015

Abstract

Microarray gene expression data provide insight into genomic biomarkers that help distinguish cancerous cells from normal cells. Clustering is an unsupervised learning technique that partitions gene data into groups based on the similarity between their expression profiles, which identifies functionally related genes. In this study, a hybrid framework is designed that combines an adaptive pillar clustering algorithm with a genetic algorithm. In the first step, the adaptive pillar clustering algorithm finds the cluster centroids; the centroids and their cluster elements are calculated using the average of the pairwise inner distances. The microarray gene expression data set is given as input to the adaptive pillar clustering algorithm, which partitions the gene data into a given number of clusters so that intra-cluster similarity is maximized and inter-cluster similarity is minimized. The output of the adaptive pillar clustering algorithm is a number of clusters, which are given as input to the genetic algorithm. Then, for each combination of clustered gene expression, the optimum cluster is found using the genetic algorithm. The genetic algorithm initializes its population with the set of clusters obtained from the adaptive pillar clustering algorithm. The chromosomes with maximum fitness are selected from the selection pool to undergo genetic operations such as crossover and mutation. The genetic algorithm searches for optimum clusters based on its fitness function, which minimizes the intra-cluster distance and maximizes the fitness value by tuning a parameter that includes the weightage for diseased genes. The performance of the adaptive pillar algorithm was compared with existing techniques such as the k-means and pillar k-means algorithms. The clusters obtained from the adaptive pillar clustering algorithm achieve a maximum cluster gain of 894.84, 812.4 and 756 for leukemia, lung and thyroid gene expression data, respectively. Further, the optimal cluster obtained by the hybrid framework achieves a cluster accuracy of 81.3, 80.2 and 78.2 for leukemia, lung and thyroid gene expression data, respectively.
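
The abstract does not give the formulas for the average pairwise inner distance or for the fitness function, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes Euclidean distance between expression profiles, and it assumes a hypothetical fitness of the form 1 / (1 + d) + w * p, where d is the mean pairwise distance inside a candidate cluster, p is the fraction of diseased genes in that cluster, and w is the weightage parameter. The function names `mean_pairwise_distance` and `cluster_fitness` are invented for this example.

```python
import numpy as np


def mean_pairwise_distance(profiles):
    """Average Euclidean distance over all unordered pairs of rows.

    `profiles` is an (n_genes, n_samples) array of expression values.
    Returns 0.0 for clusters with fewer than two genes.
    """
    profiles = np.asarray(profiles, dtype=float)
    n = len(profiles)
    if n < 2:
        return 0.0
    # Broadcast to an (n, n, n_samples) array of row differences.
    diffs = profiles[:, None, :] - profiles[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Keep each unordered pair once (strict upper triangle).
    upper = np.triu_indices(n, k=1)
    return float(dists[upper].mean())


def cluster_fitness(profiles, diseased, weight=0.5):
    """Hypothetical fitness for one candidate cluster (an assumption).

    Tighter clusters (smaller mean pairwise distance) and clusters with
    a larger share of diseased genes score higher. `diseased` is a
    boolean sequence with one entry per gene in the cluster.
    """
    tightness = 1.0 / (1.0 + mean_pairwise_distance(profiles))
    diseased_share = float(np.mean(np.asarray(diseased, dtype=float)))
    return tightness + weight * diseased_share


if __name__ == "__main__":
    tight = [[0.0, 0.0], [0.0, 0.0]]
    loose = [[0.0, 0.0], [3.0, 4.0]]
    print(cluster_fitness(tight, [True, True]))
    print(cluster_fitness(loose, [False, False]))
```

In a genetic algorithm of the kind described above, a function like `cluster_fitness` would be evaluated for every chromosome in the population, and the highest-scoring chromosomes would be kept for crossover and mutation.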

Keywords:

Adaptive pillar algorithm, average pairwise inner distance, clustering, genetic algorithm, microarray gene expression data


Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved