2017 (Vol. 14, Issue: 8)
Research Article

Fuzzy Discretization based Classification of Medical Data

M. Shanmugapriya¹, H. Khanna Nehemiah¹, R.S. Bhuvaneswaran¹, Kannan Arputharaj² and J. Dhalia Sweetlin¹
¹Ramanujan Computing Centre
²Department of Information Science and Technology, Anna University, Chennai-600025, India

DOI: 10.19026/rjaset.14.4953
Submitted: December 22, 2016
Accepted: April 11, 2017
Published: August 15, 2017

  How to Cite this Article:

M. Shanmugapriya, H. Khanna Nehemiah, R.S. Bhuvaneswaran, Kannan Arputharaj and J. Dhalia Sweetlin, 2017. Fuzzy Discretization based Classification of Medical Data. Research Journal of Applied Sciences, Engineering and Technology, 14(8): 291-298.

DOI: 10.19026/rjaset.14.4953

URL: http://www.maxwellsci.com/jp/mspabstract.php?jid=RJASET&doi=rjaset.14.4953


Discretization is a commonly used data preprocessing technique for improving the efficiency of knowledge extraction from clinical data. Clinical data generally contain numeric attributes with continuous values, and discretization simplifies such data by transforming the continuous values of an attribute into a finite set of intervals. Although discretization can handle continuous attributes in clinical data, it is not always an appropriate technique: attribute values may be vague or imprecise, or may have multiple distributions across different classes, which makes mining clinical data difficult. Hence, fuzzy discretization is needed to pre-process clinical data before mining. The aim of this study is to derive a fuzzy discretization from a crisp-interval discretization using a geometric approach to constructing fuzzy sets, in which the overlapping region between fuzzy sets is represented as a geometric area. The study comprises three steps. First, non-overlapping fuzzy sets are constructed from the intervals generated by crisp-interval discretization. Second, the area of overlap between the fuzzy sets is computed using the geometric approach, and an average area of overlap is estimated. Third, the fuzzy sets are redesigned based on the estimated average area of overlap. Fuzzy discretizations with three, five and seven intervals were examined using the Pima Indian Diabetes dataset (PID) and the Bupa Liver Disorder dataset (BLD), both taken from the University of California Irvine machine learning repository. The variation in performance between crisp and fuzzy discretization methods is measured using six classification approaches (tree-based, probabilistic induction-based, rule-based, network learning, kernel-based and distance-based) and a rule-based fuzzy inference system. The results show that the classification accuracy remains stable, with less deviation across different classifiers and varying numbers of intervals.
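
The three steps above can be illustrated with a minimal sketch. The abstract does not state which crisp discretizer or which fuzzy-set shape the authors use, so the sketch assumes equal-width intervals and trapezoidal fuzzy sets; every function name and the spread parameter d are hypothetical. It shows only how a target overlap area maps to the width of the region shared by two neighbouring sets, and it does not reproduce the authors' procedure for estimating the average overlap area.

```python
def equal_width_cuts(lo, hi, k):
    """Boundaries of k equal-width crisp intervals on [lo, hi] (assumed discretizer)."""
    step = (hi - lo) / k
    return [lo + i * step for i in range(k + 1)]


def overlapping_trapezoids(cuts, d):
    """One trapezoid (a, b, c, e) per crisp interval.

    Neighbouring sets share the region [cut - d, cut + d] around each interior
    cut point. Assumes 0 < d < half the interval width (hypothetical parameter).
    """
    k = len(cuts) - 1
    sets = []
    for i in range(k):
        left, right = cuts[i], cuts[i + 1]
        a = left - d if i > 0 else left          # left foot
        b = left + d if i > 0 else left          # left shoulder
        c = right - d if i < k - 1 else right    # right shoulder
        e = right + d if i < k - 1 else right    # right foot
        sets.append((a, b, c, e))
    return sets


def membership(x, trap):
    """Degree of membership of x in the trapezoid (a, b, c, e)."""
    a, b, c, e = trap
    if x < a or x > e:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (e - x) / (e - c)       # falling edge


def overlap_area(d):
    """Area under min(mu_i, mu_{i+1}): a triangle of base 2d and height 0.5."""
    return d / 2.0


def spread_for_area(area):
    """Inverse of overlap_area: the spread d that yields the given overlap area."""
    return 2.0 * area


cuts = equal_width_cuts(0.0, 30.0, 3)                       # crisp intervals
sets = overlapping_trapezoids(cuts, spread_for_area(1.0))   # redesigned fuzzy sets
degrees = [membership(10.0, s) for s in sets]
print(degrees)  # [0.5, 0.5, 0.0]
```

Under these assumptions a value lying exactly on a crisp cut point belongs equally to both neighbouring sets, and the memberships of any value inside the attribute range sum to 1.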


    Competing interests

The authors have no competing interests.
    Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


© The Author(s) 2017

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2015. MAXWELL Scientific Publication Corp., All rights reserved