Standardization and Its Effects on K-Means Clustering Algorithm

Ismail Bin Mohamad; Dauda Usman

doi:10.19026/rjaset.6.3638

Research Journal of Applied Sciences, Engineering and Technology

Research Article | OPEN ACCESS

Department of Mathematical Sciences, Faculty of Science, Universiti Teknologi Malaysia, 81310, UTM Johor Bahru, Johor Darul Ta

Research Journal of Applied Sciences, Engineering and Technology 2013 17:3299-3303

http://dx.doi.org/10.19026/rjaset.6.3638 | © The Author(s) 2013

Received: January 23, 2013 | Accepted: February 25, 2013 | Published: September 20, 2013

Abstract

Data clustering is an important data exploration technique with many applications in data mining. K-means is one of the best-known data mining methods; it partitions a dataset into groups of patterns, and many methods have been proposed to improve its performance. Standardization is the central preprocessing step in data mining: it rescales the values of features or attributes from different dynamic ranges into a specific range. In this paper, we analyze the performance of three standardization methods on the conventional K-means algorithm. A comparison of the results on infectious disease datasets shows that the z-score standardization method is more effective and efficient than the min-max and decimal scaling standardization methods.

Keywords:

Clustering, decimal scaling, k-means, min-max, standardization, z-score
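
For readers unfamiliar with the three standardization methods named in the abstract, the following is a minimal Python sketch based on their usual textbook definitions. It is not taken from the article: the function names, the use of the population standard deviation in the z-score, and the default [0, 1] target range for min-max are all assumptions made for illustration.

```python
import statistics


def z_score(values):
    """Subtract the mean and divide by the standard deviation."""
    mean = statistics.fmean(values)
    # Assumption: population standard deviation; the sample version is also common.
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest non-negative integer
    that brings every absolute value below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]
```

Each function takes one feature (a list of numbers) and returns the rescaled list, so a dataset would be standardized column by column before being passed to K-means. Because K-means relies on distances between points, a feature with a much larger range than the others would otherwise dominate the cluster assignment.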

Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online): 2040-7467
ISSN (Print): 2040-7459
