Research Article | OPEN ACCESS
Distance Based Hybrid Approach for Cluster Analysis Using Variants of K-means and Evolutionary Algorithm
O.A. Mohamed Jafar and R. Sivakumar
Department of Computer Science, A.V.V.M. Sri Pushpam College (Autonomous), Poondi, Thanjavur, Tamil Nadu, India
Research Journal of Applied Sciences, Engineering and Technology 2014 11:1355-1362
Received: June 14, 2014 | Accepted: July 09, 2014 | Published: September 20, 2014
Abstract
Clustering is the process of grouping similar objects into a specified number of clusters. K-means and K-medoids are the most popular partitional clustering techniques for large data sets. However, they are sensitive to the random selection of initial centroids and tend to converge to locally optimal solutions. The K-means++ algorithm has a better convergence rate than the other algorithms. A distance metric is used to measure the dissimilarity between objects, and the Euclidean distance is the metric most commonly used in clustering algorithms. In recent years, evolutionary algorithms have been applied as global optimization techniques for solving clustering problems. In this study, we present a hybrid clustering algorithm that combines K-means++ with particle swarm optimization (K++_PSO) and is based on different distance metrics, such as City Block and Chebyshev. The algorithms are tested on four popular benchmark data sets from the UCI machine learning repository and on one artificial data set. The clustering results are evaluated through fitness function values. A comparative study of the proposed algorithm against other algorithms shows that K++_PSO with the Chebyshev distance metric produces better clustering results than the other approaches.
Keywords:
Cluster analysis, distance metrics, evolutionary algorithms, K-means, K-means++, K-medoids, particle swarm optimization
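The abstract names three distance metrics and K-means++ seeding without giving formulas, so the following is only a minimal sketch of how those pieces might fit together, not the authors' implementation. The function names, the squared-distance seeding weights applied to non-Euclidean metrics, and the fitness definition (sum of each point's distance to its nearest centroid) are all assumptions made for illustration. The PSO refinement stage is not shown.

```python
import random

def euclidean(a, b):
    # Square root of the sum of squared coordinate differences.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def city_block(a, b):
    # City Block (Manhattan) distance: sum of absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    # Chebyshev distance: largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))

def kmeanspp_seeds(points, k, dist, rng):
    # K-means++-style seeding with a pluggable distance (assumption:
    # the squared-distance weighting is reused for every metric).
    # Requires at least k distinct points, otherwise all weights
    # become zero and rng.choices raises ValueError.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        weights = [min(dist(p, c) for c in centroids) ** 2 for p in points]
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids

def fitness(points, centroids, dist):
    # Hypothetical fitness: total distance from each point to its
    # nearest centroid; lower values mean tighter clusters.
    return sum(min(dist(p, c) for c in centroids) for p in points)

if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
    for name, dist in [("euclidean", euclidean),
                       ("city_block", city_block),
                       ("chebyshev", chebyshev)]:
        seeds = kmeanspp_seeds(points, 2, dist, random.Random(0))
        print(name, seeds, fitness(points, seeds, dist))
```

In a hybrid scheme of the kind the abstract describes, seeds like these would presumably initialize the swarm, and the fitness value would be the quantity the particles minimize. Passing the metric as a function keeps the comparison between metrics to a one-argument change.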
Competing interests
The authors have no competing interests.
Open Access Policy
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
ISSN (Online): 2040-7467
ISSN (Print): 2040-7459