Research Article | OPEN ACCESS
A Refined Rough K-Means Clustering Algorithm based on Minimizing the Effect of Local Outlier Objects to Improve Overlapping Detection
Khaled Ali Othman, Md. Nasir Sulaiman, Norwati Mustapha and Nurfadhlina Mohd Sharef
Department of Computer Science, Faculty of Computer Science and IT, Putra University of Malaysia, Selangor, Malaysia
Research Journal of Applied Sciences, Engineering and Technology 2017 8:281-290
Received: December 14, 2016 | Accepted: May 25, 2017 | Published: August 15, 2017
Abstract
Rough K-Means (RKM) was proposed as the first rough clustering algorithm, with the aim of improving the quality of overlapping detection. Among its variants, the recent π RKM has been found to be the most powerful and effective: compared with earlier RKM versions, it increases the number of correctly clustered objects and decreases the number of incorrectly clustered objects. However, clustering with RKM remains challenging because it is difficult to establish a standard measure for reducing the effect of local outlier objects on the means function. This study refines the RKM algorithm to address that problem and makes two contributions. First, we use the Local Outlier Factor (LOF) technique to identify objects that are outliers. Second, we reduce the effect of local outliers on the means function by applying a weight. Experiments on synthetic and real-life datasets show that the refined algorithm improves the quality of overlapping detection compared with recent RKM versions.
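The two components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weight formula, so the rule used here (weight 1 for objects with LOF at most 1, weight 1/LOF otherwise) is an assumption made purely for illustration, as are the function names `lof_scores`, `lof_weights` and `weighted_mean`, the toy data and the choice k = 2. The LOF computation follows the standard definition of Breunig et al. (2000) in brute-force form and assumes that all points are distinct.

```python
import math


def euclid(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lof_scores(points, k):
    """Brute-force Local Outlier Factor; assumes all points are distinct."""
    n = len(points)
    dist = [[euclid(p, q) for q in points] for p in points]
    k_dist, neigh = [], []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]  # distance to the k-th nearest neighbour
        k_dist.append(kd)
        # Ties at the k-distance are kept in the neighbourhood.
        neigh.append([j for d, j in others if d <= kd])
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbours relative to the point's own density.
    return [
        sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


def lof_weights(scores):
    """Hypothetical weighting rule (an assumption, not taken from the paper)."""
    return [1.0 / max(s, 1.0) for s in scores]


def weighted_mean(points, weights):
    """Weighted centroid of the points."""
    total = sum(weights)
    dim = len(points[0])
    return [
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dim)
    ]


# Toy cluster: four tightly grouped objects and one local outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(points, k=2)
weights = lof_weights(scores)
plain = weighted_mean(points, [1.0] * len(points))
robust = weighted_mean(points, weights)
print("LOF scores:", scores)
print("plain mean:", plain)
print("LOF-weighted mean:", robust)
```

In this toy example the isolated object receives a LOF well above 1 and therefore a small weight, so the weighted centroid stays closer to the dense group than the unweighted mean does. In a rough clustering setting the same weighting would presumably be applied when the means of the lower approximation and the boundary region are computed, but that step is outside this sketch.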
Keywords:
Clustering, data analysis, local outlier factor, Rough K-Means
Competing interests
The authors have no competing interests.
Open Access Policy
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
ISSN (Online): 2040-7467
ISSN (Print): 2040-7459