
     Research Journal of Applied Sciences, Engineering and Technology


A Refined Rough K-Means Clustering Algorithm based on Minimizing the Effect of Local Outlier Objects to Improve Overlapping Detection

Khaled Ali Othman, Md. Nasir Sulaiman, Norwati Mustapha and Nurfadhlina Mohd Sharef
Department of Computer Science, Faculty of Computer Science and IT, Putra University of Malaysia, Selangor, Malaysia
Research Journal of Applied Sciences, Engineering and Technology  2017  8:281-290
http://dx.doi.org/10.19026/rjaset.14.4952  |  © The Author(s) 2017
Received: December 14, 2016  |  Accepted: May 25, 2017  |  Published: August 15, 2017

Abstract

Rough K-Means (RKM) was proposed as the first rough clustering algorithm, with the aim of improving the quality of overlapping detection. A recent version of RKM, known as π RKM, has been found to be the most powerful and effective: compared with earlier RKM versions, it increases the number of correctly clustered objects and decreases the number of incorrectly clustered objects. However, clustering with RKM remains challenging because it is difficult to establish a standard measure for reducing the effect of local outlier objects on the means function. This study refines the RKM algorithm to address that problem and makes two contributions. First, the Local Outlier Factor (LOF) technique is used to identify which objects are outliers. Second, a weight is used to reduce the effect of local outliers on the means function. Experiments on synthetic and real-life datasets show that the refined algorithm improves the quality of overlapping detection compared with recent versions.
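
The two components described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weighting formula, so the weight `1 / max(1, LOF)` is an assumption chosen only for illustration, and the function names `local_outlier_factors` and `outlier_weighted_mean` are hypothetical. The LOF computation follows the standard density-based definition, simplified to use exactly k neighbours per object (ties at the k-th distance are not expanded).

```python
import math


def k_nearest(points, i, k):
    """Return (distance, index) pairs for the k nearest neighbours of points[i]."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]


def local_outlier_factors(points, k=3):
    """Compute a LOF score per point (simplified: exactly k neighbours, no ties).

    Assumes the data contain no group of more than k identical points,
    otherwise a reachability sum could be zero.
    """
    n = len(points)
    neigh = [k_nearest(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours relative to the point's own density.
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


def outlier_weighted_mean(points, lof):
    """Mean in which each point is down-weighted by its LOF score.

    The weight 1 / max(1, LOF) is an illustrative assumption, not the
    weighting used in the paper.
    """
    weights = [1.0 / max(1.0, score) for score in lof]
    total = sum(weights)
    dim = len(points[0])
    return tuple(
        sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)
    )


# Toy cluster: five objects near (0.5, 0.5) and one local outlier at (5, 5).
cluster = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0)]
lof = local_outlier_factors(cluster, k=3)
plain_mean = tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(2))
weighted_mean = outlier_weighted_mean(cluster, lof)
print("LOF scores:", [round(s, 2) for s in lof])
print("plain mean:", plain_mean)
print("weighted mean:", weighted_mean)
```

In this toy cluster the object at (5, 5) receives by far the largest LOF score, so the weighted mean stays closer to the dense group than the plain mean does. In a rough clustering setting the same weighting idea would be applied when the means of the lower approximation and boundary region are recomputed.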

Keywords:

Clustering, data analysis, local outlier factor, Rough K-Means



Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Copyright

© The Author(s) 2017.

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved