
     Research Journal of Applied Sciences, Engineering and Technology


An Effective Pruning based Outlier Detection Method to Quantify the Outliers

1Kamal Malik, 2Harsh Sadawarti and 3G.S. Kalra
1MMICT and BM, MMU, Mullana, Haryana
2RIMTIET (Affiliated to Punjab Technical University)
3Lovely Professional University, Punjab, India
Research Journal of Applied Sciences, Engineering and Technology  2015  4:257-261
http://dx.doi.org/10.19026/rjaset.9.1402  |  © The Author(s) 2015
Received: July 18, 2014  |  Accepted: October 17, 2014  |  Published: February 05, 2015

Abstract

Outliers are data objects that do not conform to normal behaviour and deviate from the remaining data objects, possibly because of some outlying property that distinguishes them from the rest of the dataset. Usually, the detection of outliers is followed by clustering of the dataset, which sometimes ignores the prominence of the outliers. In this study, we detect the outliers by first pruning the clustered elements so that the outliers are prominently highlighted. We propose an algorithm that effectively prunes similar data objects from large datasets; its experimental results, which compare neighbouring points, show better performance than the existing methods.
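
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a pruning-based, distance-based detector could be arranged. It assumes plain k-means for the clustering step, the mean member-to-centroid distance as the pruning radius, and the mean distance to the k nearest neighbours as the outlier score. These choices, and all function and parameter names (`kmeans`, `prune_candidates`, `knn_score`, `top_outliers`), are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns (centroids, clusters), clusters being lists of points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)

    def assign(cents):
        clusters = [[] for _ in cents]
        for p in points:
            nearest = min(range(len(cents)), key=lambda j: math.dist(p, cents[j]))
            clusters[nearest].append(p)
        return clusters

    for _ in range(iters):
        clusters = assign(centroids)
        # Recompute each centroid; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids, assign(centroids)


def prune_candidates(points, n_clusters, seed=0):
    """Drop points that lie closer to their centroid than the cluster's mean distance."""
    centroids, clusters = kmeans(points, n_clusters, seed=seed)
    candidates = []
    for centroid, members in zip(centroids, clusters):
        if not members:
            continue
        dists = [math.dist(p, centroid) for p in members]
        radius = sum(dists) / len(dists)  # assumed pruning radius, not from the paper
        candidates += [p for p, d in zip(members, dists) if d >= radius]
    return candidates


def knn_score(p, points, k):
    """Mean distance from p to its k nearest neighbours in the full dataset."""
    dists = sorted(math.dist(p, q) for q in points if q is not p)
    return sum(dists[:k]) / k


def top_outliers(points, n_clusters=2, k=3, n=1, seed=0):
    """Score only the unpruned candidates; return the n highest (score, point) pairs."""
    candidates = prune_candidates(points, n_clusters, seed)
    scored = [(knn_score(p, points, k), p) for p in candidates]
    return sorted(scored, reverse=True)[:n]


# Toy data (hypothetical): two tight 3x3 grids plus one isolated point.
offsets = (-0.5, 0.0, 0.5)
data = [(cx + dx, cy + dy)
        for cx, cy in ((0.0, 0.0), (10.0, 10.0))
        for dx in offsets for dy in offsets]
data.append((5.0, 30.0))

ranked = top_outliers(data)
print(ranked[0][1])
```

In this sketch the pruning step only reduces how many points need a neighbour search; each surviving candidate is still scored against the full dataset, so pruned points continue to count as neighbours.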

Keywords:

Clusters, distance-based, pruning


Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved