An Effective Pruning based Outlier Detection Method to Quantify the Outliers

Kamal Malik; Harsh Sadawarti; G.S. Kalra

doi:10.19026/rjaset.9.1402

Research Journal of Applied Sciences, Engineering and Technology

Research Article | OPEN ACCESS

¹Kamal Malik, ²Harsh Sadawarti and ³G.S. Kalra

¹MMICT and BM, MMU, Mullana, Haryana
²RIMTIET (Affiliated to Punjab Technical University)
³Lovely Professional University, Punjab, India

Research Journal of Applied Sciences, Engineering and Technology 2015 4:257-261

http://dx.doi.org/10.19026/rjaset.9.1402 | © The Author(s) 2015

Received: July 18, 2014 | Accepted: October 17, 2014 | Published: February 05, 2015

Abstract

Outliers are data objects that do not conform to normal behaviour and deviate from the remaining data objects, possibly because of some outlying property that distinguishes them from the rest of the dataset. Usually, the detection of outliers is followed by clustering of the dataset, which sometimes obscures the prominence of the outliers. In this study, we detect outliers after first pruning the clustered elements, so that the outliers are prominently highlighted. We propose an algorithm that effectively prunes similar data objects from large datasets; its experimental results, which compare neighbouring points, show better performance than existing methods.

Keywords:

Clusters, distance-based, pruning
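
The abstract does not spell out the algorithm, so the following is only a rough sketch of a generic cluster-then-prune scheme of the kind it describes, not the authors' method. It assumes a plain k-means clustering step, a pruning rule that discards every point lying within its cluster's mean distance to the centroid, and a distance-based score (distance to the k-th nearest neighbour) for the surviving candidates. All function names, the deterministic initialisation, and the pruning threshold are hypothetical choices made for illustration.

```python
import math

def assign(points, centroids):
    """Group each point with its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        j = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        clusters[j].append(p)
    return clusters

def kmeans(points, n_clusters, iters=20):
    """Plain k-means with a deterministic, evenly spaced initialisation (assumed)."""
    centroids = [points[i * len(points) // n_clusters] for i in range(n_clusters)]
    for _ in range(iters):
        clusters = assign(points, centroids)
        centroids = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, assign(points, centroids)

def prune(centroids, clusters):
    """Keep only points farther from their centroid than the cluster's mean
    distance (assumed threshold); points near the centre are pruned."""
    candidates = []
    for centre, members in zip(centroids, clusters):
        if not members:
            continue
        dists = [math.dist(p, centre) for p in members]
        radius = sum(dists) / len(dists)
        candidates += [p for p, d in zip(members, dists) if d > radius]
    return candidates

def knn_score(p, points, k):
    """Distance from p to its k-th nearest neighbour in the full dataset.
    Index 0 of the sorted list is p's zero distance to itself."""
    return sorted(math.dist(p, q) for q in points)[k]

def top_outliers(points, n_clusters, k, n):
    """Cluster, prune, then rank the remaining candidates by k-NN distance."""
    centroids, clusters = kmeans(points, n_clusters)
    candidates = prune(centroids, clusters)
    ranked = sorted(candidates, key=lambda p: knn_score(p, points, k), reverse=True)
    return ranked[:n]

# Two tight groups plus one isolated point (made-up data).
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (5, 30)]
print(top_outliers(data, n_clusters=2, k=2, n=1))
```

Scoring only the candidates that survive pruning is what saves work on large datasets: the k-NN distance is the expensive step, and points close to a cluster centre are unlikely to rank among the top outliers. Note that under this threshold a point that forms a cluster on its own is pruned, so a practical variant would need to treat very small clusters separately.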

Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online): 2040-7467
ISSN (Print): 2040-7459
