Lookup NU author(s): Dr Jichun Li
This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
© 2024 Elsevier Inc.
Outlier detection aims to identify data anomalies exhibiting significant deviations from normal patterns. However, existing outlier detection methods based on k-nearest neighbors often struggle with challenges such as increasing outlier counts and cluster formation issues. Additionally, selecting appropriate nearest-neighbor parameters presents a significant challenge, as researchers commonly evaluate detection accuracy across various k values. To enhance the accuracy and robustness of outlier detection, in this paper we propose an outlier detection method based on the improved DPC algorithm and centrifugal factor. Initially, we leverage k-nearest neighbors, k-reciprocal nearest neighbors, and Gaussian kernel function to determine the local density of samples, particularly addressing scenarios where the DPC algorithm struggles to identify cluster centers in sparse clusters. Subsequently, to reduce the DPC algorithm's computational complexity, we screen the samples based on mutual nearest neighbor counts and select cluster centers accordingly. Non-central points are then distributed using k-nearest neighbors, k-reciprocal nearest neighbors, and reverse k-nearest neighbors. The centrifugal factor, whose magnitude reflects the outlier degree of samples, is then computed by calculating the ratio of the local kernel density at the cluster center to that of samples. Finally, we propose a method for choosing the nearest neighbor parameter, k. To comprehensively evaluate the outlier detection performance of the proposed algorithm, we conduct experiments on 12 complex synthetic datasets and 25 public real-world datasets, comparing the results with 12 state-of-the-art outlier detection methods.
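The centrifugal-factor idea in the abstract (the ratio of the local kernel density at a cluster center to that of a sample) can be illustrated with a small sketch. This is not the paper's algorithm. It assumes a single cluster whose center is simply the densest point, uses only k-nearest neighbors with a Gaussian kernel, and omits the k-reciprocal neighbors, mutual-neighbor screening, and point-assignment steps. The function names, the bandwidth `sigma`, and the toy data are all hypothetical.

```python
import numpy as np

def knn_distances(X, k):
    """Distances from each row of X to its k nearest neighbors (self excluded)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)          # a point is not its own neighbor
    return np.sort(dist, axis=1)[:, :k]

def local_kernel_density(X, k, sigma=1.0):
    """Assumed density: mean Gaussian kernel value over the k nearest neighbors."""
    d = knn_distances(X, k)
    return np.exp(-(d ** 2) / (2.0 * sigma ** 2)).mean(axis=1)

def centrifugal_factor_sketch(X, k, sigma=1.0):
    """Ratio of the center's density to each sample's density.

    Simplification (an assumption, not from the source): one cluster,
    with the densest sample taken as its center.
    """
    rho = local_kernel_density(X, k, sigma)
    center = int(np.argmax(rho))
    return rho[center] / rho, center

# Toy data: one tight cluster of 50 points plus one far-away point (index 50).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)), [[5.0, 5.0]]])

scores, center = centrifugal_factor_sketch(X, k=5)
top = int(np.argmax(scores))                # sample with the largest factor
```

Under these assumptions the score is 1 at the chosen center and grows as a sample's density falls, so the isolated point receives the largest value.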
Author(s): Xia H, Zhou Y, Li J, Yue X, Li J
Publication type: Article
Publication status: Published
Journal: Information Sciences
Year: 2024
Volume: 682
Print publication date: 01/11/2024
Online publication date: 27/07/2024
Acceptance date: 24/07/2024
Date deposited: 06/10/2024
ISSN (print): 0020-0255
ISSN (electronic): 1872-6291
Publisher: Elsevier Inc.
URL: https://doi.org/10.1016/j.ins.2024.121255
DOI: 10.1016/j.ins.2024.121255
ePrints DOI: 10.57711/9y0m-gq05
Data Access Statement: Data will be made available on request.