Outlier detection method based on improved DPC algorithm and centrifugal factor

Lookup NU author(s): Dr Jichun Li

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

© 2024 Elsevier Inc. Outlier detection aims to identify data anomalies exhibiting significant deviations from normal patterns. However, existing outlier detection methods based on k-nearest neighbors often struggle with challenges such as increasing outlier counts and cluster formation issues. Additionally, selecting appropriate nearest-neighbor parameters presents a significant challenge, as researchers commonly evaluate detection accuracy across various k values. To enhance the accuracy and robustness of outlier detection, in this paper we propose an outlier detection method based on the improved DPC algorithm and centrifugal factor. Initially, we leverage k-nearest neighbors, k-reciprocal nearest neighbors, and Gaussian kernel function to determine the local density of samples, particularly addressing scenarios where the DPC algorithm struggles to identify cluster centers in sparse clusters. Subsequently, to reduce the DPC algorithm's computational complexity, we screen the samples based on mutual nearest neighbor counts and select cluster centers accordingly. Non-central points are then distributed using k-nearest neighbors, k-reciprocal nearest neighbors, and reverse k-nearest neighbors. The centrifugal factor, whose magnitude reflects the outlier degree of samples, is then computed by calculating the ratio of the local kernel density at the cluster center to that of samples. Finally, we propose a method for choosing the nearest neighbor parameter, k. To comprehensively evaluate the outlier detection performance of the proposed algorithm, we conduct experiments on 12 complex synthetic datasets and 25 public real-world datasets, comparing the results with 12 state-of-the-art outlier detection methods.
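
The density-ratio idea behind the centrifugal factor can be illustrated with a rough sketch. The code below is not the paper's algorithm. It assumes a plain k-nearest-neighbor Gaussian kernel density with a dataset-wide bandwidth, and it treats the single densest point as the only cluster center. It omits the k-reciprocal and reverse nearest neighbors, the mutual-neighbor screening, and the selection of k. The function names knn_kernel_density and density_ratio_score are hypothetical.

```python
import numpy as np

def knn_kernel_density(X, k):
    """Gaussian-kernel local density over each point's k nearest neighbours.

    Illustrative only: the bandwidth (mean k-NN distance over the whole
    dataset) is an assumption, not the paper's definition.
    """
    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Column 0 of the sorted order is the point itself, so skip it.
    order = np.argsort(dist, axis=1)[:, 1:k + 1]
    knn_dist = np.take_along_axis(dist, order, axis=1)
    sigma = knn_dist.mean()
    return np.exp(-(knn_dist / sigma) ** 2).sum(axis=1)

def density_ratio_score(X, k):
    """Ratio of the centre's density to each sample's density.

    Simplification: the densest sample stands in for the cluster centre,
    so the score is 1 at the centre and grows as a sample gets sparser.
    """
    density = knn_kernel_density(X, k)
    return density.max() / density

# A 5 x 5 unit grid (indices 0..24) plus one far point (index 25).
grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
X = np.vstack([grid, [[10.0, 10.0]]])
scores = density_ratio_score(X, k=4)
print(int(np.argmax(scores)))  # index of the sample with the largest score
```

In this toy dataset the far point receives by far the largest score, since its neighbours lie many bandwidths away. With several clusters, each sample would need to be compared against the center of its own cluster, which this sketch does not attempt.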


Publication metadata

Author(s): Xia H, Zhou Y, Li J, Yue X, Li J

Publication type: Article

Publication status: Published

Journal: Information Sciences

Year: 2024

Volume: 682

Print publication date: 01/11/2024

Online publication date: 27/07/2024

Acceptance date: 24/07/2024

Date deposited: 06/10/2024

ISSN (print): 0020-0255

ISSN (electronic): 1872-6291

Publisher: Elsevier Inc.

URL: https://doi.org/10.1016/j.ins.2024.121255

DOI: 10.1016/j.ins.2024.121255

ePrints DOI: 10.57711/9y0m-gq05

Data Access Statement: Data will be made available on request.



Funding

Funder reference / Funder name
2018GGJS079
National Natural Science Foundation of China
OSR/0550/SASC/S022
Newcastle University
Project of Cultivation Programme for Young Backbone Teachers of Higher Education Institutions in Henan Province
U1504622
