Cooperative Clustering Missing Data Imputation

Document Type

Conference Proceeding

Publication Date

10-11-2020

Publication Title

Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics

Volume

2020-October

First Page

1039

Keywords

Cooperative Clustering, Imputation, Missing Data, V2X Communication Data

Last Page

1045

Abstract

Missing data imputation is a critical part of data cleaning and is vital for learning from incomplete data. This paper proposes a novel cooperative clustering imputation (CCI) method to estimate missing values. The proposed method aims to find a better clustering model and a better donor for imputation than individual clustering algorithms provide. It uses the agreements among different clustering algorithms to generate a set of sub-clusters and then merges these sub-clusters based on a matrix of sub-cluster performance measures. The proposed method is evaluated on ten public datasets from the UCI data repository and on V2X communication data with induced missing samples, and is compared with three standard clustering-based imputation methods: k-means imputation, fuzzy c-means imputation, and partitioning around medoids imputation. Missing values are induced in each dataset under different missingness mechanisms, missing rates, and missing distributions, yielding a variety of incomplete datasets. The performance of these methods is measured using normalized root mean square error (NRMSE). The experimental results indicate that the proposed missing value imputation method is effective.
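
The evaluation protocol described in the abstract (induce missing values, impute, score with NRMSE) can be illustrated with a minimal sketch. It is not the CCI method: it implements only a plain k-means imputation baseline of the kind the paper compares against, on synthetic data. The function names (`induce_mcar`, `kmeans_impute`, `nrmse`) are hypothetical, only a missing-completely-at-random mechanism is shown, and the NRMSE normalization (RMSE over the induced gaps divided by the standard deviation of the true values at those gaps) is an assumption, since the abstract does not state the exact formula.

```python
import numpy as np


def induce_mcar(X, rate, rng):
    """Return a copy of X with about `rate` of its entries set to NaN (MCAR)."""
    X_miss = X.astype(float).copy()
    mask = rng.random(X.shape) < rate  # True where a value is removed
    X_miss[mask] = np.nan
    return X_miss, mask


def kmeans_impute(X_miss, k, n_iter=20, rng=None):
    """Baseline k-means imputation: fill each gap from its cluster centroid."""
    if rng is None:
        rng = np.random.default_rng(0)
    missing = np.isnan(X_miss)
    # Start from column-mean imputation so distances can be computed.
    X = np.where(missing, np.nanmean(X_miss, axis=0), X_miss)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every row to its nearest centroid (squared Euclidean distance).
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centroids, leaving empty clusters where they were.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
        # Refill only the missing entries; observed values are never changed.
        X = np.where(missing, centroids[labels], X_miss)
    return X


def nrmse(X_true, X_imp, mask):
    """RMSE over the induced gaps, normalized by the std of the true values there
    (an assumed normalization; other definitions divide by the range or mean)."""
    err = X_true[mask] - X_imp[mask]
    return np.sqrt(np.mean(err ** 2) / np.var(X_true[mask]))


# Synthetic data: two well-separated Gaussian clusters in three dimensions.
rng = np.random.default_rng(42)
X_full = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 3)),
    rng.normal(10.0, 1.0, size=(100, 3)),
])

X_miss, mask = induce_mcar(X_full, rate=0.1, rng=rng)
X_kmeans = kmeans_impute(X_miss, k=2, rng=rng)
X_mean = np.where(mask, np.nanmean(X_miss, axis=0), X_miss)  # naive reference

score_kmeans = nrmse(X_full, X_kmeans, mask)
score_mean = nrmse(X_full, X_mean, mask)
print(f"NRMSE k-means imputation: {score_kmeans:.3f}")
print(f"NRMSE mean imputation:    {score_mean:.3f}")
```

On clustered data like this, the cluster-based fill should score a lower NRMSE than column-mean imputation, which is the kind of comparison the abstract reports across methods, datasets, and missing rates.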

DOI

10.1109/SMC42975.2020.9283484

ISSN

1062-922X

ISBN

9781728185262
