Cooperative Clustering Missing Data Imputation
Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
Cooperative Clustering, Imputation, Missing Data, V2X Communication Data
Missing data imputation is a critical part of data cleaning and is vital for learning from incomplete data. This paper proposes a novel cooperative clustering imputation (CCI) method to estimate missing values. The proposed method aims to find a better clustering model and a better donor for imputation than individual clustering algorithms provide. It uses the agreements among different clustering algorithms to generate a set of sub-clusters and then merges these sub-clusters based on a matrix of sub-cluster performance measures. The proposed method is evaluated on ten public datasets from the UCI data repository and on V2X communication data with induced missing samples, and is compared with three standard clustering-based imputation methods: k-means imputation, fuzzy c-means imputation, and partition around medoids imputation. Missing values are induced in each dataset under different missing mechanisms, missing rates, and missing distributions, which yields a variety of incomplete datasets. The performance of these methods is assessed using the normalized root mean square error (NRMSE). The experimental results indicate the effectiveness of the proposed missing-value imputation method.
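Two ideas in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. The function names `agreement_subclusters` and `nrmse` are hypothetical. The sketch assumes that "agreement" means samples receive the same label combination from every clustering algorithm, and that NRMSE is the RMSE normalized by the standard deviation of the true values; the abstract does not state either detail, and the merging step based on performance measures is not shown.

```python
from collections import defaultdict
from math import sqrt


def agreement_subclusters(labelings):
    """Group sample indices that share the same label under every labeling.

    `labelings` is a list of label sequences, one per clustering algorithm,
    all of the same length. Assumption: two samples "agree" when every
    algorithm places them in the same cluster.
    """
    groups = defaultdict(list)
    for idx, labels in enumerate(zip(*labelings)):
        groups[labels].append(idx)
    return list(groups.values())


def nrmse(true_values, imputed_values):
    """RMSE between true and imputed values, divided by the standard
    deviation of the true values (one common normalization, assumed here)."""
    n = len(true_values)
    mean_true = sum(true_values) / n
    variance = sum((t - mean_true) ** 2 for t in true_values) / n
    mse = sum((t - i) ** 2 for t, i in zip(true_values, imputed_values)) / n
    return sqrt(mse / variance)


# Illustrative labels from two hypothetical clustering runs on six samples.
kmeans_labels = [0, 0, 0, 1, 1, 1]
pam_labels = [1, 1, 0, 0, 0, 0]
subclusters = agreement_subclusters([kmeans_labels, pam_labels])
print(subclusters)  # [[0, 1], [2], [3, 4, 5]]

# Imputing every missing entry with the mean of the true values gives NRMSE 1.
print(nrmse([1.0, 3.0], [2.0, 2.0]))  # 1.0
```

Under this normalization, a value of 0 means perfect recovery and a value near 1 means the imputation is no better than filling in the mean of the true values.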
Wan, Daoming; Razavi-Far, Roozbeh; and Saif, Mehrdad. (2020). Cooperative Clustering Missing Data Imputation. Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics, 2020-October, 1039-1045.