
Clustering method based on data division and partition

Research paper by Zhi-mao Lu, Chen Liu, S. Massinanke, Chun-xiang Zhang, Lei Wang

Indexed on: 01 Mar '14
Published on: 01 Mar '14
Published in: Journal of Central South University



Abstract

Many classical clustering algorithms perform well when their prerequisites are met but do not scale well when applied to very large data sets (VLDS). In this work, a novel division and partition clustering method (DP) was proposed to solve this problem. DP cut the source data set into data blocks and extracted the eigenvector of each data block to form the local feature set. The local feature set was then used in a second round of the characteristics polymerization process over the source data to find the global eigenvector. Finally, each data point was assigned according to the global eigenvector by the minimum-distance criterion. The experimental results show that DP is more robust than conventional clustering methods. Because it is insensitive to the data dimensionality, the data distribution, and the number of natural clusters, DP has a wide range of applications in clustering VLDS.
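The three stages described in the abstract (block-wise feature extraction, a second aggregation round over the local features, and minimum-distance assignment) can be illustrated with a minimal sketch. The abstract does not say how the "eigenvector" of a block is computed, so this sketch assumes that each block is summarized by plain k-means centroids and that the second round is another k-means pass over those centroids. The function names `kmeans` and `dp_cluster`, and the parameters `n_blocks`, `local_k`, and `global_k`, are hypothetical and are not taken from the paper.

```python
import numpy as np


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids.

    Stand-in for the paper's feature extraction step (an assumption).
    """
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids


def dp_cluster(data, n_blocks, local_k, global_k, seed=0):
    """Sketch of a divide-then-aggregate clustering pipeline."""
    # Stage 1: cut the data into blocks and summarize each block
    # by local_k representative vectors (the "local feature set").
    blocks = np.array_split(data, n_blocks)
    local_features = np.vstack([kmeans(b, local_k, seed=seed) for b in blocks])

    # Stage 2: aggregate the local features into global_k global features.
    global_features = kmeans(local_features, global_k, seed=seed)

    # Stage 3: assign every point to its nearest global feature.
    dists = ((data[:, None, :] - global_features[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), global_features


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Two synthetic, well-separated groups, shuffled so each block sees both.
    data = np.vstack([
        rng.normal(0.0, 1.0, size=(500, 2)),
        rng.normal(100.0, 1.0, size=(500, 2)),
    ])
    rng.shuffle(data)
    labels, centers = dp_cluster(data, n_blocks=5, local_k=2, global_k=2)
    print(centers)
    print(np.bincount(labels))
```

In this sketch only one block is processed at a time in the first stage, and the second stage works on `n_blocks * local_k` vectors rather than on the full data set, which is the property that would make such a scheme attractive for VLDS. The choice of k-means for both rounds is illustrative only; the paper's own extraction and polymerization steps may differ.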