Determination of similarity threshold in clustering problems for large data sets (Article)

abstract

  • A new automatic method, based on an intra-cluster criterion, is proposed for obtaining a similarity threshold that generates a well-defined clustering (or one close to it) for large data sets. The method uses the connected component criterion, and it neither calculates nor stores the similarity matrix of the objects in main memory. It is focused on the unsupervised Logical Combinatorial Pattern Recognition approach. In addition, experiments with the new method on large data sets are presented. © Springer-Verlag Berlin Heidelberg 2003.
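
    To illustrate the connected component criterion mentioned in the abstract, here is a minimal sketch that groups objects whose similarity reaches a given threshold, without ever holding the full similarity matrix in memory. It is not the authors' algorithm: the function name `connected_components`, the toy similarity function, and the threshold value 0.8 are assumptions made for illustration, and the threshold is taken as given rather than determined automatically. The sketch still evaluates every pair once and discards the value, which the published method may avoid.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def connected_components(
    objects: Sequence[T],
    similarity: Callable[[T, T], float],
    threshold: float,
) -> List[int]:
    """Label each object with the root index of its connected component.

    Two objects are linked when similarity(a, b) >= threshold. Similarities
    are computed on demand and discarded, so memory use is O(n), not O(n^2).
    """
    parent = list(range(len(objects)))  # union-find forest, one node per object

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if similarity(objects[i], objects[j]) >= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two components

    return [find(i) for i in range(len(objects))]


# Hypothetical toy data: five points on a line, similarity decays with distance.
points = [0.0, 0.1, 0.2, 5.0, 5.1]
labels = connected_components(points, lambda a, b: 1.0 / (1.0 + abs(a - b)), 0.8)
print(labels)
```

    In this toy run the first three points fall into one component and the last two into another. Raising or lowering the threshold changes how many components appear, which is why the choice of threshold matters for the resulting clustering.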

publication date

  • 2003-01-01