Date of Award

Spring 4-18-2011

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Mathematics and Statistics

First Advisor

Dr. Yixin Fang

Second Advisor

Dr. Gengsheng Qin

Third Advisor

Dr. Ruiyan Luo

Abstract

Selecting the number of clusters is one of the greatest challenges in cluster analysis. In this thesis, we propose a variety of stability selection criteria, based on cross-validation, for determining the number of clusters. Clustering stability measures the agreement among clusterings obtained by applying the same clustering algorithm to multiple independent and identically distributed samples. We propose to measure clustering stability by the correlation between two clustering functions. These criteria are motivated by the concept of clustering instability proposed by Wang (2010), which is based on a form of clustering distance. The effectiveness and robustness of the proposed methods are demonstrated numerically on a variety of simulated and real-world samples.
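The abstract does not state the exact form of the correlation between two clustering functions, so the following is only an illustrative sketch under an assumed definition: each clustering is represented by its pairwise co-membership indicator (1 if two points share a cluster, 0 otherwise), and agreement is the Pearson correlation between the two indicator vectors. The function names `comembership`, `pearson`, and `clustering_correlation` are hypothetical and are not taken from the thesis.

```python
from itertools import combinations
from math import sqrt


def comembership(labels):
    """One 0/1 entry per unordered pair of points: 1 if the pair shares a cluster."""
    return [1 if labels[i] == labels[j] else 0
            for i, j in combinations(range(len(labels)), 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant indicator (for example, a single cluster)
        # carries no information, so the correlation is reported as 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def clustering_correlation(labels_a, labels_b):
    """Agreement of two clusterings of the same points (assumed definition)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same points")
    return pearson(comembership(labels_a), comembership(labels_b))


if __name__ == "__main__":
    # Same partition with the cluster names swapped: agreement is unaffected.
    print(clustering_correlation([0, 0, 1, 1], [1, 1, 0, 0]))
    # A partition that pairs the points differently.
    print(clustering_correlation([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the measure is computed on co-membership rather than on raw labels, it does not depend on how clusters are named. In a cross-validation scheme of the kind the abstract describes, one plausible use would be to average such a score over repeated sample splits for each candidate number of clusters and prefer the value with the highest average; that selection rule is likewise an assumption here, not a statement of the thesis's criteria.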

Included in

Mathematics Commons
