Yun Zhu

Date of Award

Spring 5-11-2015

Degree Type


Degree Name

Doctor of Philosophy (PhD)


Computer Science

First Advisor

Yanqing Zhang

Second Advisor

Yi Pan

Third Advisor

Rafal Angryk

Fourth Advisor

Yichuan Zhao


Many recent works have shown that ensemble methods, which aggregate the decisions of all base learners, generalize better than single-classifier approaches in machine learning tasks. To address redundancy and inaccuracy among the base learners, classifier/ensemble selection methods have been proposed that select either a single classifier or an ensemble (a subset of all base learners) to classify a query pattern. This final classifier or ensemble is determined either statically before prediction or dynamically for every query pattern during prediction. Static selection approaches choose the classifier or ensemble by evaluating classifiers in terms of accuracy and diversity, whereas dynamic classifier/ensemble selection (DCS, DES) methods incorporate local information to dedicate a classifier or ensemble to each query pattern. Our work focuses on DES and proposes a new DES framework, DES with Regional Expertise (DES-RE).

The success of a DES system depends on two factors: the quality of the base learners and the optimality of the ensemble selection. DES-RE addresses these two challenges in turn. 1) Local expertise enhancement. A novel data sampling and weighting strategy that combines the advantages of bagging and boosting is employed to increase the local expertise of the base learners and thereby facilitate the later ensemble selection. 2) Competence region optimization. DES-RE aims to learn a distance metric that forms better competence regions (i.e., neighborhoods), which promote strong base learners with respect to a specific query pattern. In addition to performing local expertise enhancement and competence region optimization independently, we propose an expectation–maximization (EM) framework that combines the two procedures. For all the proposed algorithms, extensive simulations are conducted to validate their performance.
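
For readers unfamiliar with dynamic ensemble selection, the following minimal Python sketch illustrates the general idea of a competence region. It is not the DES-RE algorithm from the dissertation. It assumes a generic scheme in which the competence region is the set of k nearest validation points, a base learner is kept when its accuracy inside that region meets a threshold, and the kept learners vote. The function names, the `min_local_acc` threshold, the optional per-feature `weights` (a simple stand-in for a learned distance metric), and the toy data are all hypothetical.

```python
import math
from collections import Counter


def knn_region(X_val, query, k, weights=None):
    """Return indices of the k validation points closest to the query.

    `weights` are optional per-feature weights for the squared differences;
    they are an illustrative stand-in for a learned distance metric.
    """
    if weights is None:
        weights = [1.0] * len(query)

    def dist(x):
        return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, query)))

    order = sorted(range(len(X_val)), key=lambda i: dist(X_val[i]))
    return order[:k]


def des_predict(classifiers, X_val, y_val, query, k=3, min_local_acc=0.5, weights=None):
    """Classify `query` with the base learners that are accurate near it."""
    region = knn_region(X_val, query, k, weights)

    # Keep the learners whose accuracy inside the competence region is high enough.
    selected = []
    for clf in classifiers:
        correct = sum(1 for i in region if clf(X_val[i]) == y_val[i])
        if correct / len(region) >= min_local_acc:
            selected.append(clf)

    # Fall back to the full ensemble if no learner passes the threshold.
    if not selected:
        selected = list(classifiers)

    # Majority vote among the selected learners.
    votes = Counter(clf(query) for clf in selected)
    return votes.most_common(1)[0][0]


# Toy one-dimensional validation set: the label is 1 when the feature exceeds 0.5.
X_val = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
y_val = [0, 0, 0, 1, 1, 1]

# Hypothetical base learners, written as plain callables.
classifiers = [
    lambda x: 1 if x[0] > 0.5 else 0,  # accurate everywhere
    lambda x: 0,                       # accurate only in the low region
    lambda x: 1,                       # accurate only in the high region
]

high = des_predict(classifiers, X_val, y_val, [0.85])
low = des_predict(classifiers, X_val, y_val, [0.15])
```

In this toy setup, the constant learner that is wrong inside the query's neighborhood is excluded before the vote, which is the behavior that selection by local competence is meant to provide.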