Date of Award

11-27-2007

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Yan-Qing Zhang - Co-Chair

Second Advisor

Robert Harrison - Co-Chair

Third Advisor

Yichuan Zhao

Fourth Advisor

Rajshekhar Sunderraman

Abstract

The generalization abilities of machine learning algorithms often depend on the algorithms’ initialization, parameter settings, training sets, or feature selections. For instance, SVM classifier performance largely relies on whether the selected kernel functions are suitable for real application data. To enhance the performance of individual classifiers, this dissertation proposes classifier fusion models using computational intelligence knowledge to combine different classifiers. The first fusion model called T1FFSVM combines multiple SVM classifiers through constructing a fuzzy logic system. T1FFSVM can be improved by tuning the fuzzy membership functions of linguistic variables using genetic algorithms. The improved model is called GFFSVM. To better handle uncertainties existing in fuzzy MFs and in classification data, T1FFSVM can also be improved by applying type-2 fuzzy logic to construct a type-2 fuzzy classifier fusion model (T2FFSVM). T1FFSVM, GFFSVM, and T2FFSVM use accuracy as a classifier performance measure. AUC (the area under an ROC curve) is proved to be a better classifier performance metric. As a comparison study, AUC-based classifier fusion models are also proposed in the dissertation. The experiments on biomedical datasets demonstrate promising performance of the proposed classifier fusion models comparing with the individual composing classifiers. The proposed classifier fusion models also demonstrate better performance than many existing classifier fusion methods. The dissertation also studies one interesting phenomena in biology domain using machine learning and classifier fusion methods. That is, how protein structures and sequences are related each other. The experiments show that protein segments with similar structures also share similar sequences, which add new insights into the existing knowledge on the relation between protein sequences and structures: similar sequences share high structure similarity, but similar structures may not share high sequence similarity.

Share

COinS