Date of Award

8-3-2006

Degree Type

Closed Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Dr. Saeid Belkasim - Chair

Second Advisor

Dr. Yi Pan - Co-Chair

Third Advisor

Dr. Robert Harrison

Fourth Advisor

Dr. Phang C. Tai

Abstract

Predicting protein secondary structure from the primary sequence of amino acids is a challenging task that has been approached from several angles. Proteins have many different biological functions: they may act as enzymes, serve as building blocks (e.g., muscle fibers), or perform transport functions (e.g., transport of oxygen). The three-dimensional structure of a protein determines its functional properties. Considerable work has been done on this problem, and over the last 10 to 20 years the methods have gradually improved in accuracy. In this dissertation we investigate several techniques for predicting protein secondary structure. The prediction is carried out mainly using pattern classification techniques such as neural networks, genetic algorithms, and simulated annealing. Each individual algorithm may work well in certain situations but fail in others. The correct decisions of the individual methods can be exploited by making the methods collaborate to reach a unified consensus based on their previous performance. This process of combining classifiers is called decision fusion. This dissertation thoroughly explores decision fusion techniques, including the committee method, the correlation method, and Bayesian inference methods, for fusing the solutions from the various approaches to obtain better prediction accuracy. The RS126 data set was used for training and testing. Applying pattern classification algorithms together with decision fusion techniques improved prediction accuracy over neural networks or pattern classification algorithms used individually, and over pattern classification algorithms combined with neural networks. This research shows that decision fusion techniques can be used to obtain better protein secondary structure prediction accuracy.
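
As a rough illustration of the committee idea described above, the sketch below fuses per-residue predictions from several classifiers by a weighted vote. It is not the dissertation's implementation: the function name `fuse_by_weighted_vote`, the three-state labels (H for helix, E for strand, C for coil), the use of each classifier's past accuracy as its vote weight, and the alphabetical tie-break are all assumptions made for the example.

```python
from collections import defaultdict


def fuse_by_weighted_vote(predictions, weights):
    """Fuse per-residue label strings from several classifiers.

    predictions: list of equal-length strings, one per classifier,
                 using assumed labels H (helix), E (strand), C (coil).
    weights:     one non-negative weight per classifier, e.g. its
                 accuracy on earlier data (an assumption of this sketch).
    """
    if not predictions or len(predictions) != len(weights):
        raise ValueError("need one weight per prediction string")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all prediction strings must have the same length")

    fused = []
    for i in range(length):
        # Sum the weights of the classifiers voting for each label.
        scores = defaultdict(float)
        for pred, weight in zip(predictions, weights):
            scores[pred[i]] += weight
        # Pick the highest-scoring label; ties go to the label that
        # sorts first alphabetically (an arbitrary choice).
        fused.append(max(sorted(scores), key=scores.get))
    return "".join(fused)


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers for a six-residue segment.
    preds = ["HHHEEC", "HHCEEC", "CHHECC"]
    past_accuracy = [0.70, 0.65, 0.60]
    print(fuse_by_weighted_vote(preds, past_accuracy))
```

With equal weights this reduces to a plain majority vote; unequal weights let a classifier with a stronger track record outvote two weaker ones that agree with each other.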
