Date of Award

8-6-2007

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Guantao Chen - Chair

Second Advisor

Yi Pan - Co-Chair

Third Advisor

Rajshekhar Sunderraman

Fourth Advisor

Jenny J. Yang

Abstract

Most questions in proteomics require complex answers. Yet graph theory, supervised learning, and statistical models have decomposed complex questions into simple questions with simple answers. Experts in the field of protein study often address tasks that demand answers as complex as the questions. Such complex answers may consist of multiple factors that must be weighed against each other to arrive at a globally satisfactory and consistent solution. In the prediction of calcium binding in proteins, we construct a global oxygen contact graph of a protein, apply a graph algorithm to find oxygen clusters of fixed size four, and then employ a geometric algorithm to judge whether each oxygen cluster is a calcium-binding site. We can additionally predict the locations of those sites. Furthermore, we construct a global oxygen contact graph of a protein that includes oxygen-bonded carbon atoms, apply a graph algorithm to find the locally largest oxygen clusters, and design another geometric filter to exclude non-calcium-binding oxygen clusters. We also apply observed chemical properties as a chemical filter to recognize some non-calcium-binding oxygen clusters. To explore the characteristics of calcium-binding sites in proteins, we conduct a statistical survey of the geometric parameters and chemical properties of calcium-binding sites on four datasets derived from 1994 to 2005. In the prediction of disulfide bond connectivity, we analyze protein sequences to predict the folding of proteins relative to the cysteines using nearest-neighbor methods. We extend a new pattern-wise method to all available template proteins and find global patterns of cysteine pairing with a new descriptor, the cysteine separation profile on protein secondary structure.
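
The first calcium-binding pipeline described in the abstract (oxygen contact graph, then oxygen clusters of size four) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the 6.0 angstrom contact cutoff, and the use of a cluster centroid as a rough site location are all assumptions made here for illustration, and the geometric and chemical filters are not reproduced.

```python
from itertools import combinations
from math import dist


def contact_graph(oxygens, cutoff=6.0):
    """Build adjacency sets over oxygen coordinates.

    Two oxygens are treated as in contact when their distance is at most
    `cutoff` angstroms. The default cutoff is an assumed, illustrative value.
    """
    adj = {i: set() for i in range(len(oxygens))}
    for i, j in combinations(range(len(oxygens)), 2):
        if dist(oxygens[i], oxygens[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def four_cliques(adj):
    """Return every set of four mutually adjacent vertices as a sorted tuple."""
    found = []
    for a in sorted(adj):
        for b in sorted(n for n in adj[a] if n > a):
            common_ab = adj[a] & adj[b]
            for c in sorted(n for n in common_ab if n > b):
                # d must be adjacent to a, b, and c
                for d in sorted(n for n in common_ab & adj[c] if n > c):
                    found.append((a, b, c, d))
    return found


def centroid(points):
    """Mean position of a cluster; used here as an assumed location estimate."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))
```

With hypothetical coordinates, `four_cliques(contact_graph(coords))` lists the candidate clusters, and `centroid` applied to the four oxygens of a cluster gives a crude position that a later geometric filter could accept or reject.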
