Date of Award

Spring 5-7-2011

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Yanqing Zhang

Abstract

Computational linguistics is a sub-field of artificial intelligence; it is an interdisciplinary field dealing with statistical and/or rule-based modeling of natural language from a computational perspective. Traditionally, fuzzy logic is used to deal with fuzziness among single linguistic terms in documents. However, linguistic terms may be related to other types of uncertainty. For instance, when different users search for ‘cheap hotel’ in a search engine, they may need distinct pieces of relevant hidden information, such as shopping, transportation, and weather. Therefore, this research focuses on studying granular words and developing new algorithms that process them in order to deal with uncertainty globally. To describe granular words precisely, a new structure called the Granular Information Hyper Tree (GIHT) is constructed. Furthermore, several techniques are developed to support computing with granular words in spam filtering and query recommendation. Based on simulation results, the GIHT-Bayesian algorithm achieves more accurate spam filtering than the conventional Naive Bayesian and SVM methods; computing with granular words also generates better recommendation results, according to users’ assessments, when applied to a search engine.
