
Grid-Enabled Automatic Web Page Classification

Metikurke, Seema Sreenivasamurthy
Abstract

Much research has been conducted on the retrieval and classification of web-based information. A major challenge is performance, especially when a classification algorithm must return results for the large data sets typical of the Web. This thesis describes a grid-enabled approach to automatic web page classification. The basic approach, which uses a vector space model (VSM), is described first, followed by an enhancement of that approach through the use of a genetic algorithm (GA). The enhanced approach can efficiently process and classify candidate web pages from a number of web sites. A prototype is implemented and empirical studies are conducted. The contributions of this thesis are: 1) application of grid computing to improve the performance of web page classification based on both VSM alone and GA combined with VSM; 2) improvement of the VSM classification algorithm by applying a GA that uniquely discovers a set of training web pages while also generating a near-optimal set of parameter values for the VSM.
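
As a rough illustration of the vector space model named in the abstract, the following minimal sketch classifies a page by cosine similarity between term-frequency vectors. It is not the thesis's implementation: the function names, the sample training pages, and the `threshold` parameter are all hypothetical, and raw term counts are assumed in place of whatever term weighting the thesis uses.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to raw term frequencies (a simple bag-of-words vector)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(page_text, training_pages, threshold=0.2):
    """Return the label of the most similar training page, or None.

    `threshold` is a hypothetical cut-off, not a value from the thesis.
    """
    page_vec = term_vector(page_text)
    best_label, best_score = None, 0.0
    for label, text in training_pages:
        score = cosine_similarity(page_vec, term_vector(text))
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


# Hypothetical training pages, invented for illustration only.
training_pages = [
    ("sports", "football match score team league goal"),
    ("finance", "stock market shares price investor bank"),
]

print(classify("the team scored a late goal in the league match", training_pages))
```

In a setup like this, values such as the similarity cut-off and the choice of training pages are the kind of settings a genetic algorithm could search over, which is the role the abstract assigns to the GA.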

Date
2006-06-12
Keywords
Automatic Web Page Classification, Vector Space Model, Genetic Algorithm, Grid Computing
Citation
Metikurke, Seema Sreenivasamurthy (2006). "Grid-Enabled Automatic Web Page Classification." Thesis, Georgia State University. https://doi.org/10.57709/1059368
Embargo Lift Date
2012-01-25