Date of Award

12-4-2006

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Dr. Raj Sunderraman - Chair

Second Advisor

Dr. Ying Zhu

Third Advisor

Dr. Saeid Belkasim

Abstract

There is a lot of research work being performed on indexing the Web. More and more sophisticated Web crawlers are being designed to search and index the Web faster. But all these traditional crawlers crawl only the part of the Web we call the “Surface Web”. They are unable to crawl the hidden portion of the Web. These traditional crawlers retrieve content only from surface Web pages, which are just a set of Web pages linked by hyperlinks, and ignore the hidden information. Hence, they ignore a tremendous amount of information hidden behind the search forms in Web pages. Most of the published research has been done to detect such searchable forms and make a systematic search over these forms. Our approach here will be based on a Web crawler that analyzes search forms and fills them with appropriate content to retrieve maximum relevant information from the database.

DOI

https://doi.org/10.57709/1059377

Recommended Citation

Pandya, Milan, "A Domain Based Approach to Crawl the Hidden Web." Thesis, Georgia State University, 2006.
doi: https://doi.org/10.57709/1059377

Included in

Computer Sciences Commons

Computer Science Theses

A Domain Based Approach to Crawl the Hidden Web
