Date of Award

5-2-2007

Degree Type

Closed Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Rajshekhar Sunderraman - Chair

Second Advisor

Paul S. Katz

Third Advisor

Yanqing Zhang

Fourth Advisor

Ying Zhu

Abstract

Traditional data management technologies, which originated in the business domain, now face many challenges from other domains such as scientific research. Data structures in databases are becoming increasingly complex, and data query functions are moving from the back-end database level toward the front-end user-interface level. Traditional query languages such as SQL and OQL, as well as form-based query interfaces, cannot fully meet today's needs. This research is motivated by data management issues in life science applications. I propose a methodology for domain-specific conceptual data modeling and querying. The methodology can be applied to any domain to capture more domain semantics and to empower end-users to formulate queries at the conceptual level using terminology and functions familiar to them. The resulting query system is designed to work on all major types of database management systems (DBMS) and to let end-users dynamically define and add new domain-specific functions. That is, user-defined functions can be pre-defined by domain experts or data model creators at system creation time, or defined dynamically by end-users from the client side at any time. The methodology comprises a domain-specific conceptual data model (DSC-DM) and a domain-specific conceptual query language (DSC-QL). DSC-QL uses only the abstract concepts, relationships, and functions defined in DSC-DM. It is a user-oriented, high-level query language intentionally designed to be flexible, extensible, and readily usable. DSC-QL queries are much simpler than the corresponding SQL or OQL queries because of advanced features such as user-defined functions, composite and set attributes, dot-path expressions, and super-classes. DSC-QL can be translated into SQL and OQL through a dynamic mapping function, and can be automatically updated when the underlying database schema evolves.
The operational and declarative semantics of DSC-QL are formally defined in terms of graphs. A normal form for DSC-QL is also defined, serving as a standard format for mapping flexible conceptual expressions to the more restricted SQL or OQL statements. Two translation algorithms, from normalized DSC-QL to SQL and to OQL, are introduced. A comparison shows that DSC-QL strikes a very good balance between simplicity and expressive power and is suitable for end-users. Implementation details of the query system are reported as well. Two prototypes have been built: one for the neuroscience domain, built on an object-oriented DBMS, and one for the traditional business domain, built on a relational DBMS.
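
To make the idea of translating a dot-path expression into SQL more concrete, the following is a minimal illustrative sketch in Python. It is not the dissertation's algorithm or DSC-QL syntax: the `MAPPING` dictionary, the `Employee` and `Department` classes, the table and column names, and the `translate` function are all hypothetical, chosen only to show how a conceptual path such as `department.city` might expand into joins through a schema mapping.

```python
# Hypothetical conceptual-to-relational mapping (an assumption for illustration,
# not the mapping function defined in the dissertation).
MAPPING = {
    "Employee": {
        "table": "employee",
        "attrs": {"name": "name"},
        # relationship name -> (target class, foreign key column, target key column)
        "rels": {"department": ("Department", "dept_id", "id")},
    },
    "Department": {
        "table": "department",
        "attrs": {"name": "name", "city": "city"},
        "rels": {},
    },
}


def translate(root, path):
    """Expand a dot-path rooted at a conceptual class into a SQL SELECT."""
    steps = path.split(".")
    cls, alias = root, "t0"
    sql_from = f"{MAPPING[cls]['table']} AS {alias}"
    joins = []
    # Every step except the last follows a relationship and adds one join.
    for i, step in enumerate(steps[:-1], start=1):
        target, fk, pk = MAPPING[cls]["rels"][step]
        new_alias = f"t{i}"
        joins.append(
            f"JOIN {MAPPING[target]['table']} AS {new_alias} "
            f"ON {alias}.{fk} = {new_alias}.{pk}"
        )
        cls, alias = target, new_alias
    # The last step names a plain attribute of the class reached so far.
    column = MAPPING[cls]["attrs"][steps[-1]]
    return " ".join([f"SELECT {alias}.{column} FROM {sql_from}"] + joins)


print(translate("Employee", "department.city"))
```

In this sketch the end-user writes only the conceptual path, and the joins are derived from the mapping. If the underlying schema changed, only the hypothetical `MAPPING` entries would need to change, not the conceptual expression.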
