Constraints and triggers to enhance XML-based data integration systems

by Jing Lu




Institution: University of Stuttgart
Department: Fakultät Informatik, Elektrotechnik und Informationstechnik
Degree: PhD
Year: 2009
Record ID: 1109916
Full text PDF: http://elib.uni-stuttgart.de/opus/volltexte/2009/4026/


Abstract

XML is becoming one of the main technological ingredients of the Internet and is now accepted as the standard for information exchange. XML-based data integration systems, which enable sharing of and cooperation with legacy data sources, are becoming increasingly important data service providers on the web. These services give users a uniform interface to a multitude of data sources such as relational databases, XML files, text files, delimited files and Excel files. Users can thus focus on what they want rather than on how to obtain the answers: they no longer have to carry out tedious tasks such as finding the relevant data sources, interacting with each data source in isolation through its local interface, and combining data from multiple data sources. Users also expect better query performance and data consistency from data integration systems.

This work proposes an approach to supporting constraints and triggers in an XML-based data integration system in order to optimize queries and enforce data consistency. Constraints and triggers have long been recognized as useful for semantic query optimization and data consistency enforcement in relational databases.

The work first presents an approach that uses constraints from the heterogeneous data sources to semantically optimize queries submitted to the XML-based data integration system. The different constraints from the data sources are first integrated into a uniform constraint model, and the constraints in that model are then stored in a constraint repository. Traditional semantic query optimization techniques from relational databases are analyzed, and three of them are reused and applied by the semantic query optimizer of the XML-based data integration system: detection of empty results, join elimination and predicate elimination. Performance is analyzed according to data source type and data volume.
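
The three reused techniques can be illustrated with a minimal sketch. It is not the thesis's optimizer: the constraint classes, the `Query` shape, and the `optimize` function are hypothetical names chosen for illustration, and the sketch assumes that a join between a child and its parent source is made on a non-null foreign key.

```python
from dataclasses import dataclass

# Hypothetical uniform constraint model (illustrative only).
@dataclass(frozen=True)
class RangeConstraint:
    field: str    # e.g. "order.amount"
    low: float    # every stored value satisfies low <= value <= high
    high: float

@dataclass(frozen=True)
class ForeignKey:
    child: str    # every child row references exactly one parent row
    parent: str

@dataclass(frozen=True)
class Predicate:
    field: str
    op: str       # ">" or "<"
    value: float

@dataclass
class Query:
    sources: list        # names of the joined sources
    predicates: list     # Predicate objects, implicitly AND-ed
    output_fields: list  # qualified fields, e.g. "order.amount"

def always_true(p, c):
    # The constraint alone already guarantees the predicate.
    return c.low > p.value if p.op == ">" else c.high < p.value

def always_false(p, c):
    # The constraint contradicts the predicate.
    return c.high <= p.value if p.op == ">" else c.low >= p.value

def optimize(query, ranges, fks):
    """Return None if the result is provably empty, else a simplified Query."""
    by_field = {c.field: c for c in ranges}
    kept = []
    for p in query.predicates:
        c = by_field.get(p.field)
        if c is not None and always_false(p, c):
            return None          # detection of empty results
        if c is not None and always_true(p, c):
            continue             # predicate elimination
        kept.append(p)
    sources = list(query.sources)
    referenced = list(query.output_fields) + [p.field for p in kept]
    for fk in fks:
        if fk.child in sources and fk.parent in sources:
            uses_parent = any(f.startswith(fk.parent + ".") for f in referenced)
            if not uses_parent:
                sources.remove(fk.parent)   # join elimination
    return Query(sources, kept, list(query.output_fields))

ranges = [RangeConstraint("order.amount", 0, 1000)]
fks = [ForeignKey("order", "customer")]

empty = optimize(
    Query(["order", "customer"], [Predicate("order.amount", ">", 5000)], ["order.amount"]),
    ranges, fks)

simplified = optimize(
    Query(["order", "customer"],
          [Predicate("order.amount", "<", 2000), Predicate("order.amount", ">", 100)],
          ["order.amount"]),
    ranges, fks)
```

In this sketch the first query is rejected without contacting any data source, because the range constraint rules out amounts above 1000. In the second, the redundant `< 2000` predicate and the unused `customer` join are dropped before execution.
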
The semantic query optimizer works best when the data sources are non-relational, the data volume is very large and the execution cost is expected to be high.

In order to equip the XML-based data integration system with full data manipulation capabilities, programming frameworks that support updates at the integration level are being developed. This work discusses how to realize updates in the XML-based data integration system under the Service Data Objects programming framework. When users are permitted to submit updates, it is necessary to guarantee data integrity and enforce active business logic in the data integration system. This work presents an approach in which active rules, including integrity constraints, are enforced by XQuery triggers. An XQuery trigger model conforming to the XQuery update model proposed by the W3C is defined, and the definition of active rules and integrity constraints by means of XQuery triggers is discussed. Triggers and constraints are stored in the trigger repository. The architecture supporting XQuery trigger service in the XML-based data integration system is…