Abstract
The goal of a data integration system is to allow users to query diverse information sources through a schema that is familiar to them. However, different users may prefer different schemas, and the data may be stored in data sources that use still other schemas. To integrate data, mapping rules must be defined that map entities of the data sources to entities of the users' schemas. In large information systems with many data sources that serve sophisticated applications, there can be many such mapping rules, and they can be complex. The purpose of this paper is to study the performance of alternative query processing techniques for data integration systems with many complex mapping rules. A new approach, mapping data to queries (MDQ), is presented. Extensive performance experiments show that this approach performs well for complex mapping rules and queries, and scales significantly better with the number of rules than the state of the art, which is based on query rewrite. In fact, its performance is close to that of an ideal system in which a single schema is used by all sources and queries.
Persistent link
https://doi.org/10.3929/ethz-a-006835897
Publication status
published
Journal / series
Technical report / [ETH, Department of Computer Science
Volume
Publisher
Swiss Federal Institute of Technology
Subject
INFORMATION STORAGE + INFORMATION RETRIEVAL (INFORMATION SYSTEMS); SPECIAL PROGRAMMING METHODS; QUERIES (INFORMATION SYSTEMS)
Organisational unit
02150 - Dep. Informatik / Dep. of Computer Science
ETH Bibliography
yes