On Declarative Data-Parallel Computation: Models, Languages and Semantics

If we analyze the plethora of large-scale data-processing tools available nowadays, we can recognize two main approaches: Although there has been some work trying to bring together the two worlds, these works focus mainly on exporting languages and interfaces, i.

We advocate that, instead, a declarative imperative approach should be attempted: The goal of this thesis is then to carry out a first step in this direction. More concretely, we developed a new synchronous computational model for relational, distributed, parallel data processing, building on previous work on relational transducers and transducer networks.

In this way, we may add sharable semantics to legacy data sources. Moreover, annotated labels are a powerful means of discovering lexical relationships among structured and semi-structured data sources.

Original methods to automatically normalize schema labels and extract lexical relationships have been developed, and their effectiveness for automatic schema matching has been shown.
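As a rough illustration of what normalizing a schema label can involve, the sketch below splits a label into lowercase tokens and expands known abbreviations. It is not the method developed in the thesis: the function name `normalize_label` and the `ABBREVIATIONS` table are hypothetical, chosen only for this example.

```python
import re

# Hypothetical abbreviation table (an assumption for this sketch);
# a real system would rely on a curated dictionary or lexical resource.
ABBREVIATIONS = {
    "qty": "quantity",
    "no": "number",
    "addr": "address",
    "cust": "customer",
}


def normalize_label(label):
    """Split a schema label into lowercase tokens and expand abbreviations."""
    tokens = []
    # First split on explicit separators: underscores, hyphens, whitespace.
    for part in re.split(r"[_\-\s]+", label):
        # Then split camelCase / PascalCase runs and digit groups.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    # Lowercase every token and replace it if it is a known abbreviation.
    return [ABBREVIATIONS.get(t.lower(), t.lower()) for t in tokens]


print(normalize_label("custAddr"))
print(normalize_label("ORDER_QTY"))
```

Once labels are reduced to plain words in this way, they can be looked up in a lexical resource, which is the step where relationships such as synonymy between labels of different schemas could be detected.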

Query Processing and Data Quality.

First, this thesis proposed new techniques for optimizing the full outerjoin operation, which data integration systems use for data fusion.
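To make the role of the full outerjoin in data fusion concrete, here is a minimal, unoptimized sketch over lists of dictionaries. It does not reproduce the optimization techniques of the thesis. The function `full_outer_join` and the fusion policy (keep the left value, fall back to the right value when the left one is missing) are assumptions made for illustration.

```python
def full_outer_join(left, right, key):
    """Full outer join of two lists of dicts on `key`, fusing matched rows."""
    # Collect every attribute seen in either source, preserving first-seen order.
    attrs = []
    for row in left + right:
        for a in row:
            if a not in attrs:
                attrs.append(a)

    # Index the right-hand rows by join key.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    result = []
    matched = set()
    for lrow in left:
        partners = right_by_key.get(lrow[key])
        if partners:
            matched.add(lrow[key])
            for rrow in partners:
                # Assumed fusion policy: prefer the left value, else the right.
                result.append({
                    a: lrow.get(a) if lrow.get(a) is not None else rrow.get(a)
                    for a in attrs
                })
        else:
            # Left row with no partner: pad missing attributes with None.
            result.append({a: lrow.get(a) for a in attrs})

    # Right rows whose key never matched: also padded with None.
    for k, rows in right_by_key.items():
        if k not in matched:
            result.extend({a: r.get(a) for a in attrs} for r in rows)
    return result


left = [{"id": 1, "name": "Ann", "city": None},
        {"id": 2, "name": "Bob", "city": "Rome"}]
right = [{"id": 1, "city": "Milan", "phone": "555"},
         {"id": 3, "city": "Turin", "phone": "777"}]
for row in full_outer_join(left, right, "id"):
    print(row)
```

The point of using a full outerjoin rather than an inner join is that no object is lost: entities known to only one source still appear in the fused answer, with the unknown attributes left empty.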

Then this thesis demonstrated how to achieve Quality-Driven Query Processing, where quality constraints specified in Data Quality Aware Queries are used to perform query optimization.

Architectures and Applications to Real Domains

This thesis focuses on Semantic Data Integration Systems, with particular attention to mediator system approaches, to perform data and service integration. One of the topics of this thesis is the application of MOMIS to the bioinformatics domain, integrating different public databases to create an ontology of molecular and phenotypic cereal data.

However, the main contribution of this thesis is a semantic approach to aggregated search over data and services. In particular, I describe a technique that, on the basis of an ontological representation of the data and services related to a domain, supports the translation of a data query into a service discovery process. The technique has also been implemented as a MOMIS extension.

This approach can be described as a Service as Data approach, as opposed to Data as a Service approaches. In the Service as Data approach, informative services are considered as a kind of source to be integrated with other data sources, to enhance the domain knowledge provided by a Global Schema of data.

Finally, new technologies and approaches for data integration have been investigated, in particular distributed architectures, with the objective of providing a scalable architecture for data integration. An integration framework for distributed environments is presented that allows a data integration process to be carried out in the cloud.

The thesis shows how lexical annotation is a crucial element in data integration. Thanks to lexical annotation, new relationships are discovered among the elements of a schema or among elements of different schemas. Several methods for automatically annotating data sources are described and evaluated in different scenarios.

Some experiments applying lexical annotation to the results of a matcher are presented. Finally, the probabilistic annotation approach is introduced and its application to dynamic integration processes is illustrated.

This thesis investigates the issue of Query Management in Data Integration Systems, taking into account several problems that have to be faced during the query processing phase. The achieved goals of the thesis have been the study, analysis and proposal of techniques for effectively querying Data Integration Systems.

The proposed techniques have been developed in the MOMIS Query Manager prototype to enable users to query an integrated schema and to provide them with a consistent and concise unified answer. A new kind of metadata that offers a synthesized view of an attribute's values, the relevant values, has been defined, and the effectiveness of such metadata for creating or refining a search query in a knowledge base is demonstrated by means of experimental results.

