Acquisition and Declarative Analytical Processing of Spatio-Temporal Observation Data

An ever-growing number of data acquisition devices observes more variables every day, generating vast amounts of data in almost every application domain. Environmental observation data is an essential part of this data, and its spatio-temporal nature poses interesting challenges for Environmental Observation Data Management Systems. Two features are common to all these systems: spatio-temporal observations and heterogeneity. In this Thesis, the Observations and Measurements (O&M) conceptual schema is adopted as the theoretical framework for defining the concept of observation. Heterogeneity specifically concerns the data acquisition part of these systems, which must access data produced by heterogeneous sensing devices that follow different software/hardware specifications and are accessed through several communication protocols.
A major challenge is to provide the flexibility required to enable data acquisition from heterogeneous sensing devices and data dissemination through heterogeneous end-user applications. The system must provide simple and straightforward mechanisms for incorporating the following components: 1) new in-situ sensing devices, 2) new data dissemination services, and 3) different persistent data storage technologies. To manage observation data effectively, a system must provide the following general functionalities: 1) management of conventional Entity/Relationship data related to non-observed properties of entities, 2) management of sampled data over temporal, spatial (1D and 2D) and spatio-temporal domains, 3) support for observation data semantics, and 4) efficient implementation for large-scale shared-nothing distributed hardware architectures. Moreover, the INSPIRE Directive of the European Union encourages the creation of a Spatial Data Infrastructure (SDI) to ensure the interoperability of spatial information systems in Europe.
The application of INSPIRE in the Spanish legislative system obliges public administrations to make their geographic data available through SDI services. As a result, the newly enriched geographic knowledge enables many applications, in different areas of knowledge, that require spatial analysis capabilities. The main objective of this Thesis is therefore the design and implementation of a generic framework for spatio-temporal observation data acquisition and declarative analytical processing. This overall goal is divided into three independent specific objectives:
• Design and implementation of a generic observation data acquisition and dissemination server.
• Design of a framework for declarative spatio-temporal analysis in very large spatio-temporal data warehouses.
• Efficient implementation of spatio-temporal on-line analytical processing in large-scale distributed shared-nothing hardware architectures.
The main contributions of this Thesis may be summarized as follows:
• Generalization of a data acquisition and dissemination server, widely applicable in many scientific and industrial domains, that provides flexibility in the incorporation of different technologies for data acquisition, data persistence and data dissemination.
• Definition of a new hybrid logical-functional paradigm to formalize a novel data model for the integrated management of entity and sampled data.
• Definition of a novel spatio-temporal declarative data analysis language for the previous data model.

keywords: Spatio-temporal Observation Data, Declarative Analytical Processing, Integrated Entity-Raster Data Management, Distributed Column Oriented Processing