The project will create a digital platform that integrates, organizes, and makes available the data generated by the institution’s research groups. Its aim is to overcome current obstacles to the access, interoperability, and reuse of scientific information by providing a modern infrastructure for large-scale data storage, cataloguing, and analysis.
The data produced by ITV DS is currently scattered across different formats and repositories, such as spreadsheets, reports, scientific articles, and local databases. This fragmentation limits the potential for scientific discovery and for generating strategic knowledge. Inspired by the FAIR principles (Findable, Accessible, Interoperable, Reusable), the project proposes an institutional data lake that ingests raw data in its native formats and structures it later, according to analytical needs.
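As an illustration of ingesting raw data in its native format, the following minimal Python sketch copies a file into a raw zone unchanged and records basic descriptive metadata beside it. The function name `ingest_raw`, the `.meta.json` sidecar convention, and the metadata fields are assumptions made for this example only; the project does not specify them.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def ingest_raw(source: Path, raw_zone: Path, research_group: str) -> Path:
    """Copy a file into the raw zone unchanged and write a JSON sidecar.

    Hypothetical helper: the sidecar layout is an assumption of this sketch.
    """
    raw_zone.mkdir(parents=True, exist_ok=True)
    target = raw_zone / source.name
    shutil.copy2(source, target)  # bytes are preserved; nothing is parsed or converted

    sidecar = {
        "original_name": source.name,
        "format": source.suffix.lstrip(".").lower() or "unknown",
        "research_group": research_group,
        # The checksum lets later stages verify that the raw file is intact.
        "sha256": hashlib.sha256(target.read_bytes()).hexdigest(),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    sidecar_path = raw_zone / (source.name + ".meta.json")
    sidecar_path.write_text(json.dumps(sidecar, indent=2), encoding="utf-8")
    return target
```

Because the file is stored as received, any structure (columns, units, coordinate systems) can be applied later by whichever analysis needs it.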
The platform will integrate heterogeneous data, such as information on land use and land cover, climate, biodiversity, genetics, geology, hydrology, and socioeconomics, with geolocation as the common axis. It will also support metadata annotation, dynamic updates, and controlled access, enabling advanced analyses based on artificial intelligence, predictive modeling, and interactive visualization. The DataLakeDS architecture will be organized into data zones, with specific stages for quality assessment, transformation, governance, usefulness, and exploitation, following reference models established in the scientific literature.
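The two ideas in this paragraph, geolocation as the common axis and stage-based zoning, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` class, the grid-cell join key, the 0.1-degree cell size, and the treatment of the stages as a strict linear sequence are all hypothetical and are not taken from the DataLakeDS design.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from math import floor

# Stage names come from the text; their strict linear ordering is an
# assumption of this sketch.
STAGES = [
    "quality_assessment",
    "transformation",
    "governance",
    "usefulness",
    "exploitation",
]


@dataclass
class Record:
    theme: str      # e.g. "climate", "biodiversity"
    lat: float      # decimal degrees
    lon: float      # decimal degrees
    values: dict
    metadata: dict = field(default_factory=dict)
    stage: str = STAGES[0]


def grid_cell(lat: float, lon: float, size_deg: float = 0.1) -> tuple:
    """Map a coordinate to a grid-cell index (hypothetical join key)."""
    return (floor(lat / size_deg), floor(lon / size_deg))


def promote(record: Record) -> Record:
    """Move a record to the next stage; the last stage is terminal."""
    i = STAGES.index(record.stage)
    if i + 1 < len(STAGES):
        record.stage = STAGES[i + 1]
    return record


def group_by_location(records, size_deg: float = 0.1) -> dict:
    """Collect values from different themes that fall in the same cell."""
    cells = defaultdict(dict)
    for r in records:
        cells[grid_cell(r.lat, r.lon, size_deg)][r.theme] = r.values
    return dict(cells)
```

With this layout, a climate measurement and a biodiversity observation taken close together end up under the same cell key and can be analysed jointly, even though they came from different sources and formats.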








