A Versatile Data-Intensive Computing Platform for Information Retrieval from Big Geospatial Data
The increasing amount of free and open geospatial data of interest to major societal questions calls for the development of innovative data-intensive computing platforms for the efficient and effective extraction of information from these data. This paper proposes a versatile petabyte-scale platform based on commodity hardware and equipped with open-source software for the operating system, the distributed file system, and the task scheduler for batch processing as well as the containerization of user-specific applications. Interactive visualization and processing based on deferred processing are also proposed. The versatility of the proposed platform is illustrated with a series of applications together with their performance metrics.
SOILLE Pierre;
BURGER Armin;
DE MARCHI Davide;
KEMPENEERS Pieter;
RODRIGUEZ ASERETTO Roque Dario;
SYRRIS Vasileios;
VASILEV Veselin;
2018-01-12
ELSEVIER SCIENCE BV
JRC105787
0167-739X
https://www.sciencedirect.com/science/article/pii/S0167739X1730078X
https://publications.jrc.ec.europa.eu/repository/handle/JRC105787
10.1016/j.future.2017.11.007