The Atlas of Data Science Research
The origin and evolution of Data Science (DS) have been a subject of ongoing debate, with perspectives varying across disciplines. Understanding the development of this field requires a data-driven approach that systematically analyzes the scientific literature and provides a practical method for its exploration. In this paper, we present the “Atlas of Data Science Research” (DS-Atlas), an interactive visualization tool designed to study the landscape of the DS field. The DS-Atlas is built on a dataset of approximately 1.3 million scientific publications from the Elsevier Scopus database and leverages Natural Language Processing, Large Language Models, and dimensionality reduction techniques to generate a semantic representation of DS research. The DS-Atlas provides interactive operations for exploring the dataset: users can focus on specific areas, filter by keywords and/or time periods, and uncover thematic connections and research trends. Concrete tasks that the DS-Atlas can address are discussed to show how the proposed solution supports scholars in the data-driven analysis of the data science literature. As a further contribution, the paper presents an analysis of the Data Science discipline in terms of the geographical distribution of influential authors, institutions, and journals in the field. The DS-Atlas is publicly available online for exploration and testing.
PICASCIA Sergio;
MONTANELLI Stefano;
SALINI Silvia;
VERZILLO Stefano;
2025-10-17
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
JRC140960
2169-3536 (online)
https://ieeexplore.ieee.org/document/11194242
https://publications.jrc.ec.europa.eu/repository/handle/JRC140960
10.1109/ACCESS.2025.3618442 (online)