AI is an area of strategic importance with potential to be a key driver of economic development and with a wide range of potential social implications. In order to assess present and future impact, there is a need to analyse what AI can (and will) achieve. But what is AI capable of? This question is as crucial as it is elusive, as AI is progressing in ways that are open-ended about the techniques and resources AI can operate with. The truth is that whenever a task is solved, researchers find it increasingly challenging to extrapolate whether the result can be reproduced, even when only a few things change: the data, the domain knowledge, the level of uncertainty, the (hyper)parameters, the techniques, the team, the compute, etc. In the end, we would like to infer whether a good result (or a breakthrough) in task A transfers to a similarly good result in task B. This extrapolation is precisely what the notion of capability, borrowed from psychology, tries to answer. However, we lack the tools, and the data, to do the same in AI. Benchmarks, competitions and challenges are behind much of the recent progress in AI, especially in machine learning (ML) [10], but the dynamics of rushing breakthroughs at the expense of massive data, compute, specialisation, etc., have led to a more complex AI landscape, in terms of what can be achieved and how. As a result, policy makers and other stakeholders have no way of assessing what AI systems can do today and in the future. This does not mean that we must disregard or understate the valuable information that is provided by a plethora of benchmarks. On the contrary, the analysis of the progress of AI must be based on data-grounded evidence, relying on finding and testing hypotheses through the computational analysis of large amounts of shared data [6], using open data science tools [11]. But this analysis must be abstracted from tasks to capabilities, for the purposes of integration and evaluation [8]. In this paper, we identify a series of problems in tracking and understanding what AI is capable of, surveying some previous initiatives.
We present the AIcollaboratory, a data-driven framework to collect and explore data about AI results, progress and ultimately capabilities, which is being developed in the context of AI WATCH, the European Commission (EC) knowledge service to monitor the development, uptake and impact of AI in Europe. We close the paper with some challenges for the community emerging around the collaboratory.
MARTINEZ PLUMED Fernando;
HERNÁNDEZ-ORALLO José;
GOMEZ GUTIERREZ Emilia;
2020-11-27
I O S PRESS
JRC121970
0922-6389 (online),
http://ebooks.iospress.nl/volumearticle/55246,
https://publications.jrc.ec.europa.eu/repository/handle/JRC121970,
10.3233/FAIA200451 (online),