Predictable Artificial Intelligence

📅 2023-10-09
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
Current and future AI systems suffer from insufficient predictability, which undermines trust, obscures accountability, increases loss-of-control risks, and raises safety concerns. Method: This work introduces "Predictable AI" as a paradigm that prioritizes predictability over raw performance, establishing it as a foundational prerequisite for trustworthiness, controllability, alignment, and safety. We formally define AI predictability, characterising its core components, trade-off mechanisms, and the predictability–effectiveness boundary. We specify prediction targets, candidate predictors, and evaluation dimensions, thereby delineating a research direction distinct from conventional AI evaluation frameworks. Through formal modeling, conceptual analysis, and cross-scenario reasoning, we construct a multi-dimensional trade-off framework and a foundational theoretical system. Contribution: The work provides theoretical foundations and practical guidance for developing AI systems that are both predictable and effective, clarifying technical pathways and interdisciplinary connections.
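To make the summary's notion of a "candidate predictor" concrete, here is a minimal toy sketch (ours, not from the paper): a hypothetical `assessor` predicts the per-instance success probability of a simulated system, and its predictability is scored with the Brier score, a standard metric for probabilistic forecasts. The `system_succeeds` and `assessor` functions and the single difficulty feature are illustrative assumptions.

```python
import random

random.seed(0)

# Toy "AI system": succeeds on an instance with probability that
# falls as a single difficulty feature rises (hypothetical setup).
def system_succeeds(difficulty: float) -> bool:
    return random.random() < max(0.0, 1.0 - difficulty)

# Candidate predictor (an "assessor"): estimates the system's
# success probability from the same difficulty feature.
def assessor(difficulty: float) -> float:
    return max(0.0, 1.0 - difficulty)

instances = [random.random() for _ in range(1000)]
outcomes = [1 if system_succeeds(d) else 0 for d in instances]
predictions = [assessor(d) for d in instances]

# Brier score: mean squared error between the predicted success
# probability and the observed 0/1 outcome. Lower means the system's
# behaviour is more predictable by this assessor.
brier = sum((p - y) ** 2 for p, y in zip(predictions, outcomes)) / len(outcomes)
print(f"Brier score: {brier:.3f}")
```

A well-calibrated assessor like this one still has irreducible Brier score from genuine outcome randomness; comparing assessors by such scores is one way to operationalise the predictability of a system, separate from how well the system itself performs.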
📝 Abstract
We introduce the fundamental ideas and challenges of Predictable AI, a nascent research area that explores the ways in which we can anticipate key validity indicators (e.g., performance, safety) of present and future AI ecosystems. We argue that achieving predictability is crucial for fostering trust, liability, control, alignment and safety of AI ecosystems, and thus should be prioritised over performance. We formally characterise predictability, explore its most relevant components, illustrate what can be predicted, describe alternative candidates for predictors, as well as the trade-offs between maximising validity and predictability. To illustrate these concepts, we present an array of examples covering diverse ecosystem configurations. Predictable AI is related to other areas of technical and non-technical AI research, but has distinctive questions, hypotheses, techniques and challenges. This paper aims to elucidate them, calls for identifying paths towards a landscape of predictably valid AI systems and outlines the potential impact of this emergent field.
Problem

Research questions and friction points the paper addresses.

AI predictability
human trust
AI safety
Innovation

Methods, ideas, or system contributions that make the work stand out.

Predictable AI
Trust and Safety
Predictive Accuracy
Lexin Zhou
Department of Computer Science and Technology, University of Cambridge; Valencian Research Institute of Artificial Intelligence, Universitat Politècnica de València
Pablo A. Moreno-Casares
FAR.ai
Fernando Martínez-Plumed
VRAIN, Valencian Research Institute for Artificial Intelligence, Universitat Politècnica de València
Artificial Intelligence, Machine Learning, AI evaluation, Item Response Theory
John Burden
University of Cambridge
Reinforcement Learning, Artificial Intelligence, Long-term AI Safety, AI Evaluation
Ryan Burnell
Google DeepMind
Artificial intelligence, AI evaluation, Experimental Psychology
Lucy G. Cheke
Valencian Research Institute of Artificial Intelligence, Universitat Politècnica de València; Department of Psychology, University of Cambridge
Cesar Ferri
Valencian Research Institute of Artificial Intelligence, Universitat Politècnica de València
Alexandru Marcoci
Centre for the Study of Existential Risk, University of Cambridge
Behzad Mehrbakhsh
Valencian Research Institute of Artificial Intelligence, Universitat Politècnica de València
Yael Moros-Daval
Valencian Research Institute of Artificial Intelligence, Universitat Politècnica de València
Seán Ó hÉigeartaigh
Leverhulme Centre for the Future of Intelligence, University of Cambridge; Centre for the Study of Existential Risk, University of Cambridge
Danaja Rutar
University of Cambridge
Learning, development, representations, common sense in machines
Wout Schellaert
Valencian Research Institute of Artificial Intelligence, Universitat Politècnica de València
Konstantinos Voudouris
Postdoctoral Research Scientist, Helmholtz Munich
AI Evaluation, Cognitive Science, Philosophy of Science, Linguistics
José Hernández-Orallo
University of Cambridge, VRAIN-UPV
Artificial Intelligence, Data Science, Intelligence, AI Evaluation, AI Safety