🤖 AI Summary
This study addresses the challenge of deploying AI for socially impactful decision-making under small-data conditions, for example where marginalized populations are under-represented in the data or where the health benefits of wearable technologies are being assessed. It contrasts small data with big data and describes how knowledge-driven modelling from statistics can be combined with data-driven modelling from computer science, framing this combination as an interdisciplinary approach to small-data AI applications. Drawing on case studies in policy design and digital health, it delineates what current small-data methods can already achieve and identifies critical bottlenecks, including data scarcity, model interpretability, and fairness constraints. The overview suggests that embedding prior knowledge can improve generalization in low-sample regimes. The work also outlines a forward-looking research agenda centered on model interpretability, human-AI collaboration mechanisms, and fairness-aware evaluation protocols. Taken together, it links application settings, conceptual contributions, and specific techniques to offer guidance for deploying trustworthy AI in data-scarce societal contexts.
📝 Abstract
The emergence of breakthrough artificial intelligence (AI) techniques has led to a renewed focus on how small data settings, i.e., settings with limited information, can benefit from such developments. This includes societal issues such as how best to include under-represented groups in data-driven policy and decision making, or the health benefits of assistive technologies such as wearables. We provide a conceptual overview, in particular contrasting small data with big data, and identify common themes from exemplary case studies and application areas. Potential solutions are described in a more detailed technical overview of current data analysis and modelling techniques, highlighting contributions from different disciplines, such as knowledge-driven modelling from statistics and data-driven modelling from computer science. By linking application settings, conceptual contributions and specific techniques, we highlight what is already feasible and suggest what an agenda for fully leveraging small data might look like.