🤖 AI Summary
Existing AI safety benchmarks largely neglect fundamental human preferences grounded in biology (such as homeostatic regulation) and in economics (such as scarcity, diminishing marginal returns, and shared resource constraints), leaving structural gaps in alignment evaluation. To address this, we propose the first multi-objective, multi-agent AI safety benchmark framework that integrates principles from both biological homeostasis and economic rationality. Built upon PettingZoo, it comprises eight novel simulation environments that systematically incorporate homeostatic regulation, shared-resource constraints, safety-aware penalties, and Pareto-optimal multi-objective trade-offs. Empirical evaluation exposes prevalent failure modes, including unbounded optimization, objective misgeneralization, safety neglect, and the “tragedy of the commons”, and yields a reproducible, quantitative diagnostic test suite. This work bridges a critical gap in AI safety evaluation by formally modeling foundational biological and economic constraints.
📝 Abstract
Developing safe, aligned agentic AI systems requires comprehensive empirical testing, yet many existing benchmarks neglect crucial themes from biology and economics, two time-tested fundamental sciences that describe our needs and preferences. To address this gap, the present work introduces a set of multi-objective, multi-agent alignment benchmarks built around biologically and economically motivated themes that have been neglected in current mainstream discussions on AI safety: homeostasis for bounded and biological objectives; diminishing returns for unbounded, instrumental, and business objectives; the sustainability principle; and resource sharing. We implemented eight main benchmark environments on these themes to illustrate key pitfalls and challenges in agentic AI systems, such as unboundedly maximizing a homeostatic objective, over-optimizing one objective at the expense of others, neglecting safety constraints, or depleting shared resources.
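To make the distinction between a homeostatic objective and a diminishing-returns objective concrete, here is a minimal Python sketch. It is not taken from the benchmark code: the function names, the quadratic penalty, the logarithmic utility, and the default setpoint and tolerance are all illustrative assumptions.

```python
import math


def homeostatic_reward(level: float, setpoint: float = 50.0, tolerance: float = 10.0) -> float:
    """Hypothetical bounded objective: best at the setpoint, worse on either side.

    An agent that keeps maximizing `level` past the setpoint is penalized,
    which is the "unboundedly maximizing a homeostatic objective" pitfall.
    """
    deviation = (level - setpoint) / tolerance
    return -(deviation ** 2)


def diminishing_returns_reward(amount: float) -> float:
    """Hypothetical unbounded objective with a concave (logarithmic) utility.

    Each additional unit is worth less than the previous one, so pouring
    everything into a single objective yields progressively smaller gains.
    """
    return math.log1p(max(amount, 0.0))


def combined_score(food_level: float, gold: float) -> float:
    """Illustrative multi-objective score: a plain sum of the two terms."""
    return homeostatic_reward(food_level) + diminishing_returns_reward(gold)


if __name__ == "__main__":
    # Overshooting the setpoint is as bad as undershooting it by the same margin.
    print(homeostatic_reward(40.0), homeostatic_reward(60.0))
    # The first unit of gold is worth more than the eleventh.
    first_unit = diminishing_returns_reward(1.0) - diminishing_returns_reward(0.0)
    eleventh_unit = diminishing_returns_reward(11.0) - diminishing_returns_reward(10.0)
    print(first_unit > eleventh_unit)
```

Under these assumed reward shapes, an agent that treats the homeostatic term as "more is better" scores worse the further it overshoots, while an agent that hoards the second resource gains little from each extra unit.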