Scaling Laws for Moral Machine Judgment in Large Language Models

📅 2026-01-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether the moral judgment capabilities of large language models in life-or-death ethical dilemmas improve predictably with scale and align with human preferences. Leveraging the Moral Machine framework, the authors systematically evaluate 75 model configurations spanning 0.27B to 1000B parameters, employing power-law fitting and mixed-effects modeling to control for architectural and reasoning variables. They report the first evidence that the distance $D$ between model judgments and human preferences decreases with model scale $S$ following a power law ($D \propto S^{-0.10}$, $R^2 = 0.50$, $p < 0.001$), indicating that value-based judgments also obey scaling laws. Furthermore, incorporating extended reasoning mechanisms yields an additional 16% improvement in alignment, and larger models exhibit lower judgment variance, indicating greater reliability.
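The power-law fit described above reduces to ordinary linear regression in log-log space: taking logarithms of $D = c \cdot S^{\alpha}$ gives $\log D = \alpha \log S + \log c$, so the exponent is the slope of the log-log line. A minimal sketch on synthetic data (the paper's actual measurements are not reproduced here; the noise level and constant are illustrative assumptions):

```python
# Fit D = c * S^alpha by linear regression in log-log space.
# Synthetic placeholder data, NOT the paper's measurements.
import numpy as np

rng = np.random.default_rng(0)
S = np.logspace(np.log10(0.27e9), np.log10(1000e9), 75)  # 75 sizes, 0.27B..1000B params
true_alpha = -0.10
D = 2.0 * S**true_alpha * np.exp(rng.normal(0, 0.05, S.size))  # log-normal noise

slope, intercept = np.polyfit(np.log(S), np.log(D), 1)  # slope estimates alpha
resid = np.log(D) - (slope * np.log(S) + intercept)
r2 = 1 - resid.var() / np.log(D).var()
print(f"alpha estimate: {slope:.3f}, R^2: {r2:.2f}")
```

With a real dataset, the noisier scatter across model families is what drives the reported $R^2$ down to 0.50; on clean synthetic data the fit recovers the generating exponent almost exactly.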

📝 Abstract
Autonomous systems increasingly require moral judgment capabilities, yet whether these capabilities scale predictably with model size remains unexplored. We systematically evaluate 75 large language model configurations (0.27B--1000B parameters) using the Moral Machine framework, measuring alignment with human preferences in life-death dilemmas. We observe a consistent power-law relationship with distance from human preferences ($D$) decreasing as $D \propto S^{-0.10\pm0.01}$ ($R^2=0.50$, $p<0.001$) where $S$ is model size. Mixed-effects models confirm this relationship persists after controlling for model family and reasoning capabilities. Extended reasoning models show significantly better alignment, with this effect being more pronounced in smaller models (size$\times$reasoning interaction: $p = 0.024$). The relationship holds across diverse architectures, while variance decreases at larger scales, indicating systematic emergence of more reliable moral judgment with computational scale. These findings extend scaling law research to value-based judgments and provide empirical foundations for artificial intelligence governance.
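The abstract's mixed-effects check asks whether the size effect survives a per-family random intercept. A hedged sketch of that style of analysis, using `statsmodels`' `MixedLM` on synthetic data (family count, effect sizes, and noise levels are illustrative assumptions, not the paper's values):

```python
# Random-intercept mixed-effects model: does the size slope persist after
# controlling for model family? Synthetic data, illustrative only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
families = np.repeat(np.arange(5), 15)            # 5 families x 15 configs = 75 models
logS = rng.uniform(np.log(0.27e9), np.log(1e12), 75)
family_offset = rng.normal(0, 0.1, 5)[families]   # random intercept per family
logD = 0.5 - 0.10 * logS + family_offset + rng.normal(0, 0.05, 75)

df = pd.DataFrame({"logD": logD, "logS": logS, "family": families})
fit = smf.mixedlm("logD ~ logS", df, groups=df["family"]).fit()
print(fit.fe_params)  # fixed-effect slope should recover roughly -0.10
```

The fixed-effect slope plays the role of the scaling exponent; if it stayed significant only because some families are both large and well-aligned, the random intercept would absorb that and shrink the slope.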

Problem

Research questions and friction points this paper is trying to address.

scaling laws
moral judgment
large language models
human alignment
AI ethics
Innovation

Methods, ideas, or system contributions that make the work stand out.

scaling laws
moral judgment
large language models
human alignment
Moral Machine