Auditing the Ethical Logic of Generative AI Models

📅 2025-04-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current evaluations of generative AI’s ethical reasoning capabilities in high-stakes domains lack quantifiable, multidimensional benchmarks. Method: We propose the first quantifiable five-dimensional ethical logic auditing framework—assessing analytic quality, ethical breadth, explanatory depth, consistency, and decisiveness—grounded in applied ethics theory and higher-order thinking. We design diverse moral dilemma prompts, integrate multi-battery prompting and chain-of-thought (CoT) augmentation, and comparatively evaluate inference-optimized models. Results: Empirical benchmarking across seven mainstream LLMs reveals convergent ethical judgments but substantial variation in explanatory rigor and moral priority weighting. CoT prompting and inference optimization consistently improve scores across all five dimensions. This work establishes a methodological foundation and practical paradigm for making LLMs’ ethical capabilities measurable, comparable, and improvable.

📝 Abstract
As generative AI models become integrated into high-stakes domains, the need for robust methods to evaluate their ethical reasoning grows correspondingly urgent. This paper introduces a five-dimensional audit model -- assessing Analytic Quality, Breadth of Ethical Considerations, Depth of Explanation, Consistency, and Decisiveness -- to evaluate the ethical logic of leading large language models (LLMs). Drawing on traditions from applied ethics and higher-order thinking, we present a multi-battery prompt approach, including novel ethical dilemmas, to probe the models' reasoning across diverse contexts. We benchmark seven major LLMs, finding that while models generally converge on ethical decisions, they vary in explanatory rigor and moral prioritization. Chain-of-Thought prompting and reasoning-optimized models significantly enhance performance on our audit metrics. This study introduces a scalable methodology for ethical benchmarking of AI systems and highlights the potential for AI to complement human moral reasoning in complex decision-making contexts.
Problem

Research questions and friction points this paper is trying to address.

Evaluating ethical reasoning in generative AI models
Introducing a five-dimensional audit model for LLMs
Benchmarking LLMs on ethical decision-making and explanations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Five-dimensional audit model for ethical evaluation
Multi-battery prompt approach with ethical dilemmas
Chain-of-Thought prompting enhances ethical reasoning
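The audit framework above can be sketched in code. This is a minimal illustration, not the paper's actual rubric: the five dimension names come from the abstract, but the 0–10 scale, the unweighted mean aggregation, and the function names (`overall`, `compare`) are assumptions introduced here for clarity.

```python
from dataclasses import dataclass
from statistics import mean

# The five audit dimensions named in the paper; the 0-10 scale and the
# aggregation below are illustrative assumptions, not the paper's rubric.
DIMENSIONS = (
    "analytic_quality",
    "breadth_of_ethical_considerations",
    "depth_of_explanation",
    "consistency",
    "decisiveness",
)

@dataclass
class AuditScore:
    """Per-dimension scores (0-10) for one model response to a dilemma."""
    analytic_quality: float
    breadth_of_ethical_considerations: float
    depth_of_explanation: float
    consistency: float
    decisiveness: float

    def overall(self) -> float:
        """Unweighted mean across the five dimensions (an assumption)."""
        return mean(getattr(self, d) for d in DIMENSIONS)

def compare(models: dict[str, list[AuditScore]]) -> dict[str, float]:
    """Average each model's overall score across a battery of dilemmas,
    enabling the kind of cross-model benchmarking the paper describes."""
    return {name: mean(s.overall() for s in scores)
            for name, scores in models.items()}
```

In this framing, each moral-dilemma prompt in a battery yields one `AuditScore` per model, and `compare` reduces the battery to a single comparable figure per model.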