Seeking Human Security Consensus: A Unified Value Scale for Generative AI Value Safety

📅 2026-01-14

🤖 AI Summary

This study addresses the lack of a unified consensus on value safety in generative AI, a gap that poses significant ethical and value alignment risks. Taking a full-lifecycle perspective, the authors construct a taxonomy of value safety risks, build an incident repository (GVSIR), and use grounded theory to derive the GenAI Value Safety Scale (GVS-Scale), an internationally inclusive and resilient unified value framework, together with its accompanying evaluation benchmark (GVS-Bench). Systematic evaluations of mainstream text generation models reveal substantial variation in value safety performance across models and value categories, indicating that current value alignment is uneven and fragmented. The work provides both empirical evidence and practical tools to support the development of a shared, quantifiable foundation for AI safety.

📝 Abstract

The rapid development of generative AI has brought value- and ethics-related risks to the forefront, making value safety a critical concern, yet a unified consensus is still lacking. In this work, we propose an internationally inclusive and resilient unified value framework, the GenAI Value Safety Scale (GVS-Scale). Taking a lifecycle-oriented perspective, we develop a taxonomy of GenAI value safety risks and construct the GenAI Value Safety Incident Repository (GVSIR). We then derive the GVS-Scale through grounded theory and operationalize it via the GenAI Value Safety Benchmark (GVS-Bench). Experiments on mainstream text generation models reveal substantial variation in value safety performance across models and value categories, indicating uneven and fragmented value alignment in current systems. Our findings highlight the importance of establishing shared safety foundations through dialogue and of advancing technical safety mechanisms beyond reactive constraints toward more flexible approaches. Data and evaluation guidelines are available at https://github.com/acl2026/GVS-Bench. This paper includes examples that may be offensive or harmful.

Problem

Research questions and friction points this paper is trying to address.

value safety

generative AI

human security

value alignment

ethical risks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative AI

Value Safety

Unified Framework

GVS-Scale

Safety Benchmark

🔎 Similar Papers

Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?

2024-07-31 | arXiv.org | Citations: 5

Measuring Human and AI Values based on Generative Psychometrics with Large Language Models

2024-09-18 | arXiv.org | Citations: 1
