🤖 AI Summary
This paper addresses the content safety and regulatory compliance risks posed by generative AI in financial services, proposing a regulator-aligned, finance-specific AI content risk taxonomy. Methodologically, it integrates regulatory frameworks, domain-specific business scenarios, and operational risk dimensions, conducting empirical research via risk-taxonomy modeling, red-teaming, evaluation of open-source content guardrails, and cross-stakeholder impact analysis. The results reveal that mainstream open-source content moderation tools achieve less than 20% coverage of finance-specific risks, underscoring the inadequacy of generic safety solutions in this domain. The primary contributions are: (1) a scalable, interpretable, and regulator-aligned risk taxonomy; and (2) empirical validation of the necessity and urgency of domain-adapted safety mechanisms. This work establishes both theoretical foundations and actionable pathways for responsible AI governance in financial services.
📝 Abstract
To responsibly develop Generative AI (GenAI) products, it is critical to define the scope of acceptable inputs and outputs. What constitutes a "safe" response is an actively debated question. Academic work puts an outsized focus on evaluating models in isolation on general-purpose aspects such as toxicity, bias, and fairness, especially in conversational applications used by a broad audience. In contrast, less attention is paid to sociotechnical systems in specialized domains. Yet those specialized systems can be subject to extensive and well-understood legal and regulatory scrutiny, and these product-specific considerations must be grounded in industry-specific laws, regulations, and corporate governance requirements. In this paper, we aim to highlight AI content safety considerations specific to the financial services domain and outline an associated AI content risk taxonomy. We compare this taxonomy to existing work in this space and discuss the implications of risk category violations for various stakeholders. We evaluate how well existing open-source technical guardrail solutions cover this taxonomy by assessing them on data collected via red-teaming activities. Our results demonstrate that these guardrails fail to detect most of the content risks we discuss.
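The guardrail evaluation described above amounts to measuring, per risk category, what fraction of red-teaming examples a moderation tool actually flags. The sketch below is a minimal illustration of that kind of coverage computation; the category names, data, and function names are hypothetical and do not come from the paper's dataset or tooling.

```python
# Hypothetical red-team results: each record pairs a finance-specific
# risk category with the guardrail's binary verdict (True = flagged).
red_team_results = [
    {"category": "unlicensed_financial_advice", "flagged": False},
    {"category": "unlicensed_financial_advice", "flagged": True},
    {"category": "market_manipulation", "flagged": False},
    {"category": "market_manipulation", "flagged": False},
    {"category": "confidential_disclosure", "flagged": False},
]

def coverage_by_category(results):
    """Fraction of risky examples the guardrail flags, per category."""
    totals, hits = {}, {}
    for r in results:
        cat = r["category"]
        totals[cat] = totals.get(cat, 0) + 1
        hits[cat] = hits.get(cat, 0) + int(r["flagged"])
    return {cat: hits[cat] / totals[cat] for cat in totals}

def overall_coverage(results):
    """Fraction of all risky examples the guardrail flags."""
    return sum(r["flagged"] for r in results) / len(results)

print(coverage_by_category(red_team_results))
print(overall_coverage(red_team_results))  # 0.2 on this toy data
```

On this toy data the overall coverage is 20%, chosen only to mirror the "less than 20% coverage" finding; the paper's actual evaluation spans real guardrail tools and a full taxonomy of categories.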