CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards

📅 2025-07-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current role-playing language agents (RPLAs) predominantly rely on prompt engineering or supervised fine-tuning, neglecting the underlying cognitive mechanisms governing agent behavior. To address this, we propose CogDual, the first framework to instantiate a “cognition-first, response-second” paradigm, integrating external situational awareness with internal self-cognition and incorporating dual-process modeling inspired by cognitive psychology. Methodologically, we design a reinforcement learning framework with implicit rule-based rewards, enabling cognition-consistent optimization without human annotations; further, we jointly leverage prompt engineering, supervised fine-tuning, and reinforcement learning for end-to-end optimization in open-domain text generation. Evaluations on the CoSER, Cross-MR, and LifeChoice benchmarks show that CogDual significantly improves character consistency and contextual alignment, and that it generalizes better than existing approaches.

📝 Abstract

Role-Playing Language Agents (RPLAs) have emerged as a significant application direction for Large Language Models (LLMs). Existing approaches typically rely on prompt engineering or supervised fine-tuning to enable models to imitate character behaviors in specific scenarios, but often neglect the underlying cognitive mechanisms driving these behaviors. Inspired by cognitive psychology, we introduce CogDual, a novel RPLA adopting a cognize-then-respond reasoning paradigm. By jointly modeling external situational awareness and internal self-awareness, CogDual generates responses with improved character consistency and contextual alignment. To further optimize the performance, we employ reinforcement learning with two general-purpose reward schemes designed for open-domain text generation. Extensive experiments on the CoSER benchmark, as well as Cross-MR and LifeChoice, demonstrate that CogDual consistently outperforms existing baselines and generalizes effectively across diverse role-playing tasks.

Problem

Research questions and friction points this paper is trying to address.

Enhancing dual cognition in LLMs via reinforcement learning

Improving character consistency and contextual alignment in RPLAs

Generalizing performance across diverse role-playing tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning with implicit rule-based rewards

Joint modeling of external and internal awareness

Cognize-then-respond reasoning paradigm
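
To make the cognize-then-respond idea concrete, the sketch below shows one way a simple rule-based check could score whether a model output places its cognition before its response and covers both the external situation and the character's internal state. This is an illustration only, not the reward used in the paper: the tag names (`<cognition>`, `<response>`), the `Situation:` and `Self:` markers, and the function `format_reward` are all hypothetical, since this summary does not specify an output format or the reward definitions.

```python
import re

# Hypothetical output layout (an assumption, not taken from the paper):
#   <cognition>Situation: ... Self: ...</cognition><response>...</response>
OUTPUT_PATTERN = re.compile(
    r"^\s*<cognition>(?P<cognition>.*?)</cognition>\s*"
    r"<response>(?P<response>.*?)</response>\s*$",
    re.DOTALL,
)


def format_reward(output: str) -> float:
    """Return 1.0 when cognition precedes a non-empty response and the
    cognition mentions both the situation and the self; otherwise 0.0."""
    match = OUTPUT_PATTERN.match(output)
    if match is None:
        return 0.0  # missing block, wrong order, or extra text around the blocks

    cognition = match.group("cognition").lower()
    response = match.group("response").strip()

    has_situation = "situation:" in cognition  # external situational awareness
    has_self = "self:" in cognition            # internal self-awareness
    return 1.0 if (has_situation and has_self and response) else 0.0


if __name__ == "__main__":
    sample = (
        "<cognition>Situation: the guard blocks the gate. "
        "Self: I am impatient and proud.</cognition>"
        "<response>Step aside. I will not ask twice.</response>"
    )
    print(format_reward(sample))
```

A binary check like this only verifies structure; it says nothing about whether the cognition is accurate or the response is in character, which is where learned or likelihood-based signals would be needed.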

🔎 Similar Papers

Self-playing Adversarial Language Game Enhances LLM Reasoning