🤖 AI Summary
Large language models (LLMs) exhibit strong reliance on fixed premise and reasoning-step orders, failing to generalize to logically equivalent expressions with permuted sequences—revealing a fundamental lack of genuine logical understanding. Method: We propose the first commutativity-aware, order-agnostic data augmentation framework grounded in logical commutativity: premises are randomly reordered while reasoning steps are modeled as a directed acyclic graph (DAG) to ensure semantically preserving, logically valid permutations; we further introduce conditional order randomization and structure-aware training. Contribution/Results: Evaluated across multiple logic reasoning benchmarks, our approach significantly improves model robustness and generalization to logically equivalent transformations, yielding an average 12.3% gain in reasoning accuracy. The code and datasets are publicly released.
📝 Abstract
Logical reasoning is essential for large language models (LLMs) to ensure accurate and coherent inference. However, LLMs struggle with reasoning order variations and fail to generalize across logically equivalent transformations. LLMs often rely on fixed sequential patterns rather than true logical understanding. To address this issue, we introduce an order-centric data augmentation framework based on commutativity in logical reasoning. We first randomly shuffle independent premises to introduce condition order augmentation. For reasoning steps, we construct a directed acyclic graph (DAG) to model dependencies between steps, which allows us to identify valid reorderings of steps while preserving logical correctness. By leveraging order-centric augmentations, models can develop a more flexible and generalized reasoning process. Finally, we conduct extensive experiments across multiple logical reasoning benchmarks, demonstrating that our method significantly enhances LLMs' reasoning performance and adaptability to diverse logical structures. We release our codes and augmented data in https://anonymous.4open.science/r/Order-Centric-Data-Augmentation-822C/.