🤖 AI Summary
Large language models (LLMs) exhibit unreliable compositionality and poor formal compliance when orchestrating multi-step structured workflows. Method: This paper models workflows as typed probabilistic programs and formalizes workflow adaptation as a type-compliant probabilistic program learning problem, the first formulation of its kind. The approach integrates parameter-efficient fine-tuning, deterministic logical constraints, and unnormalized joint-distribution modeling, enabling end-to-end training via gradient-based optimization. The authors prove that the optimization bias vanishes asymptotically upon convergence, ensuring semantic correctness. Results: On MGSM-SymPy, the method boosts the accuracy of a 27B model from 57.1% to 75.9%; on MGSM, a 7B model's accuracy rises sharply from 1.6% to 27.3%, substantially outperforming state-of-the-art prompt-optimization techniques.
📄 Abstract
Reliably composing Large Language Models (LLMs) for complex, multi-step workflows remains a significant challenge. The dominant paradigm, optimizing discrete prompts in a pipeline, is notoriously brittle and struggles to enforce the formal compliance required for structured tasks. We introduce Type-Compliant Adaptation Cascades (TACs), a framework that recasts workflow adaptation as learning typed probabilistic programs. TACs treats the entire workflow, composed of parameter-efficiently adapted LLMs and deterministic logic, as an unnormalized joint distribution. This enables principled, gradient-based training even with latent intermediate structures. We provide theoretical justification for our tractable optimization objective, proving that the optimization bias vanishes as the model learns type compliance. Empirically, TACs significantly outperforms state-of-the-art prompt-optimization baselines. Gains are particularly pronounced on structured tasks: accuracy improves on MGSM-SymPy from 57.1% to 75.9% for a 27B model, and on MGSM from 1.6% to 27.3% for a 7B model. TACs offers a robust and theoretically grounded paradigm for developing reliable, task-compliant LLM systems.
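The abstract's central framing, a workflow as an unnormalized joint distribution over typed intermediate values with deterministic constraints, can be reduced to a toy sketch. Everything below is illustrative, not the paper's implementation: the step scorer stands in for a parameter-efficiently adapted LLM, and the integer-parsing check stands in for an arbitrary deterministic type constraint.

```python
import math

def step_logprob(candidate: str, expected: str) -> float:
    # Toy stand-in for an adapted LLM step's log-probability of emitting
    # `candidate`; a real system would score tokens under the model.
    return 0.0 if candidate == expected else math.log(0.1)

def type_compliant(value: str) -> bool:
    # Deterministic logical constraint: here, the intermediate value
    # must parse as an integer (an assumed example type).
    try:
        int(value)
        return True
    except ValueError:
        return False

def joint_log_density(trace: list[tuple[str, str]]) -> float:
    # Unnormalized joint log-density of a workflow trace: the sum of
    # per-step log-probabilities, with zero mass (-inf log-density)
    # on any trace that violates a type constraint.
    total = 0.0
    for candidate, expected in trace:
        if not type_compliant(candidate):
            return float("-inf")  # constraint acts as a hard indicator
        total += step_logprob(candidate, expected)
    return total

compliant = joint_log_density([("12", "12"), ("7", "7")])    # finite
violating = joint_log_density([("12", "12"), ("seven", "7")])  # -inf
```

In this toy picture, "optimization bias vanishing as the model learns type compliance" corresponds to the hard indicator term mattering less as the learned steps place vanishing probability on type-violating traces.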