Distinct Computations Emerge From Compositional Curricula in In-Context Learning

📅 2025-06-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether introducing structured subtask curricula in in-context learning can induce compositional computation capabilities in Transformers and improve zero-shot generalization and robustness on unseen compositional tasks. Method: Using modular exponentiation (double-exponentiation) and its single-exponentiation subtasks as a benchmark, we compare progressive subtask curricula against end-to-end direct training under identical context length constraints. Contribution/Results: We report the first empirical evidence that curriculum-based context presentation enables zero-shot reasoning on novel compositional tasks, reducing error rates by 42% relative to direct training. Representation analysis reveals the emergence of hierarchical, decomposable internal computation structures. Moreover, varying curriculum designs elicit diverse reasoning strategies, significantly enhancing contextual robustness. These findings demonstrate that curriculum design exerts a measurable, plastic influence on the intrinsic computational mechanisms of large language models.

Technology Category

Application Category

📝 Abstract

In-context learning (ICL) research often considers learning a function in-context through a uniform sample of input-output pairs. Here, we investigate how presenting a compositional subtask curriculum in context may alter the computations a transformer learns. We design a compositional algorithmic task based on the modular exponential-a double exponential task composed of two single exponential subtasks and train transformer models to learn the task in-context. We compare (a) models trained using an in-context curriculum consisting of single exponential subtasks and, (b) models trained directly on the double exponential task without such a curriculum. We show that models trained with a subtask curriculum can perform zero-shot inference on unseen compositional tasks and are more robust given the same context length. We study how the task and subtasks are represented across the two training regimes. We find that the models employ diverse strategies modulated by the specific curriculum design.

Problem

Research questions and friction points this paper is trying to address.

Investigates how compositional subtask curricula affect transformer computations

Compares curriculum-trained models vs direct training on complex tasks

Analyzes representation strategies across different curriculum designs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Compositional subtask curriculum in-context learning

Transformer models trained on modular exponential task

Zero-shot inference on unseen compositional tasks

🔎 Similar Papers

The dynamic interplay between in-context and in-weight learning in humans and neural networks

2024-02-13Citations: 1

Authors to Follow