🤖 AI Summary
This white paper introduces Nemotron 3, a family of open models targeting strong agentic capabilities, long-context reasoning, and high inference throughput. The family uses a Mixture-of-Experts hybrid Mamba-Transformer architecture with context lengths of up to 1M tokens. The two larger models, Super and Ultra, are trained with NVFP4 and incorporate LatentMoE, a novel approach that improves model quality, along with MTP (Multi-Token Prediction) layers for faster text generation. All models are post-trained with multi-environment reinforcement learning, enabling reasoning, multi-step tool use, and granular control of the reasoning budget. The three variants cover distinct deployment points: Nano (accuracy leadership at low inference cost), Super (collaborative agents and high-volume workloads), and Ultra (state-of-the-art accuracy and reasoning performance). Model weights, pre- and post-training software, recipes, and all data for which redistribution rights are held will be released openly.
📝 Abstract
We introduce the Nemotron 3 family of models: Nano, Super, and Ultra. These models deliver strong agentic, reasoning, and conversational capabilities. The Nemotron 3 family uses a Mixture-of-Experts hybrid Mamba-Transformer architecture to provide best-in-class throughput and context lengths of up to 1M tokens. Super and Ultra are trained with NVFP4 and incorporate LatentMoE, a novel approach that improves model quality. The two larger models also include MTP layers for faster text generation. All Nemotron 3 models are post-trained using multi-environment reinforcement learning, enabling reasoning, multi-step tool use, and granular control of the reasoning budget. Nano, the smallest model, outperforms comparable models in accuracy while remaining extremely cost-efficient for inference. Super is optimized for collaborative agents and high-volume workloads such as IT ticket automation. Ultra, the largest model, provides state-of-the-art accuracy and reasoning performance. Nano is released together with its technical report and this white paper; Super and Ultra will follow in the coming months. We will openly release the model weights, pre- and post-training software, recipes, and all data for which we hold redistribution rights.
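As background on the Mixture-of-Experts design mentioned above, the sketch below shows generic top-k expert routing: a gating network scores each token, the top-k experts are selected, and their outputs are mixed by softmax gate weights. This is an illustrative toy only, not the Nemotron 3 implementation; the expert count, k, dimensions, and linear experts are all assumptions for the example.

```python
import numpy as np

def topk_moe(x, gate_w, experts, k=2):
    """Toy top-k MoE layer: route each token to its k highest-scoring
    experts and mix their outputs with softmax gate weights.
    x: (tokens, dim); gate_w: (dim, n_experts); experts: list of (dim, dim)."""
    logits = x @ gate_w                        # gating scores, (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -k:]  # indices of top-k experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, top[t]]
        w = np.exp(sel - sel.max())
        w /= w.sum()                           # softmax over the selected experts
        for weight, e in zip(w, top[t]):
            out[t] += weight * (x[t] @ experts[e])
    return out

# Tiny demonstration with random weights (all sizes are arbitrary).
rng = np.random.default_rng(0)
dim, n_experts = 8, 4
x = rng.normal(size=(3, dim))
gate_w = rng.normal(size=(dim, n_experts))
experts = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]
y = topk_moe(x, gate_w, experts)
print(y.shape)  # (3, 8)
```

Because only k of the n experts run per token, compute per token stays near that of a much smaller dense layer while total parameter count grows with n, which is the throughput/capacity trade-off the abstract alludes to.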