Jan-nano Technical Report

📅 2025-06-28
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the inherent trade-off between capability and computational cost in large language model (LLM) deployment, this paper introduces Jan-nano, a lightweight 4B-parameter language model designed for efficient knowledge retrieval. Methodologically, it abandons conventional next-token prediction training (SFT) and instead proposes a task-driven, multi-stage RLVR (Reinforcement Learning with Verifiable Rewards) framework, combined with Model Context Protocol (MCP) tool integration for retrieval, and supports a 128K context length. Jan-nano is fine-tuned from Qwen3-4B, achieves 83.2% accuracy on the SimpleQA benchmark, and enables efficient inference on a single consumer-grade GPU (e.g., an RTX 4090). This work represents the first effort to deeply embed end-to-end reinforcement learning into the knowledge-retrieval pipeline of a lightweight LLM, significantly lowering the deployment barrier for high-performance models.
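The RLVR idea summarized above rewards the model only when its retrieved answer can be checked against a ground truth. A minimal sketch of such a verifiable reward for retrieval QA, assuming a simple exact-match grader (the report's actual reward functions are not reproduced here):

```python
# Hypothetical sketch of a verifiable reward for retrieval QA, in the
# spirit of RLVR training. Assumption: binary exact-match grading after
# light normalization; this is illustrative, not the paper's implementation.
import re
import string

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def verifiable_reward(model_answer: str, gold_answer: str) -> float:
    """Binary reward: 1.0 if the normalized answers match, else 0.0."""
    return 1.0 if normalize(model_answer) == normalize(gold_answer) else 0.0

print(verifiable_reward("The Eiffel Tower.", "eiffel tower"))  # 1.0
```

Because the reward is mechanically checkable rather than judged by another model, it gives the policy-gradient stage an unambiguous training signal on benchmarks like SimpleQA.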

πŸ“ Abstract
Most language models face a fundamental tradeoff where powerful capabilities require substantial computational resources. We shatter this constraint with Jan-nano, a 4B parameter language model that redefines efficiency through radical specialization: instead of trying to know everything, it masters the art of finding anything instantly. Fine-tuned from Qwen3-4B using our novel multi-stage RLVR system that completely eliminates reliance on next token prediction training (SFT), Jan-nano achieves 83.2% on SimpleQA benchmark with MCP integration while running on consumer hardware. With 128K context length, Jan-nano proves that intelligence isn't about scale, it's about strategy.
Problem

Research questions and friction points this paper is trying to address.

Overcoming computational resource constraints in language models
Achieving high efficiency without next token prediction training
Delivering strong performance on consumer hardware
Innovation

Methods, ideas, or system contributions that make the work stand out.

4B parameter model with radical specialization
Multi-stage RLVR system replaces SFT
128K context length on consumer hardware
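The MCP integration above amounts to letting the model call search tools during inference instead of memorizing facts. A minimal sketch of such a tool-calling retrieval loop, where all names (`search_web`, the `SEARCH:`/`ANSWER:` convention, the toy model) are hypothetical placeholders rather than the actual MCP server API:

```python
# Illustrative tool-calling loop in the spirit of Jan-nano's MCP-based
# retrieval. Assumptions: a text-in/text-out model that emits either a
# "SEARCH: <query>" request or a final "ANSWER: <text>"; a stub search tool.
from typing import Callable

def search_web(query: str) -> str:
    # Placeholder: a real MCP server would perform the actual search.
    return f"results for: {query}"

def answer_with_tools(question: str, generate: Callable[[str], str]) -> str:
    """Let the model issue SEARCH: queries until it emits ANSWER: text."""
    transcript = question
    for _ in range(5):  # cap the number of tool calls
        step = generate(transcript)
        if step.startswith("SEARCH:"):
            query = step[len("SEARCH:"):].strip()
            transcript += f"\n{step}\nOBSERVATION: {search_web(query)}"
        elif step.startswith("ANSWER:"):
            return step[len("ANSWER:"):].strip()
    return "no answer"

# Toy "model": searches once, then answers from the observation.
def toy_generate(transcript: str) -> str:
    return "ANSWER: 42" if "OBSERVATION:" in transcript else "SEARCH: test"

print(answer_with_tools("What is 6*7?", toy_generate))  # prints "42"
```

This division of labor is what lets a 4B model stay small: parametric knowledge is traded for reliable tool use within a long (128K) context window.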