AI Summary
To address the inherent trade-off between capability and computational cost in large language model (LLM) deployment, this paper introduces Jan-nano, a lightweight 4B-parameter language model designed for efficient knowledge retrieval. Methodologically, it abandons conventional next-token-prediction supervised fine-tuning and instead proposes a task-driven, multi-stage RLVR (Reinforcement Learning with Verifiable Rewards) framework, paired with Model Context Protocol (MCP) tool integration and native support for a 128K context length. Jan-nano is fine-tuned from Qwen3-4B and achieves 83.2% accuracy on the SimpleQA benchmark while enabling efficient inference on a single consumer-grade GPU (e.g., an RTX 4090). This work represents the first effort to deeply embed end-to-end reinforcement learning into the knowledge-retrieval pipeline of a lightweight LLM, significantly lowering the deployment barrier for high-performance models.
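The key property of RLVR is that the reward comes from an automatic, programmatic check rather than a learned reward model. As a minimal sketch of that idea (the function names and the normalized exact-match rule are illustrative assumptions, not taken from the paper), a retrieval-QA reward might look like this:

```python
import string


def normalize(text: str) -> str:
    """Lowercase and strip punctuation/whitespace for lenient matching."""
    return text.lower().translate(
        str.maketrans("", "", string.punctuation)
    ).strip()


def verifiable_reward(model_answer: str, gold_answer: str) -> float:
    """Return 1.0 iff the model's final answer matches the reference
    after normalization, else 0.0. A binary, automatically checkable
    signal like this is what makes RLVR feasible without training a
    separate reward model (hypothetical sketch, not the paper's code)."""
    return 1.0 if normalize(model_answer) == normalize(gold_answer) else 0.0


# SimpleQA-style exact-match scoring
print(verifiable_reward("Paris.", "paris"))  # → 1.0
print(verifiable_reward("Lyon", "Paris"))    # → 0.0
```

During RL training, such a reward would be computed on each sampled rollout's final answer and fed to the policy-gradient update; the retrieval steps in between are optimized only through this end signal.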
Abstract
Most language models face a fundamental trade-off: powerful capabilities require substantial computational resources. We shatter this constraint with Jan-nano, a 4B-parameter language model that redefines efficiency through radical specialization: instead of trying to know everything, it masters the art of finding anything instantly. Fine-tuned from Qwen3-4B using our novel multi-stage RLVR system, which completely eliminates reliance on next-token-prediction training (SFT), Jan-nano achieves 83.2% on the SimpleQA benchmark with MCP integration while running on consumer hardware. With a 128K context length, Jan-nano proves that intelligence isn't about scale; it's about strategy.