Studying number theory with deep learning: a case study with the Möbius and squarefree indicator functions

📅 2025-02-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether small Transformer models can learn number-theoretic functions, specifically the Möbius function μ(n) and the squarefree indicator μ²(n). Method: We employ a supervised learning framework trained on integer sequences and apply iterative interpretability analysis, including linear probe classifiers and feature visualization, to uncover internal computational mechanisms. Contribution/Results: We discover, for the first time, that the model implicitly constructs prime factorization structure during training; its decision logic is invertible and provably aligns with classical number-theoretic principles. Empirically, the model significantly outperforms random baselines on unseen integers, and its generalization remains robust as the underlying number-theoretic structure increases in complexity. This work provides the first empirically verifiable and interpretable demonstration of neural networks encoding elementary number theory, establishing a rigorous bridge between deep learning and analytic number theory.

📝 Abstract

Building on work of Charton, we train small transformer models to calculate the Möbius function $\mu(n)$ and the squarefree indicator function $\mu^2(n)$. The models attain nontrivial predictive power. We then iteratively train additional models to understand how the model functions, ultimately finding a theoretical explanation.
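For reference, the two target functions in the abstract can be computed directly. The sketch below uses plain trial division and is only an illustration of the definitions, not the paper's training setup or data pipeline. The function names `mobius` and `squarefree_indicator` are chosen here for illustration. Since the density of squarefree integers is $6/\pi^2 \approx 0.608$, a predictor that always outputs $\mu^2(n) = 1$ is already right about 61% of the time, which is one natural reference point for judging "nontrivial predictive power".

```python
def mobius(n: int) -> int:
    """Return mu(n): 0 if a square > 1 divides n, else (-1)^k,
    where k is the number of distinct prime factors of n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                # p^2 divides the original n, so n is not squarefree.
                return 0
            result = -result
        p += 1
    if n > 1:
        # Whatever remains is a single prime factor larger than sqrt(n).
        result = -result
    return result


def squarefree_indicator(n: int) -> int:
    """Return mu(n)^2: 1 if n is squarefree, 0 otherwise."""
    return mobius(n) ** 2


if __name__ == "__main__":
    print([mobius(n) for n in range(1, 11)])
    # Empirical fraction of squarefree integers up to a hypothetical cutoff.
    cutoff = 10_000
    density = sum(squarefree_indicator(n) for n in range(1, cutoff + 1)) / cutoff
    print(f"squarefree density up to {cutoff}: {density:.4f}")
```

Trial division is adequate for small illustrative ranges; labelling a large training set would more plausibly use a sieve, which is left out here to keep the definitions visible.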

Problem

Research questions and friction points this paper is trying to address.

Deep learning in number theory

Transformer models for Möbius function

Theoretical explanation of model functionality

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning applied to number theory

Transformer models for mathematical functions

Iterative training for theoretical insights
