NeuroGen: Neural Network Parameter Generation via Large Language Models

📅 2025-05-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the feasibility of directly generating neural network (NN) parameters using large language models (LLMs), bypassing conventional backpropagation and iterative optimization. Method: We propose a two-stage paradigm: (1) parameter-space pretraining of an LLM on NN checkpoints to model weight distributions, and (2) context-enhanced instruction tuning guided by task descriptions and architecture-aware prompts. Contribution/Results: We present the first framework enabling LLMs to conditionally generate deployable, task-specific NN parameters—integrating parameter-space knowledge injection with task-aware generation. Evaluated across diverse vision and language tasks, the generated parameters achieve 70–90% of baseline model performance. This demonstrates the viability of LLMs as “neural network compilers” and establishes a novel pathway for synergistic deployment of LLMs with lightweight NNs.
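The stage-one idea of pretraining an LLM on NN checkpoints presupposes some way of turning raw weight tensors into token sequences an LLM can model. Below is a minimal, hypothetical sketch of such a preprocessing step; the uniform binning scheme, clipping range, and function names are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def serialize_checkpoint(weights, n_bins=256):
    """Quantize a flat weight vector into discrete token ids in [0, n_bins-1].

    Hypothetical preprocessing: parameter-space pretraining needs checkpoints
    expressed as token sequences; uniform binning over a fixed range is one
    simple choice.
    """
    clipped = np.clip(weights, -1.0, 1.0)
    # Map [-1, 1] linearly onto {0, ..., n_bins - 1}
    return np.round((clipped + 1.0) / 2.0 * (n_bins - 1)).astype(int)

def deserialize_tokens(tokens, n_bins=256):
    """Inverse map: token ids back to approximate float weights."""
    return tokens / (n_bins - 1) * 2.0 - 1.0

def build_prompt(task_desc, arch_desc):
    """Stage-two-style conditioning: combine task and architecture
    descriptions into a generation prompt (format is an assumption)."""
    return f"Task: {task_desc}\nArchitecture: {arch_desc}\nParameters:"

w = np.array([-0.5, 0.0, 0.7])
toks = serialize_checkpoint(w)
w_hat = deserialize_tokens(toks)
```

With 256 bins over [-1, 1], the round-trip quantization error is bounded by half a bin width (about 0.004), which is the basic trade-off any such discretization makes between vocabulary size and parameter fidelity.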

📝 Abstract
Acquiring the parameters of neural networks (NNs) has been one of the most important problems in machine learning since the inception of NNs. Traditional approaches, such as backpropagation and forward-only optimization, obtain parameters through iterative data fitting. This paper explores the feasibility of a new direction: acquiring NN parameters via large language model generation. We propose NeuroGen, a generalized and easy-to-implement two-stage approach for NN parameter generation conditioned on descriptions of the data, task, and network architecture. Stage one is Parameter Reference Knowledge Injection, where LLMs are pretrained on NN checkpoints to build a foundational understanding of the parameter space; stage two is Context-Enhanced Instruction Tuning, which enables LLMs to adapt to specific tasks through enriched, task-aware prompts. Experimental results demonstrate that NeuroGen effectively generates usable NN parameters. Our findings highlight the feasibility of LLM-based NN parameter generation and suggest a promising new paradigm in which LLMs and lightweight NNs coexist synergistically.
Problem

Research questions and friction points this paper is trying to address.

Exploring NN parameter generation via LLMs instead of traditional methods
Proposing NeuroGen for task-aware NN parameter generation
Demonstrating feasibility of LLM-based NN parameter synthesis
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based NN parameter generation approach
Two-stage training with knowledge injection
Task-aware prompts for parameter adaptation
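The task-aware prompting idea can be sketched as a stage-two instruction-tuning record: a context-enhanced prompt paired with the target parameter tokens. Everything here (the section headers, the `<w...>` weight-token notation, the field names) is a hypothetical illustration, not the paper's actual data format:

```python
def make_instruction_record(task_desc, dataset_desc, arch_desc, token_ids):
    """Assemble one hypothetical stage-two training example pairing a
    context-enhanced prompt with the target sequence of parameter tokens."""
    prompt = (
        f"### Task\n{task_desc}\n"
        f"### Dataset\n{dataset_desc}\n"
        f"### Architecture\n{arch_desc}\n"
        f"### Generate parameters\n"
    )
    # Render each quantized weight id as a special symbol (assumed scheme)
    target = " ".join(f"<w{t}>" for t in token_ids)
    return {"prompt": prompt, "target": target}

record = make_instruction_record(
    "image classification",
    "MNIST, 10 classes",
    "2-layer MLP, 128 hidden units",
    [12, 250, 7],
)
```

Instruction tuning on many such (prompt, target) pairs is one plausible way to realize "task-aware prompts for parameter adaptation": the LLM learns to map task, dataset, and architecture descriptions onto parameter-token sequences.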
Jiaqi Wang
Pennsylvania State University
Yusen Zhang
PhD Student at Penn State University
Natural Language Processing · Machine Learning
Xi Li
University of Alabama at Birmingham