BriLLM: Brain-inspired Large Language Model

📅 2025-03-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses key limitations of conventional language models: poor interpretability, sequence-length dependency, and the absence of brain-inspired cognitive mechanisms. The authors propose BriLLM, the first non-Transformer, fully graph-based, and intrinsically interpretable brain-like large language model. It represents language as a directed graph in which tokens are nodes and generation is driven by "neural signal flow" (SiFu) propagating along minimum-resistance paths, which naturally supports long-range dependencies and multimodal activation. BriLLM employs node-level manifold modeling and a biologically inspired recall mechanism, theoretically enabling infinite-order n-grams while decoupling parameter count from sequence length. The initial Chinese variant (4,000-token vocabulary, 32-dimensional nodes, 16-token prediction horizon) matches GPT-1's performance, providing the first empirical validation of a graph-signal-dynamics driven, brain-like generative paradigm.

📝 Abstract
This paper reports the first brain-inspired large language model (BriLLM). It is a non-Transformer, non-GPT generative language model that departs from traditional input-output controlled machine learning. The model is based on the Signal Fully-connected flowing (SiFu) mechanism defined on a directed graph serving as the neural network, and it is interpretable at every node of the graph, unlike traditional machine learning models, which offer limited interpretability only at the input and output ends. In the language-model setting, a token is defined as a node in the graph. A randomly shaped or user-defined signal flows between nodes along paths following the principle of "least resistance"; the next token or node to be predicted or generated is the target of the signal flow. As a language model, BriLLM theoretically supports infinitely long $n$-gram models, since the model size is independent of the input and prediction lengths. The model's working signal flow also opens the possibility of recall activation and innate multimodal support, similar to the cognitive patterns of the human brain. We have released the first BriLLM version in Chinese, with a 4,000-token vocabulary, 32-dimensional node width, 16-token sequence prediction ability, and language model prediction performance comparable to GPT-1. More computing power will help us explore the infinite possibilities depicted above.
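To make the signal-flow mechanism concrete, below is a minimal toy sketch of SiFu-style decoding. It is our illustration under stated assumptions, not the authors' implementation: the per-edge weight matrices `W`, the `step` function, and the energy-maximization reading of "least resistance" are hypothetical choices made for exposition.

```python
# Toy sketch of SiFu-style decoding on a token graph (illustrative only).
# Assumptions (ours): every directed edge (u, v) carries its own learned
# weight matrix; a signal vector propagates from the current node, and the
# next token is the neighbor whose edge passes the signal with the highest
# energy -- a greedy, one-hop reading of the "least resistance" principle.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["<s>", "the", "cat", "sat", "mat", "</s>"]
d = 8  # node (signal) width; the released Chinese model uses 32

# One weight matrix per directed edge of a fully connected token graph.
W = {(u, v): rng.normal(scale=0.1, size=(d, d))
     for u in vocab for v in vocab if u != v}

def step(node, signal):
    """Propagate the signal one hop: choose the outgoing edge that
    transmits it with maximal energy (L2 norm of the activated output)."""
    best_node, best_signal, best_energy = None, None, -np.inf
    for v in vocab:
        if v == node:
            continue
        out = np.tanh(W[(node, v)] @ signal)
        energy = float(np.linalg.norm(out))
        if energy > best_energy:
            best_node, best_signal, best_energy = v, out, energy
    return best_node, best_signal

node, signal = "<s>", rng.normal(size=d)
tokens = []
for _ in range(16):  # the released model predicts sequences up to 16 tokens
    node, signal = step(node, signal)
    if node == "</s>":
        break
    tokens.append(node)
print(" ".join(tokens))
```

Note that in this toy each generation step is a single hop on the graph: memory lives in the signal vector and the per-edge matrices rather than in a growing context window, which is what lets sequence length and model size decouple.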
Problem

Research questions and friction points this paper is trying to address.

Conventional language models offer only limited interpretability, confined to the input and output ends.
Standard architectures tie model size to input and prediction length, constraining long-range context.
Existing LLMs lack brain-like cognitive mechanisms such as recall activation and innate multimodal support.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Non-Transformer, non-GPT generative language model
Signal Fully-connected flowing (SiFu) on directed graph
Theoretically supports infinitely long n-gram models, with model size independent of sequence length (see the back-of-envelope sketch below)
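To make the size-independence claim concrete, here is a back-of-envelope count under our own illustrative assumption of a fully connected token graph with one d x d weight matrix per directed edge; the released model's actual connectivity and per-edge parameterization may differ.

```python
# Back-of-envelope parameter count for a fully connected token graph.
# Assumption (ours): one d x d matrix per directed edge, no self-loops.
V = 4000  # vocabulary size = number of nodes (released Chinese version)
d = 32    # node (signal) width

edges = V * (V - 1)     # directed edges: 15,996,000
params = edges * d * d  # ~1.6e10 under these assumptions
print(f"{edges:,} edges -> {params:,} parameters")
```

Whatever the exact constants, the count depends only on vocabulary size and node width, never on how many tokens are read or predicted, unlike attention over a growing context window.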
👥 Authors
Hai Zhao, Computer School, Shanghai Jiao Tong University
Hongqiu Wu, Computer School, Shanghai Jiao Tong University
Dongjie Yang, Shanghai Jiao Tong University
Anni Zou, Computer School, Shanghai Jiao Tong University
Jiale Hong, Computer School, Shanghai Jiao Tong University