🤖 AI Summary
To address the challenge of high training overhead and latency in millimeter-wave (mmWave) beam prediction, this paper pioneers the integration of large language models (LLMs) into the beam prediction task. The proposed framework combines vision and semantics through reprogramming: RGB images capture user equipment (UE) spatial positions, while prompt-based reprogramming aligns the resulting visual-temporal features with the LLM's semantic space. This cross-modal alignment enhances few-shot generalization and robustness in dynamic environments. Evaluated on a realistic vehicle-to-infrastructure (V2I) scenario, the method achieves 61.01% top-1 and 97.39% top-3 beam prediction accuracy in standard prediction tasks; in few-shot prediction, accuracy degrades by only 12.56% (top-1) and 5.55% (top-3) from time sample 1 to 10. These results demonstrate the framework's effectiveness under stringent low-overhead and low-latency constraints.
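The key technical step is the reprogramming layer that re-expresses visual-temporal features in a form a frozen LLM can consume. The paper's code is not reproduced here; the following is a minimal sketch of one common prompt-based reprogramming design (cross-attention against a small set of learned text prototypes, in the style of Time-LLM). All module names, dimensions, and the prototype mechanism are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class VisualReprogramming(nn.Module):
    """Hypothetical sketch: align visual-temporal features with a frozen
    LLM's embedding space via cross-attention over learned text prototypes.
    Dimensions and structure are assumptions, not the paper's code."""

    def __init__(self, vis_dim=512, llm_dim=768, n_prototypes=100, n_heads=8):
        super().__init__()
        # Trainable "text prototype" vectors living in the LLM embedding
        # space; visual features attend to these so they are re-expressed
        # in terms the frozen LLM backbone can interpret.
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, llm_dim))
        self.query_proj = nn.Linear(vis_dim, llm_dim)  # visual -> query
        self.cross_attn = nn.MultiheadAttention(llm_dim, n_heads,
                                                batch_first=True)

    def forward(self, vis_feats):
        # vis_feats: (batch, seq_len, vis_dim) visual-temporal features,
        # e.g. per-frame UE position embeddings extracted from RGB frames.
        q = self.query_proj(vis_feats)                       # (B, T, llm_dim)
        kv = self.prototypes.unsqueeze(0).expand(vis_feats.size(0), -1, -1)
        aligned, _ = self.cross_attn(q, kv, kv)              # (B, T, llm_dim)
        return aligned
```

In such a design, the aligned tokens would typically be concatenated with prompt token embeddings and passed through the frozen LLM, with only the reprogramming layer and an output head trained, which is what keeps the training overhead low.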
📝 Abstract
In this paper, we propose BeamLLM, a vision-aided millimeter-wave (mmWave) beam prediction framework leveraging large language models (LLMs) to address the challenges of high training overhead and latency in mmWave communication systems. By combining computer vision (CV) with LLMs' cross-modal reasoning capabilities, the framework extracts user equipment (UE) positional features from RGB images and aligns visual-temporal features with LLMs' semantic space through reprogramming techniques. Evaluated on a realistic vehicle-to-infrastructure (V2I) scenario, the proposed method achieves 61.01% top-1 accuracy and 97.39% top-3 accuracy in standard prediction tasks, significantly outperforming traditional deep learning models. In few-shot prediction scenarios, the performance degradation is limited to 12.56% (top-1) and 5.55% (top-3) from time sample 1 to 10, demonstrating superior prediction capability.
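For reference, top-k beam prediction accuracy counts a sample as correct when the ground-truth beam index appears among the model's k highest-scoring beams, which is how the 61.01% top-1 and 97.39% top-3 figures above are defined. A minimal sketch of the metric (function and tensor names are illustrative):

```python
import torch

def topk_beam_accuracy(logits, true_beams, k=3):
    """Fraction of samples whose ground-truth beam index lies among the
    k highest-scoring beams. logits: (N, num_beams); true_beams: (N,)."""
    topk = logits.topk(k, dim=-1).indices              # (N, k) candidates
    hits = (topk == true_beams.unsqueeze(-1)).any(-1)  # (N,) bool per sample
    return hits.float().mean().item()

# Example usage over a batch of model outputs:
# acc1 = topk_beam_accuracy(logits, true_beams, k=1)
# acc3 = topk_beam_accuracy(logits, true_beams, k=3)
```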