Research Paper (Researchia:202605.20036)

CAT-MoEformer: Context-Aware Temporal MoE Transformer for Beam Prediction

Changkai Zhou


Submitted: May 20, 2026
Subjects: Engineering; Chemical Engineering

Description / Details

This paper proposes CAT-MoEformer, a context-aware transformer with scene-conditioned mixture-of-experts (MoE) feed-forward networks, for proactive mmWave beam prediction from compressed uplink pilot observations. The spatial encoder comprises a three-layer asymmetric convolutional network followed by a squeeze-and-excitation recalibration block, which extracts frequency-beam correlation features from pilot tensors without explicit channel reconstruction. A truncated pretrained GPT-2 backbone models the temporal evolution of beam sequences, with the feed-forward networks in the upper three transformer layers replaced by scene-conditioned MoE-FFN modules. A lightweight gating network maps the scenario label and normalized user equipment speed to expert mixing weights, conditioning the routing decision on physical propagation descriptors rather than on latent hidden states. This design yields interpretable expert assignments and eliminates the load imbalance associated with token-level routing. To prevent expert collapse under soft routing, a three-stage training strategy is introduced: hard expert assignment in the first stage establishes scene-specific specialization, isolated gating network training in the second stage aligns the soft routing distribution with the hard partition, and top-1 hard inference in the third stage fine-tunes the model under deterministic single-expert activation to maximize scene-specific precision. Results on 3GPP TR 38.901 Urban Macro channel simulations with 64,000 user samples show that CAT-MoEformer achieves a Top-1 beam prediction accuracy of 94.88% and a beam switching instant accuracy of 80.62%, representing gains of 2.33% and 9.55% respectively over a CNN+GPT-2 baseline, with an inference latency of 0.52 ms.
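To make the routing idea concrete, the following is a minimal NumPy sketch of a scene-conditioned MoE-FFN, not the authors' implementation. It assumes the gate is a small two-layer MLP over a one-hot scenario label concatenated with the normalized speed, that experts are plain two-layer ReLU feed-forward networks, and that there are three scenarios and three experts. The class names `SceneGate` and `MoEFFN`, all dimensions, and the random weights are hypothetical. The `hard` flag illustrates the top-1 single-expert activation described for the third stage.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class SceneGate:
    """Hypothetical gate: (scenario label, normalized speed) -> expert weights."""

    def __init__(self, num_scenarios, num_experts, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.num_scenarios = num_scenarios
        self.w1 = rng.normal(0.0, 0.1, (num_scenarios + 1, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, num_experts))
        self.b2 = np.zeros(num_experts)

    def __call__(self, scenario, speed):
        # scenario: (B,) integer labels; speed: (B,) floats assumed in [0, 1].
        onehot = np.eye(self.num_scenarios)[scenario]
        x = np.concatenate([onehot, speed[:, None]], axis=-1)
        h = np.maximum(x @ self.w1 + self.b1, 0.0)
        return softmax(h @ self.w2 + self.b2)  # (B, E), rows sum to 1


class MoEFFN:
    """Hypothetical MoE feed-forward layer with per-sample (not per-token) routing."""

    def __init__(self, d_model, d_ff, num_experts, seed=1):
        rng = np.random.default_rng(seed)
        self.w_in = rng.normal(0.0, 0.02, (num_experts, d_model, d_ff))
        self.w_out = rng.normal(0.0, 0.02, (num_experts, d_ff, d_model))

    def expert_outputs(self, x):
        # x: (B, T, D) -> outputs of every expert, shape (B, E, T, D).
        h = np.maximum(np.einsum("btd,edf->betf", x, self.w_in), 0.0)
        return np.einsum("betf,efd->betd", h, self.w_out)

    def __call__(self, x, weights, hard=False):
        if hard:
            # Top-1 routing: replace soft weights with a one-hot selection.
            # For clarity this sketch still evaluates all experts; an efficient
            # version would run only the selected one.
            weights = np.eye(weights.shape[-1])[weights.argmax(axis=-1)]
        return np.einsum("be,betd->btd", weights, self.expert_outputs(x))


# Usage with made-up inputs: two users, five time steps, model width 8.
gate = SceneGate(num_scenarios=3, num_experts=3)
ffn = MoEFFN(d_model=8, d_ff=16, num_experts=3)
scenario = np.array([0, 2])
speed = np.array([0.1, 0.9])
x = np.random.default_rng(2).normal(size=(2, 5, 8))

w = gate(scenario, speed)         # soft mixing weights per user
y_soft = ffn(x, w)                # weighted mixture of experts
y_hard = ffn(x, w, hard=True)     # deterministic single-expert output
```

Because the weights depend only on the scenario label and speed, every token of a given user's sequence is routed identically, which is one way to read the abstract's claim that token-level load imbalance does not arise.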


Source: arXiv:2605.19997v1, http://arxiv.org/abs/2605.19997v1
PDF: https://arxiv.org/pdf/2605.19997v1
