CABLE: Cloud-Assisted Bandwidth-efficient LMM-based Encoding for V2X Systems
Abstract
Cloud-hosted large multimodal models (LMMs) can provide strong open-vocabulary perception for Vehicle-to-Everything (V2X) systems, but naively transmitting full-resolution frames from edge to cloud incurs severe communication overhead and high cloud-side prefill latency. We present CABLE, a cloud-assisted, bandwidth-efficient, LMM-based encoding framework for edge-cloud perception. On the edge, CABLE propagates the previous cloud segmentation mask using ego-motion compensation, refines it with residual-motion cues, and consolidates disconnected regions via a corridor envelope to form a robust region of interest (ROI). Only ROI-masked images are uploaded, and the cloud segmentation output is fed back as the prior for the next frame, forming a mask-to-ROI-to-LMM feedback loop. Experiments on five datasets (nuScenes, WOD-ZB, Waymo, KITTI, and CADC) show consistent communication savings while largely preserving perception quality: ROI pixel coverage is reduced and LMM prefill is estimated to be faster, at a modest detection-quality trade-off relative to full-frame inference.
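The edge-side steps described above (propagate the previous mask, add residual-motion cues, consolidate regions, mask the frame) can be illustrated with a minimal NumPy sketch. It is not the paper's implementation and rests on several assumptions: ego-motion is modelled as a single 3x3 homography `H` from previous to current pixel coordinates, residual motion is approximated by thresholded frame differencing, and a simple per-row span fill stands in for the corridor envelope, whose actual construction the abstract does not specify. All function names and the `diff_thresh` parameter are hypothetical.

```python
import numpy as np

def warp_nearest(img, H):
    """Warp a 2-D array with homography H (previous -> current pixels).

    Uses inverse mapping with nearest-neighbour sampling; pixels whose
    source falls outside the image are set to 0. Assumes H is invertible
    and the projective scale stays non-zero over the image.
    """
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    src = np.linalg.inv(H) @ dst
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    ok = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros(h * w, dtype=img.dtype)
    out[ok] = img[sy[ok], sx[ok]]
    return out.reshape(h, w)

def span_fill(roi):
    """Hypothetical stand-in for the corridor envelope: in every row that
    has active pixels, fill the span from the leftmost to the rightmost."""
    out = roi.copy()
    for r in np.flatnonzero(roi.any(axis=1)):
        cols = np.flatnonzero(roi[r])
        out[r, cols[0]:cols[-1] + 1] = True
    return out

def build_roi(prev_mask, prev_gray, cur_gray, H, diff_thresh=25):
    """Build a boolean ROI for the current frame from the previous mask."""
    # 1. Propagate the previous cloud mask with ego-motion compensation.
    propagated = warp_nearest(prev_mask, H).astype(bool)
    # 2. Residual motion: difference between the current frame and the
    #    ego-motion-compensated previous frame (assumed proxy).
    warped_prev = warp_nearest(prev_gray, H)
    valid = warp_nearest(np.ones_like(prev_gray), H) > 0
    diff = np.abs(cur_gray.astype(np.int16) - warped_prev.astype(np.int16))
    residual = (diff > diff_thresh) & valid
    # 3. Consolidate disconnected regions into one envelope per row.
    return span_fill(propagated | residual)

def mask_frame(frame, roi):
    """Zero out everything outside the ROI before upload."""
    return np.where(roi, frame, 0)

def coverage(roi):
    """Fraction of pixels kept inside the ROI."""
    return float(roi.mean())
```

In a loop over frames, `build_roi` would be called with the mask returned by the cloud for the previous frame, `mask_frame` would produce the image to upload, and the new cloud mask would become `prev_mask` for the next iteration. `coverage` gives the retained pixel fraction; how that fraction translates into prefill speedup depends on the LMM's tokenizer and is not modelled here.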
Source: arXiv:2606.19258v1 - http://arxiv.org/abs/2606.19258v1 PDF: https://arxiv.org/pdf/2606.19258v1
Jun 18, 2026
Robotics