Research Paper (Researchia:202602.27006)

VGG-T$^3$: Offline Feed-Forward 3D Reconstruction at Scale

Sven Elflein


Submitted: February 27, 2026
Subjects: Computer Vision

Description / Details

We present a scalable 3D reconstruction model that addresses a critical limitation of offline feed-forward methods: their computational and memory requirements grow quadratically with the number of input images. Our approach is built on the key insight that this bottleneck stems from the varying-length Key-Value (KV) space representation of scene geometry, which we distill into a fixed-size Multi-Layer Perceptron (MLP) via test-time training. VGG-T$^3$ (Visual Geometry Grounded Test Time Training) scales linearly with the number of input views, similar to online models, and reconstructs a $1k$ image collection in just $54$ seconds, achieving an $11.6\times$ speed-up over baselines that rely on softmax attention. Since our method retains global scene aggregation capability, its point map reconstruction error is lower than that of other linear-time methods by large margins. Finally, we demonstrate the visual localization capabilities of our model by querying the scene representation with unseen images.
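The core idea in the abstract, replacing a key-value store that grows with the number of views by a fixed-size MLP fitted at test time, can be illustrated with a toy sketch. This is not the authors' implementation: the two-layer tanh MLP, the mean-squared-error objective, plain gradient descent, the synthetic keys and values, and all names (`init_mlp`, `fit_at_test_time`, `distill_demo`) are assumptions chosen only for illustration.

```python
import numpy as np


def init_mlp(d_in, d_hidden, d_out, rng):
    """Fixed-size state: its shape never depends on the number of tokens."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(d_in), (d_in, d_hidden)),
        "b1": np.zeros(d_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(d_hidden), (d_hidden, d_out)),
        "b2": np.zeros(d_out),
    }


def mlp_forward(params, x):
    """Two-layer tanh MLP; returns the output and the hidden activations."""
    hidden = np.tanh(x @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + params["b2"], hidden


def fit_at_test_time(params, keys, values, steps=300, lr=0.02):
    """Fit mlp(keys) ~ values by full-batch gradient descent on the MSE."""
    losses = []
    for _ in range(steps):
        pred, hidden = mlp_forward(params, keys)
        err = pred - values
        losses.append(float(np.mean(err ** 2)))
        grad_out = 2.0 * err / err.size  # gradient of the MSE w.r.t. pred
        grad_hidden = (grad_out @ params["W2"].T) * (1.0 - hidden ** 2)
        params["W2"] -= lr * (hidden.T @ grad_out)
        params["b2"] -= lr * grad_out.sum(axis=0)
        params["W1"] -= lr * (keys.T @ grad_hidden)
        params["b1"] -= lr * grad_hidden.sum(axis=0)
    return losses


def distill_demo(n_tokens, d_key=8, d_value=4, d_hidden=32, seed=0):
    """Distill n_tokens synthetic (key, value) pairs into one small MLP."""
    rng = np.random.default_rng(seed)
    keys = rng.normal(size=(n_tokens, d_key))
    # Synthetic values: a smooth function of the keys (demo assumption).
    mix = rng.normal(size=(d_key, d_value)) / np.sqrt(d_key)
    values = np.sin(keys @ mix)
    params = init_mlp(d_key, d_hidden, d_value, rng)
    losses = fit_at_test_time(params, keys, values)
    n_params = sum(p.size for p in params.values())
    # Querying with unseen keys touches only the MLP, not the stored tokens.
    queries = rng.normal(size=(5, d_key))
    answers, _ = mlp_forward(params, queries)
    return losses, n_params, answers
```

Calling `distill_demo(64)` and `distill_demo(512)` fits the same number of parameters in both cases, and each training step costs time proportional to the number of tokens. By contrast, softmax attention compares every token with every other token, which is the quadratic cost the abstract refers to.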


Source: arXiv:2602.23361v1 (http://arxiv.org/abs/2602.23361v1)
PDF: https://arxiv.org/pdf/2602.23361v1
