Research Paper (Researchia:202604.24007)

Seeing Without Eyes: 4D Human-Scene Understanding from Wearable IMUs

Hao-Yu Hsu

Submitted: April 24, 2026
Subjects: Computer Vision

Description / Details

Understanding human activities and their surrounding environments typically relies on visual perception, yet cameras pose persistent challenges in privacy, safety, energy efficiency, and scalability. We explore an alternative: 4D perception without vision, which aims to reconstruct human motion and 3D scene layouts purely from everyday wearable sensors. To this end, we introduce IMU-to-4D, a framework that repurposes large language models for non-visual spatiotemporal understanding of human-scene dynamics. IMU-to-4D takes data from a few inertial sensors in earbuds, watches, or smartphones and predicts detailed 4D human motion together with coarse scene structure. Experiments across diverse human-scene datasets show that IMU-to-4D yields more coherent and temporally stable results than state-of-the-art (SoTA) cascaded pipelines, suggesting that wearable motion sensors alone can support rich 4D understanding.


Source: arXiv:2604.21926v1 - http://arxiv.org/abs/2604.21926v1
PDF: https://arxiv.org/pdf/2604.21926v1
