Research Paper
Researchia:202605.01081

MotuBrain: An Advanced World Action Model for Robot Control

MotuBrain Team

Submitted: May 1, 2026
Subjects: Robotics

Description / Details

Vision-Language-Action (VLA) models achieve strong semantic generalization but often lack fine-grained modeling of world dynamics. Recent work explores video generation models as a foundation for world modeling, leading to unified World Action Models (WAMs) that jointly model visual dynamics and actions. We present MotuBrain, a unified multimodal generative model that jointly models video and action under a UniDiffuser formulation with a three-stream Mixture-of-Transformers architecture. A single model supports multiple inference modes, including policy learning, world modeling, video generation, inverse dynamics, and joint video-action prediction, while scaling to heterogeneous multimodal data such as video-only and cross-embodiment robot data. To improve real-world applicability, MotuBrain introduces a unified multiview representation, explicit language-action coupling, and an efficient inference stack, achieving over 50x speedup for real-time deployment.
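
One way to read the claim that a single model supports multiple inference modes: in the original UniDiffuser recipe, each modality gets its own diffusion timestep, and fixing one modality's timestep at 0 (clean input) or at the maximum (pure noise) turns the same denoiser into a conditional or marginal sampler. The sketch below applies that idea to a video stream and an action stream. The description above does not say how MotuBrain assigns timesteps per mode, so the mode names, the `MODES` table, and `T_MAX` are illustrative assumptions, and the conditioning on language and current observations is left out.

```python
from enum import Enum

T_MAX = 1000  # hypothetical number of diffusion steps, not taken from the paper


class Role(Enum):
    GIVEN = "given"      # modality is a clean input: timestep fixed at 0
    SAMPLED = "sampled"  # modality is being generated: timestep follows the sampler
    IGNORED = "ignored"  # modality is marginalised out: timestep fixed at T_MAX


# Assumed mapping from inference mode to (video role, action role).
# It follows the generic UniDiffuser pattern, not a confirmed MotuBrain design.
MODES = {
    "video_generation": (Role.SAMPLED, Role.IGNORED),
    "policy": (Role.IGNORED, Role.SAMPLED),
    "world_model": (Role.SAMPLED, Role.GIVEN),
    "inverse_dynamics": (Role.GIVEN, Role.SAMPLED),
    "joint": (Role.SAMPLED, Role.SAMPLED),
}


def _timestep(role, t):
    """Resolve the timestep for one modality given its role and the sampler step t."""
    if role is Role.GIVEN:
        return 0
    if role is Role.IGNORED:
        return T_MAX
    return t


def timesteps_for(mode, t):
    """Return (t_video, t_action) to feed the shared denoiser at sampler step t."""
    if not 0 <= t <= T_MAX:
        raise ValueError(f"t must be in [0, {T_MAX}], got {t}")
    video_role, action_role = MODES[mode]
    return _timestep(video_role, t), _timestep(action_role, t)


if __name__ == "__main__":
    for mode in MODES:
        print(mode, timesteps_for(mode, 400))
```

Under these assumptions, switching between world modeling and inverse dynamics changes only which stream is held at timestep 0; the network weights stay the same.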


Source: arXiv:2604.27792v1 (http://arxiv.org/abs/2604.27792v1)
PDF: https://arxiv.org/pdf/2604.27792v1
