Research Paper (Researchia:202606.17011)

EBench: Elemental Diagnosis of Generalist Mobile Manipulation Policies

Ning Gao

Submitted: June 17, 2026
Subjects: Robotics

Description / Details

We present EBench, a simulation benchmark that diagnoses generalist mobile manipulation policies beyond a single success-rate scalar. EBench comprises 26 diverse and challenging manipulation tasks annotated along 5 capability dimensions and 4 generalization dimensions. We evaluate state-of-the-art generalist manipulation models, including $\pi_0$, $\pi_{0.5}$, XVLA, and InternVLA-A1, and find that models with similar success rates exhibit strikingly different capability profiles: $\pi_{0.5}$ achieves the highest test success rate and the best train-test retention, whereas InternVLA-A1 dominates mobile manipulation but collapses on dexterous tasks, and XVLA's strengths lie in a set of atomic skills disjoint from those of the other policies. Beyond capability profiling, EBench analyzes generalization from 4 representative perspectives, identifying the impact of different distribution-shift factors. The results reveal the strengths and weaknesses of each model that an overall score hides. We hope this benchmark offers a broad set of diagnostic signals to guide iteration on generalist manipulation models.
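To make the idea of profiling "beyond a single success-rate scalar" concrete, the following is a minimal Python sketch, not EBench's actual evaluation code. Everything in it is an assumption made for illustration: the policy names, task names, dimension tags, and episode outcomes are toy placeholders, and train-test retention is assumed here to mean test success rate divided by train success rate, which may differ from the paper's definition.

```python
from collections import defaultdict

# Hypothetical mapping from task to capability-dimension tags (placeholder names).
TASK_TAGS = {
    "open_drawer": ["dexterous"],
    "fetch_cup": ["mobile"],
}

# Toy episode records: (policy, task, split, success). Not real results.
EPISODES = [
    ("policy_a", "open_drawer", "train", 1),
    ("policy_a", "fetch_cup", "train", 1),
    ("policy_a", "open_drawer", "test", 1),
    ("policy_a", "open_drawer", "test", 1),
    ("policy_a", "fetch_cup", "test", 0),
    ("policy_a", "fetch_cup", "test", 0),
    ("policy_b", "open_drawer", "train", 1),
    ("policy_b", "fetch_cup", "train", 0),
    ("policy_b", "open_drawer", "test", 0),
    ("policy_b", "open_drawer", "test", 0),
    ("policy_b", "fetch_cup", "test", 1),
    ("policy_b", "fetch_cup", "test", 1),
]


def success_rate(outcomes):
    """Mean of 0/1 outcomes; NaN when there are no episodes."""
    return sum(outcomes) / len(outcomes) if outcomes else float("nan")


def overall_success(episodes, split):
    """Single-scalar success rate per policy on one split."""
    by_policy = defaultdict(list)
    for policy, _task, ep_split, success in episodes:
        if ep_split == split:
            by_policy[policy].append(success)
    return {p: success_rate(o) for p, o in by_policy.items()}


def capability_profile(episodes, task_tags, split="test"):
    """Success rate per policy, broken down by capability tag."""
    buckets = defaultdict(lambda: defaultdict(list))
    for policy, task, ep_split, success in episodes:
        if ep_split != split:
            continue
        for tag in task_tags[task]:
            buckets[policy][tag].append(success)
    return {
        p: {tag: success_rate(o) for tag, o in tags.items()}
        for p, tags in buckets.items()
    }


def retention(episodes):
    """Assumed definition: test success rate / train success rate."""
    train = overall_success(episodes, "train")
    test = overall_success(episodes, "test")
    return {p: test[p] / train[p] for p in train if p in test and train[p] > 0}


if __name__ == "__main__":
    print("overall:", overall_success(EPISODES, "test"))
    print("profile:", capability_profile(EPISODES, TASK_TAGS))
    print("retention:", retention(EPISODES))
```

In this toy data the two placeholder policies have the same overall test success rate, yet one succeeds only on the task tagged "dexterous" and the other only on the task tagged "mobile". That is the kind of difference a single scalar cannot show and a per-dimension breakdown can.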


Source: arXiv:2606.18239v1 (http://arxiv.org/abs/2606.18239v1)
PDF: https://arxiv.org/pdf/2606.18239v1
