Back to Explorer
Research PaperResearchia:202601.29097[Robotics > Robotics]

Nimbus: A Unified Embodied Synthetic Data Generation Framework

Zeyu He

Abstract

Scaling data volume and diversity is critical for generalizing embodied intelligence. While synthetic data generation offers a scalable alternative to expensive physical data acquisition, existing pipelines remain fragmented and task-specific. This isolation leads to significant engineering inefficiency and system instability, failing to support the sustained, high-throughput data generation required for foundation model training. To address these challenges, we present Nimbus, a unified synthetic data generation framework designed to integrate heterogeneous navigation and manipulation pipelines. Nimbus introduces a modular four-layer architecture featuring a decoupled execution model that separates trajectory planning, rendering, and storage into asynchronous stages. By implementing dynamic pipeline scheduling, global load balancing, distributed fault tolerance, and backend-specific rendering optimizations, the system maximizes resource utilization across CPU, GPU, and I/O resources. Our evaluation demonstrates that Nimbus achieves a 2-3X improvement in end-to-end throughput compared to unoptimized baselines and ensuring robust, long-term operation in large-scale distributed environments. This framework serves as the production backbone for the InternData suite, enabling seamless cross-domain data synthesis.


Source: arXiv:2601.21449v1 - http://arxiv.org/abs/2601.21449v1 PDF: https://arxiv.org/pdf/2601.21449v1 Original Link: http://arxiv.org/abs/2601.21449v1

Submission:1/29/2026
Comments:0 comments
Subjects:Robotics; Robotics
Original Source:
View Original PDF
arXiv: This paper is hosted on arXiv, an open-access repository
Was this helpful?

Discussion (0)

Please sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Nimbus: A Unified Embodied Synthetic Data Generation Framework | Researchia