MIKASA-Robo-VLA Documentation
==============================

Render previews: Find Imposter Shape And Color, Shell Game Color Lamp Touch, Shell Game Shuffle Color Lamp Touch.

Quick Links: :doc:`installation` · :doc:`quickstart` · :doc:`benchmarking` · :doc:`datasets` · `Cite <#citation>`_

MIKASA-Robo-VLA extends `MIKASA-Robo `_ to the vision-language-action (VLA) setting. It preserves the original benchmark's focus on memory-intensive tabletop manipulation, while broadening the task suite, introducing language-conditioned evaluation, and providing standardized data export for modern VLA training pipelines.

What changed from MIKASA-Robo (RL release):

- The task set grows from **32 → 90** registered environments covering 10 memory types (vs 4 in the RL release).
- Every task ships a natural-language ``LANGUAGE_INSTRUCTION`` for VLA conditioning.
- Episodes are grouped into three **horizon splits** (Short / Medium / Long) so that multi-task training and evaluation are tractable.
- 22,500 oracle trajectories (PPO and motion planning), totalling more than 6 million transitions, are released on Hugging Face in RLDS and LeRobotDataset v3 formats, so no further conversion is needed.
- Dense and normalised-dense rewards are calibrated for every task, enabling both offline imitation learning and online RL.
- The original 32-task RL implementation is available from the `mikasa-robo-rl branch `_ and remains under ``mikasa_robo_suite/rl/`` for backwards compatibility.

Pick your path
--------------

- *"I want to evaluate my VLA model"* → :doc:`benchmarking` (CLI, JSON output, Python API) and the canonical :doc:`evaluation_protocol`.
- *"I want to fine-tune a VLA model"* → :doc:`datasets` (RLDS, LeRobotDataset v3) and :doc:`observation_space`.
- *"I want to explore tasks"* → :doc:`vla_environments/index` (per-task pages with previews, language instructions, horizons, and setup parameters).
- *"I want to know what makes the benchmark important"* → :doc:`concepts` (memory taxonomy, episode structure).

Key Features
------------

- **90 memory tasks** across 10 memory types, with horizons of 25–2160 steps and multiple difficulty levels.
- The public benchmark grows from 32 (RL release) to **90 tasks** with language instructions for every task.
- Three horizon **splits** (Short / Medium / Long) for structured multi-task evaluation.
- Trajectory collection via PPO oracles and motion planning.
- **22,500 trajectories** (>6 M timesteps) in RLDS and LeRobotDataset v3 formats on Hugging Face.
- Physics fixes, dense / normalised-dense rewards, and full GPU-parallelised simulation via ManiSkill.

.. toctree::
   :maxdepth: 2
   :caption: Getting Started

   installation
   quickstart
   concepts

.. toctree::
   :maxdepth: 2
   :caption: The Benchmark

   vla_environments/index
   benchmarking
   evaluation_protocol
   observation_space
   datasets

.. toctree::
   :maxdepth: 2
   :caption: Guides

   wrappers_cookbook

.. toctree::
   :maxdepth: 2
   :caption: Reference

   api/wrappers
   api/collectors
   api/envs
   faq

Citation
--------

If you use MIKASA-Robo-VLA in your research, please cite:

.. code-block:: bibtex

   @inproceedings{cherepanov2026memory,
     title     = {Memory, Benchmark \& Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning},
     author    = {Egor Cherepanov and Nikita Kachaev and Alexey Kovalev and Aleksandr Panov},
     booktitle = {The Fourteenth International Conference on Learning Representations},
     year      = {2026},
     url       = {https://openreview.net/forum?id=9cLPurIZMj}
   }

Legacy RL Version
-----------------

.. note::

   If you need the original RL benchmark from the MIKASA-Robo paper (`arXiv:2502.10550 `_), install ``mikasa-robo-suite==0.0.5`` from PyPI or use the `mikasa-robo-rl branch `_. New development targets MIKASA-Robo-VLA. The previous 32-environment RL implementation is still kept under ``mikasa_robo_suite/rl/`` for compatibility.
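
The PyPI route for the legacy release can be sketched as a single pinned install. The package name and version come from the note above; the command assumes ``pip`` is available in the active Python environment, and any further legacy setup steps are not covered here.

.. code-block:: shell

   # Pin the legacy RL release named in the note (assumes pip is on PATH).
   pip install mikasa-robo-suite==0.0.5

Pinning the exact version keeps the environment on the RL release rather than a later build.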