Datasets#

MIKASA-Robo-VLA trajectories are available on Hugging Face and can be collected locally. The benchmark ships 22,500 trajectories (250 per task × 90 tasks) and more than 6,000,000 timesteps across three interoperable formats.

Choosing a Format#

Format

Random access

Framework

Best for

NPZ (raw episodes)

No (sequential)

NumPy / PyTorch

Inspecting raw collection output; custom preprocessing.

RLDS / TFDS

Yes

TensorFlow, JAX, Open-X pipelines

Open X-Embodiment style VLA training; any TF Dataset pipeline.

LeRobotDataset v3

Yes

PyTorch, HuggingFace

LeRobot-based imitation learning; modern VLA fine-tuning (OpenVLA-OFT, π0.5).

For most users, LeRobotDataset v3 is the recommended starting point because it stores images as MP4 streams (compact), proprioception and actions as Parquet files (fast random access), and is compatible with the latest VLA training frameworks.

Public Datasets on Hugging Face#

The full 90-task collection is hosted on Hugging Face. Each task is a separate dataset repository so you can download only the split you need (see Benchmarking for split definitions).

Download a single task with huggingface_hub:

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="mikasa-robo/mikasa-robo-vla-lerobot",
    repo_type="dataset",
    allow_patterns="RememberColor3-VLA-v0/**",
    local_dir="data_mikasa_robo/data_lerobot",
)

Download an entire split by listing the task names from mikasa_robo_vla_envs.csv. The CSV is semicolon-delimited with the following header:

id;Name;Max Length;Memory Type;Horizon Split;Data Source;language_instruction
1;RememberColor3-VLA-v0;30;Object;Short;PPO;"Observe the cube's color ..."

Iterate over the rows you want:

import csv
from huggingface_hub import snapshot_download

with open("mikasa_robo_vla_envs.csv") as f:
    tasks = [r["Name"] for r in csv.DictReader(f, delimiter=";")
             if r["Horizon Split"] == "Short"]

for task in tasks:
    snapshot_download(
        repo_id="mikasa-robo/mikasa-robo-vla-lerobot",
        repo_type="dataset",
        allow_patterns=f"{task}/**",
        local_dir="data_mikasa_robo/data_lerobot",
    )

Training Signal Format#

Every dataset row stores the same canonical signals used by the online wrapped environment: rgb ([T, 128, 128, 6] uint8 — two cameras concatenated on the channel axis), proprio ([T, 7] float32 — absolute EEF pose + gripper opening), action ([T, 7] float32 in the normalised pd_ee_delta_pose range [-1, 1]), reward, success, and the natural-language language_instruction.

The field-by-field reference (units, ranges, the difference between proprio[..., 6] and action[6]) is in Observation and Action Space.

Dataset Layout#

NPZ Source Layout#

data_mikasa_robo/
  data_npz/
    <task>/
      train_data_000000.npz
      train_data_000001.npz
      ...
      metadata.json

Each NPZ episode stores: RGB observations, 7D end-effector proprioception, 7D pd_ee_delta_pose actions, rewards, success/done flags, and the language instruction.

RLDS Layout#

data_mikasa_robo/
  data_rlds/
    <task>/
      1.0.0/

RLDS stores trajectories as TensorFlow Datasets episodes.

LeRobotDataset v3 Layout#

data_mikasa_robo/
  data_lerobot/
    <task>/
      data/          ← Parquet shards (proprio, action, reward, success)
      meta/          ← info.json, stats.json, tasks.jsonl, episodes.jsonl
      videos/        ← MP4 streams (observation.image, observation.wrist_image)

Collecting Your Own Datasets#

Choose a collector based on the task’s Data Source column in mikasa_robo_vla_envs.csv:

  • Data Source = PPO — use get_mikasa_robo_datasets.py. Requires a trained PPO oracle checkpoint (see baselines/ppo/); recommended for tasks that PPO solves reliably (most Object / Spatial / Sequential tasks).

  • Data Source = MP — use get_mikasa_robo_datasets_motion_planning.py. Uses scripted motion planning instead of an RL oracle; recommended for Procedural and Prospective tasks where PPO does not converge (e.g. TraceShape, GatherAndRecall).

Both collectors emit NPZ episodes in the same layout, so the downstream RLDS and LeRobotDataset v3 conversion steps are identical.

Collect NPZ#

PPO-oracle task:

uv run python mikasa_robo_suite/vla/dataset_collectors/get_mikasa_robo_datasets.py \
  --env-id RememberColor3-VLA-v0 \
  --path-to-save-data data_mikasa_robo \
  --ckpt-dir . \
  --num-train-data 250

Motion-planning task:

uv run python mikasa_robo_suite/vla/dataset_collectors/get_mikasa_robo_datasets_motion_planning.py \
  --env-id TraceShapeHard-VLA-v0 \
  --path-to-save-data data_mikasa_robo \
  --num-train-data 250 \
  --max-attempts 5000 \
  --seed 0

For parallel collection across multiple GPUs, see the collector README at mikasa_robo_suite/vla/dataset_collectors/README.md.

Data Export Pipeline#

After NPZ collection, export to the training formats:

NPZ source episodes → RLDS / TFDS episodic dataset → LeRobotDataset v3

Convert NPZ to RLDS#

Single task:

uv sync --project utils/convert_npz_to_rlds/rlds_dataset_builder
uv run --project utils/convert_npz_to_rlds/rlds_dataset_builder \
  python utils/convert_npz_to_rlds/convert_npz_task_to_rlds.py \
  --task RememberColor3-VLA-v0 \
  --overwrite-dest

All available NPZ task folders:

for task_dir in data_mikasa_robo/data_npz/*; do
  task="$(basename "$task_dir")"
  uv run --project utils/convert_npz_to_rlds/rlds_dataset_builder \
    python utils/convert_npz_to_rlds/convert_npz_task_to_rlds.py \
    --task "$task" \
    --overwrite-dest
done

Convert RLDS to LeRobotDataset v3#

Single task:

uv sync --project utils/convert_rlds_to_lerobot
uv run --project utils/convert_rlds_to_lerobot \
  python utils/convert_rlds_to_lerobot/convert_rlds_to_lerobot.py \
  --task RememberColor3-VLA-v0 \
  --overwrite-dest

All available RLDS task folders:

uv run --project utils/convert_rlds_to_lerobot \
  python utils/convert_rlds_to_lerobot/convert_rlds_to_lerobot.py \
  --all \
  --overwrite-dest