Dataset Collectors API

MIKASA-Robo-VLA provides two data collection methods:

PPO Oracle (get_mikasa_robo_datasets.py)

Uses a pre-trained PPO checkpoint to collect expert trajectories. Suitable for all tasks where an oracle checkpoint is available (see the Source column in the Environments & Tasks table).

Motion Planning (get_mikasa_robo_datasets_motion_planning.py)

Uses a geometric planner instead of a learned policy. Required for tasks that cannot be solved by the PPO oracle (e.g. TraceShape).

Collection commands and the full pipeline are documented in Datasets. The README.md inside mikasa_robo_suite/vla/dataset_collectors/ contains additional notes on parallel collection, checkpointing, and resuming interrupted runs.

PPO Oracle Collector

class Args(env_id: str | None = 'ShellGameTouch-VLA-v0', path_to_save_data: str = 'data_mikasa_robo', ckpt_dir: str = '.', num_train_data: int = 250)

Bases: object

ckpt_dir: str = '.'
env_id: str | None = 'ShellGameTouch-VLA-v0'
num_train_data: int = 250
path_to_save_data: str = 'data_mikasa_robo'
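
The fields above map naturally onto a dataclass. The sketch below mirrors the documented defaults in a stand-alone class named CollectorArgs. That name is hypothetical and is used only so the example runs without the package installed; it is not the project's own definition, and this reference does not show how the real Args is populated from the command line. The override values are illustrative.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class CollectorArgs:
    """Hypothetical stand-in mirroring the documented Args fields and defaults."""

    env_id: Optional[str] = "ShellGameTouch-VLA-v0"
    path_to_save_data: str = "data_mikasa_robo"
    ckpt_dir: str = "."
    num_train_data: int = 250


# Default configuration, as listed in the signature above.
defaults = CollectorArgs()

# Override only the fields that differ; "checkpoints" is an illustrative path.
custom = replace(defaults, num_train_data=50, ckpt_dir="checkpoints")
print(custom)
```

Keeping the defaults in one place means a run is fully described by the handful of fields that were changed.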
collect_batched_data_from_ckpt(env_id: str = 'ShellGameTouch-VLA-v0', checkpoint_path: str | None = None, path_to_save_data: str = 'data_mikasa_robo', num_train_data: int = 250)

Collect episodes in batches; keep a batch only if all episodes are successful.
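
The all-or-nothing rule described above can be illustrated with a minimal sketch. It assumes each batch is a dict carrying a per-episode list of success flags; that layout and the helper name filter_successful_batches are hypothetical and are not taken from the collector's source.

```python
def filter_successful_batches(batches):
    """Keep only batches in which every episode succeeded.

    `batches` is an iterable of dicts with a "success" list holding one
    boolean per episode. This layout is an assumption for illustration.
    """
    kept = []
    for batch in batches:
        # A single failed episode discards the whole batch.
        if all(batch["success"]):
            kept.append(batch)
    return kept


batches = [
    {"id": 0, "success": [True, True, True]},
    {"id": 1, "success": [True, False, True]},  # one failure: dropped
    {"id": 2, "success": [True, True, True]},
]
kept_ids = [b["id"] for b in filter_successful_batches(batches)]
print(kept_ids)  # [0, 2]
```

Discarding whole batches trades some wasted rollouts for a simple guarantee that every saved episode is a successful demonstration.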

collect_for_env(env_id: str, checkpoint: str, path_to_save_data: str, num_train_data: int)
collect_unbatched_data_from_batched(env_id: str = 'ShellGameTouch-VLA-v0', path_to_save_data: str = 'data_mikasa_robo')
env_info(env_id: str)
get_list_of_all_checkpoints_available(ckpt_dir: str = '.')
maybe_remove_empty_batched_dir(env_id: str, path_to_save_data: str = 'data_mikasa_robo')
npz_layout_roots(path_to_save_data: str) -> tuple[Path, Path]

Motion-Planning Collector

BATCHED_TMP_SUBDIR = '_batched'

python mikasa_robo_suite/vla/dataset_collectors/get_mikasa_robo_datasets_motion_planning.py --env-id BlinkCountButtonPressEasy-VLA-v0 --path-to-save-data data_mikasa_robo --num-train-data 250 --max-attempts 5000

python mikasa_robo_suite/vla/dataset_collectors/get_mikasa_robo_datasets_motion_planning.py --env-id TraceShapeHard-VLA-v0 --path-to-save-data data_mikasa_robo --num-train-data 250 --max-attempts 5000

python mikasa_robo_suite/vla/dataset_collectors/get_mikasa_robo_datasets_motion_planning.py --env-id TraceShapeSeqHard-VLA-v0 --path-to-save-data data_mikasa_robo --num-train-data 250 --max-attempts 5000

python mikasa_robo_suite/vla/dataset_collectors/get_mikasa_robo_datasets_motion_planning.py --env-id GatherAndRecall9-VLA-v0 --path-to-save-data data_mikasa_robo --num-train-data 250 --max-attempts 5000
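
The four commands above differ only in the environment ID, so they can be generated from a list. The following is a minimal sketch; the helper build_command is hypothetical, while the script path, flags (written with double hyphens), and values are copied from the commands above. It prints the commands instead of running them.

```python
import shlex
import sys

SCRIPT = (
    "mikasa_robo_suite/vla/dataset_collectors/"
    "get_mikasa_robo_datasets_motion_planning.py"
)

ENV_IDS = [
    "BlinkCountButtonPressEasy-VLA-v0",
    "TraceShapeHard-VLA-v0",
    "TraceShapeSeqHard-VLA-v0",
    "GatherAndRecall9-VLA-v0",
]


def build_command(env_id, path_to_save_data="data_mikasa_robo",
                  num_train_data=250, max_attempts=5000):
    """Return the argv list for one motion-planning collection run."""
    return [
        sys.executable, SCRIPT,
        "--env-id", env_id,
        "--path-to-save-data", path_to_save_data,
        "--num-train-data", str(num_train_data),
        "--max-attempts", str(max_attempts),
    ]


commands = [build_command(env_id) for env_id in ENV_IDS]
for cmd in commands:
    # Printed only; the interpreter path (cmd[0]) is omitted for readability.
    print("python", shlex.join(cmd[1:]))
```

To launch the runs, each cmd could be passed to subprocess.run(cmd, check=True) from the repository root, assuming the package and its dependencies are installed.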

collect_batched_motion_planning(env_id: str, path_to_save_data: str, num_train_data: int, max_attempts: int, seed: int)
collect_unbatched_data_from_batched(env_id: str, path_to_save_data: str)
load_language_commands() -> Dict[str, str]
main()
maybe_remove_empty_batched_dir(env_id: str, path_to_save_data: str)
npz_layout_roots(path_to_save_data: str) -> Tuple[Path, Path]
parse_args()
planner_script_for_env(env_id: str) -> Path
resolve_language_instruction(env_id: str, language_map: Dict[str, str]) -> str | None