TinyVLA (Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation)
Contributed by Midea Group
1. Install
To guarantee clean isolation between training and evaluation, we provide two distinct, self-contained setups; both environments serve DexVLA as well as TinyVLA.
Training Environment:
cd policy/TinyVLA
conda env create -f Train_Tiny_DexVLA_train.yml
conda activate dexvla-robo
cd policy_heads
pip install -e .
Evaluation Environment:
If you already have RoboTwin 2.0 installed, activate its conda environment and add the evaluation dependencies:
conda activate your_RoboTwin_env
pip install -r Eval_Tiny_DexVLA_requirements.txt
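Before launching a long run, a quick import check can save time. The sketch below is minimal and only assumes the training environment ships PyTorch and HuggingFace Transformers, which the InternVL3-1B backbone requires:

import torch          # assumed dependency of the training environment
import transformers   # required by the InternVL3-1B backbone

print(f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")
print(f"transformers {transformers.__version__}")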
2. Prepare Training Data
This step performs data preprocessing, converting the original RoboTwin 2.0 data into the format required for TinyVLA training. The expert_data_num parameter specifies the number of expert trajectories to use as training data.
python process_data.py ${task_name} ${task_config} ${expert_data_num}
# python process_data.py beat_block_hammer demo_randomized 50
The processed data is saved to the sim_${task_name}/${setting}_${expert_data_num} folder under policy/TinyVLA/data.
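To spot-check an episode before training, you can list its contents. This is a minimal sketch that assumes the processed files are ALOHA-style HDF5 episodes; the file name is illustrative:

import h5py

# Illustrative path -- substitute one of your processed episodes.
path = "policy/TinyVLA/data/sim_beat_block_hammer/demo_randomized_50/episode_0.hdf5"
with h5py.File(path, "r") as f:
    f.visit(print)  # print every group/dataset name in the episode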
3. Train Policy
This step launches the training process. First, download the VLM backbone InternVL3-1B (from Hugging Face) to the path .../policy/TinyVLA/model_param/InternVL3-1B. Then modify the config.json file in that folder as follows (a scripted alternative is sketched after the snippet):
{
"_name_or_path": ".../robotiwin/policy/TinyVLA/vla/models/internvl", # Modify this.
"architectures": [
"TinyVLA" # Change this.
],
# "auto_map":{...} # Delete this.
...
"llm_config": {}, # Don't Change.
"min_dynamic_patch": 1,
"model_type": "tinyvla", # Change this.
...
}
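If you prefer to apply these edits programmatically, the following minimal sketch uses Python's json module; the field values mirror the manual edits above, and the paths are placeholders:

import json

cfg_path = "policy/TinyVLA/model_param/InternVL3-1B/config.json"  # placeholder path
with open(cfg_path) as f:
    cfg = json.load(f)

cfg["_name_or_path"] = ".../robotiwin/policy/TinyVLA/vla/models/internvl"  # your local path
cfg["architectures"] = ["TinyVLA"]
cfg["model_type"] = "tinyvla"
cfg.pop("auto_map", None)  # delete the auto_map block; llm_config stays untouched

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)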
Next, register your task in TASK_CONFIGS in .../policy/TinyVLA/aloha_scripts/constants.py (a consistency check is sketched after the snippet):
TASK_CONFIGS = {
...
"your_task": {
'dataset_dir': [DATA_DIR + "/sim-your_task/aloha-agilex-1-m1_b1_l1_h0.03_c0_D435-100"],
'episode_len': 500,
'camera_names': ['cam_high', 'cam_left_wrist', 'cam_right_wrist'],
"sample_weights": [1, 1]
}
}
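The check below is a minimal sketch; it assumes constants.py is importable when run from policy/TinyVLA and only inspects the structure shown above:

import os
from aloha_scripts.constants import TASK_CONFIGS  # run from policy/TinyVLA

cfg = TASK_CONFIGS["your_task"]
for d in cfg["dataset_dir"]:
    print(("OK       " if os.path.isdir(d) else "MISSING  ") + d)
if len(cfg.get("sample_weights", [])) != len(cfg["dataset_dir"]):
    print("note: sample_weights length differs from dataset_dir length")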
Next, set the task name and paths in the train_robotwin_aloha.sh file:
TASK=your_task # Set the Task
ROOT=.../robotiwin/policy/TinyVLA # Set Root Path
mnop=.../robotiwin/policy/TinyVLA/model_param/InternVL3-1B/ # Set The Path of base VLM
Finally, launch training:
bash ./scripts/franks/train_robotwin_aloha.sh
4. Eval Policy
You need to modify the corresponding paths in the deploy_policy.yml file (or patch them from Python, as sketched after this list):
1. model_path: path to the trained model, inside the OUTPUT directory.
2. state_path: path to dataset_stats.pkl, inside the OUTPUT directory.
3. model_base: path to InternVL3-1B.
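The sketch below uses PyYAML; the three keys come from the list above, while the concrete paths are placeholders:

import yaml

with open("deploy_policy.yml") as f:
    cfg = yaml.safe_load(f)

cfg["model_path"] = "OUTPUT/.../checkpoint"         # trained model (placeholder)
cfg["state_path"] = "OUTPUT/.../dataset_stats.pkl"  # normalization stats (placeholder)
cfg["model_base"] = "policy/TinyVLA/model_param/InternVL3-1B"

with open("deploy_policy.yml", "w") as f:
    yaml.safe_dump(cfg, f)  # note: comments in the original file are not preserved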
Then execute:
bash eval.sh ${task_name} ${task_config} ${ckpt_setting} ${expert_data_num} ${seed} ${gpu_id}
# bash eval.sh beat_block_hammer demo_randomized 0 50 0 0
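To sweep several seeds in one go, a minimal Python wrapper around eval.sh can help; the arguments simply mirror the example command above:

import subprocess

# Mirrors: bash eval.sh ${task_name} ${task_config} ${ckpt_setting} ${expert_data_num} ${seed} ${gpu_id}
for seed in range(3):
    subprocess.run(
        ["bash", "eval.sh", "beat_block_hammer", "demo_randomized",
         "0", "50", str(seed), "0"],
        check=True,
    )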
5. Citation
If you find TinyVLA useful for your research and applications, please cite using this BibTeX:
@article{wen2024tinyvla,
  title={TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation},
  author={Wen, Junjie and Zhu, Yichen and Li, Jinming and Zhu, Minjie and Wu, Kun and Xu, Zhiyuan and Liu, Ning and Cheng, Ran and Shen, Chaomin and Peng, Yaxin and others},
  journal={IEEE Robotics and Automation Letters},
  year={2025}
}