DP (Diffusion Policy)
1. Install
pip install zarr==2.12.0 wandb ipdb gpustat dm_control omegaconf hydra-core==1.2.0 dill==0.3.5.1 einops==0.4.1 diffusers==0.11.1 numba==0.56.4 moviepy imageio av matplotlib termcolor
cd policy/DP
pip install -e .
cd ../..
2. Prepare Training Data
This step preprocesses the data, converting the original RoboTwin 2.0 data into the Zarr format required for DP training. The expert_data_num parameter specifies the number of trajectory pairs to use as training data.
bash process_data.sh ${task_name} ${task_config} ${expert_data_num}
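As an illustration, a filled-in call might look like the sketch below. The values beat_block_hammer, demo_clean, and 50 are placeholder assumptions rather than values given in this guide; substitute your own task name, task config, and trajectory count. The sketch only builds and prints the command so it can be reviewed before running.

```shell
# Placeholder values (assumptions for illustration only).
task_name="beat_block_hammer"
task_config="demo_clean"
expert_data_num=50

# Build the preprocessing command and print it for review.
process_cmd="bash process_data.sh ${task_name} ${task_config} ${expert_data_num}"
echo "${process_cmd}"

# Once the printed command looks right, run it by hand or uncomment:
# ${process_cmd}
```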
3. Train Policy
This step launches training. By default, the model is trained for 600 steps. The action_dim parameter defines the dimensionality of the robot's action space; for example, it is 14 for the aloha-agilex embodiment.
bash train.sh ${task_name} ${task_config} ${expert_data_num} ${seed} ${action_dim} ${gpu_id}
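Because a wrong action_dim tends to surface only after training has started, it may help to sanity-check the arguments first. The sketch below is one possible way to do that, not part of the repository: the helper is_positive_int is hypothetical, and every value except action_dim=14 (the aloha-agilex figure quoted above) is a placeholder assumption.

```shell
# Placeholder values (assumptions for illustration only).
task_name="beat_block_hammer"
task_config="demo_clean"
expert_data_num=50
seed=0
action_dim=14   # 14 for the aloha-agilex embodiment, per the text above
gpu_id=0

# Hypothetical helper: succeeds only for a positive decimal integer.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    *) [ "$1" -gt 0 ] ;;       # all digits: require a value above zero
  esac
}

# Build and print the training command only if action_dim is valid.
if is_positive_int "${action_dim}"; then
  train_cmd="bash train.sh ${task_name} ${task_config} ${expert_data_num} ${seed} ${action_dim} ${gpu_id}"
  echo "${train_cmd}"
else
  echo "action_dim must be a positive integer, got '${action_dim}'" >&2
fi
```

The argument order matches the train.sh usage line above; only the validation step is an addition.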
4. Eval Policy
The task_config field refers to the evaluation environment configuration, while the ckpt_setting field refers to the training data configuration used during policy learning.
bash eval.sh ${task_name} ${task_config} ${ckpt_setting} ${expert_data_num} ${seed} ${gpu_id}
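To make the distinction between task_config and ckpt_setting concrete, the sketch below evaluates a checkpoint trained on one configuration in a different environment configuration. All values, including the config names demo_clean and demo_randomized, are placeholder assumptions; use the configurations that exist in your setup.

```shell
# Placeholder values (assumptions for illustration only).
task_name="beat_block_hammer"
task_config="demo_randomized"   # configuration of the evaluation environment
ckpt_setting="demo_clean"       # configuration of the data the policy was trained on
expert_data_num=50
seed=0
gpu_id=0

# Build and print the evaluation command for review.
eval_cmd="bash eval.sh ${task_name} ${task_config} ${ckpt_setting} ${expert_data_num} ${seed} ${gpu_id}"
echo "${eval_cmd}"
```

Setting both fields to the same value would evaluate the policy in the same configuration it was trained on.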
The evaluation results, including videos, are saved in the eval_result directory under the project root.
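A quick way to locate the saved videos is a recursive search of that directory. This is a minimal sketch under two assumptions not stated in this guide: that videos use the .mp4 extension, and that the subdirectory layout is unknown (hence the recursive search). The function name list_eval_videos is hypothetical.

```shell
# Hypothetical helper: list video files under a results directory.
# Assumes videos end in .mp4; adjust the pattern if yours differ.
list_eval_videos() {
  result_dir="${1:-eval_result}"
  if [ -d "${result_dir}" ]; then
    find "${result_dir}" -type f -name '*.mp4' | sort
  else
    echo "no such directory: ${result_dir}" >&2
    return 1
  fi
}

# Run from the project root; prints nothing if eval_result does not exist yet.
list_eval_videos eval_result 2>/dev/null || true
```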