GO-1 Fine-tuning and Evaluation
Contributed by GO-1 Team
This README provides instructions for fine-tuning and evaluating the GO-1 model, covering data generation, data processing, model training, and evaluation.
1. Table of Contents
- GO-1 Fine-tuning and Evaluation
- Table of Contents
- Environment Setup
- Data Generation
- Data Processing
- Model Fine-tuning
- Evaluation
- Evaluation Results
2. Environment Setup
2.1 Install RoboTwin
Create the conda environment and install the dependencies for RoboTwin according to the RoboTwin docs.
Then install the extra dependencies:
cd policy/GO1
conda activate RoboTwin
pip install -r requirements.txt
2.2 Install GO-1
Follow the instructions in the GO-1 repo to set up a separate conda environment for GO-1.
3. Data Generation
Follow the RoboTwin docs to generate raw data in RoboTwin format.
Your raw data should be organized as follows:
data/
├── task_name/
│ ├── task_config/
│ │ ├── data/
│ │ │ ├── episode0.hdf5
│ │ │ ├── episode1.hdf5
│ │ │ └── ...
│ │ └── instructions/
│ │ ├── episode0.json
│ │ ├── episode1.json
│ │ └── ...
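Before converting, it can help to confirm that every episode has a matching instruction file. The following is a minimal sketch that assumes only the layout shown above; the function name `check_raw_layout` is hypothetical and not part of RoboTwin or GO-1.

```python
from pathlib import Path


def check_raw_layout(root: str, task_name: str, task_config: str) -> dict:
    """Compare episode HDF5 files against instruction JSON files by file stem."""
    base = Path(root) / task_name / task_config
    # Stems look like "episode0", "episode1", ... in both subdirectories.
    episodes = {p.stem for p in (base / "data").glob("episode*.hdf5")}
    instructions = {p.stem for p in (base / "instructions").glob("episode*.json")}
    return {
        "episodes": len(episodes),
        "missing_instructions": sorted(episodes - instructions),
        "missing_episodes": sorted(instructions - episodes),
    }


if __name__ == "__main__":
    # Example task and config names taken from the commands in this README.
    print(check_raw_layout("data", "beat_block_hammer", "demo_clean"))
```

An empty `missing_instructions` list and an `episodes` count equal to the number of demonstrations you generated suggest the raw data is complete.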
4. Data Processing
4.1 Convert RoboTwin Data to HDF5
# Activate the RoboTwin environment
conda activate RoboTwin
bash robotwin2hdf5.sh <task_name> <task_config> <expert_data_num>
# Example:
bash robotwin2hdf5.sh beat_block_hammer demo_clean 50
This creates the processed data in the processed_data/<task_name>-<task_config>-<expert_data_num> directory.
4.2 Convert HDF5 to LeRobot Dataset
# Activate the GO-1 environment
conda activate go1
# Optional: Change the LeRobot home directory
export HF_LEROBOT_HOME=/path/to/your/lerobot
bash hdf52lerobot.sh <hdf5_path> <repo_id>
# Example:
bash hdf52lerobot.sh processed_data/beat_block_hammer-demo_clean-50/ beat_block_hammer_repo
The LeRobot dataset will be saved in <HF_LEROBOT_HOME>/<repo_id>.
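The two output locations described in this section can be computed up front, for example to check that a conversion step produced something before moving on. This is a sketch, not part of the provided scripts: both function names are hypothetical, and the fallback used when HF_LEROBOT_HOME is unset is an assumption about LeRobot's default cache location.

```python
import os
from pathlib import Path


def processed_dir(task_name: str, task_config: str, expert_data_num: int) -> Path:
    """Directory written by robotwin2hdf5.sh, following the naming pattern above."""
    return Path("processed_data") / f"{task_name}-{task_config}-{expert_data_num}"


def lerobot_dataset_dir(repo_id: str) -> Path:
    """Expected dataset location: <HF_LEROBOT_HOME>/<repo_id>."""
    # Assumption: this fallback mirrors LeRobot's usual default cache directory.
    default_home = Path.home() / ".cache" / "huggingface" / "lerobot"
    home = Path(os.environ.get("HF_LEROBOT_HOME", default_home)).expanduser()
    return home / repo_id


if __name__ == "__main__":
    print(processed_dir("beat_block_hammer", "demo_clean", 50))
    print(lerobot_dataset_dir("beat_block_hammer_repo"))
```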
5. Model Fine-tuning
Refer to the GO-1 repo for detailed instructions.
6. Evaluation
6.1 Start GO-1 Server
Start the GO-1 inference server using your fine-tuned model checkpoint and data statistics:
cd /path/to/AgiBot-World
conda activate go1
python evaluate/deploy.py --model_path /path/to/your/checkpoint --data_stats_path /path/to/your/dataset_stats.json --port <SERVER_PORT>
The server will listen on port SERVER_PORT and wait for observations.
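Loading a checkpoint can take a while, so it may be useful to wait until the port accepts connections before launching the client. The sketch below uses only the Python standard library. It assumes the server tolerates a probe connection that is opened and closed immediately, which this README does not confirm. The helper name `wait_for_port` is hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0,
                  interval_s: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            time.sleep(interval_s)
    return False


# Example (not executed here): wait_for_port("127.0.0.1", 9000, timeout_s=300)
```

A successful probe does not guarantee that the model has finished loading; it only indicates that the port is open.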
6.2 Start RoboTwin Client
The client requires a separate terminal session. We strongly recommend running it inside tmux or screen, as evaluation can take several hours to complete.
First, configure the client in deploy_policy.yml:
- host: Server IP address (default: 127.0.0.1)
- port: Server port (default: 9000)
Then use the provided script to evaluate your model:
conda activate RoboTwin
bash eval.sh <task_name> <task_config> <ckpt_setting> <seed> <gpu_id>
# Example:
bash eval.sh beat_block_hammer demo_clean go1_demo 0 0
Arguments:
- task_name: Name of the task (e.g., beat_block_hammer)
- task_config: Task configuration (e.g., demo_randomized, demo_clean)
- ckpt_setting: Checkpoint setting name (default: go1_demo)
- seed: Random seed (default: 0)
- gpu_id: GPU ID to use (default: 0)
Alternatively, you can set these values in deploy_policy.yml.
The evaluation results, including videos and metrics, will be saved in the eval_result/<task_name>/GO1/<task_config>/<ckpt_setting> directory under the project root.
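When evaluating under both task configurations, the command line and the matching result directory can be generated together. This is a sketch built from the argument order and result path described above. The helper names are hypothetical, and nothing here invokes eval.sh; it only prints what you would run.

```python
from pathlib import Path


def eval_command(task_name: str, task_config: str, ckpt_setting: str = "go1_demo",
                 seed: int = 0, gpu_id: int = 0) -> list:
    """Argument list for eval.sh, in the order documented above."""
    return ["bash", "eval.sh", task_name, task_config, ckpt_setting,
            str(seed), str(gpu_id)]


def result_dir(task_name: str, task_config: str, ckpt_setting: str = "go1_demo",
               project_root: str = ".") -> Path:
    """Directory where results are expected, relative to the project root."""
    return (Path(project_root) / "eval_result" / task_name / "GO1"
            / task_config / ckpt_setting)


if __name__ == "__main__":
    for config in ("demo_clean", "demo_randomized"):
        print(" ".join(eval_command("beat_block_hammer", config)))
        print("  results ->", result_dir("beat_block_hammer", config))
```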
7. Evaluation Results
Following the setup in RoboTwin2.0 Benchmark, we report the performance of the GO-1 Air model and other baselines in the table below. All models are trained on the Aloha-AgileX embodiment using 50 demo_clean demonstrations for 3 selected tasks (grab_roller, handover_mic, lift_pot), and each is evaluated 100 times under the demo_clean (Easy) and demo_randomized (Hard) settings. Our models are fine-tuned for 10k steps.
Policy | Grab Roller (Easy) | Grab Roller (Hard) | Lift Pot (Easy) | Lift Pot (Hard) | Average |
---|---|---|---|---|---|
DP | 98% | 0% | 39% | 0% | 34.25% |
ACT | 94% | 25% | 88% | 0% | 51.25% |
RDT | 74% | 43% | 72% | 9% | 49.5% |
Pi0 | 96% | 80% | 84% | 36% | 74% |
GO-1 Air | 86% | 94% | 94% | 33% | 76.75% |
GO-1 | 96% | 96% | 94% | 35% | 80.25% |
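The Average column appears to be the unweighted mean of the four success rates shown in each row. This is an assumption inferred from the numbers rather than something the table states. The sketch below reproduces the listed averages for three rows under that assumption.

```python
def mean_success(rates_percent: list) -> float:
    """Unweighted mean of per-setting success rates, in percent."""
    return sum(rates_percent) / len(rates_percent)


# Order per row: Grab Roller Easy, Grab Roller Hard, Lift Pot Easy, Lift Pot Hard.
rows = {
    "DP": [98, 0, 39, 0],
    "GO-1 Air": [86, 94, 94, 33],
    "GO-1": [96, 96, 94, 35],
}

if __name__ == "__main__":
    for policy, rates in rows.items():
        print(f"{policy}: {mean_success(rates):.2f}%")
```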