GO-1 Fine-tuning and Evaluation

Contributed by GO-1 Team

This README provides instructions for fine-tuning and evaluating the GO-1 model, covering data generation, data processing, model training, and evaluation.

2. Environment Setup

2.1 Install RoboTwin

Create the conda environment and install the dependencies for RoboTwin according to the RoboTwin docs.

Then install the extra dependencies:

cd policy/GO1

conda activate RoboTwin
pip install -r requirements.txt

2.2 Install GO-1

Follow the instructions in the GO-1 repo to set up a separate conda environment for GO-1.

3. Data Generation

Follow the RoboTwin docs to generate raw data in RoboTwin format.

Your raw data should be organized as follows:

data/
├── task_name/
│   ├── task_config/
│   │   ├── data/
│   │   │   ├── episode0.hdf5
│   │   │   ├── episode1.hdf5
│   │   │   └── ...
│   │   └── instructions/
│   │       ├── episode0.json
│   │       ├── episode1.json
│   │       └── ...
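Before converting, it can be worth checking that every episode has both its HDF5 file and its instruction file. The following is a minimal sketch, assuming the layout shown above; `find_unpaired_episodes` is a hypothetical helper name and is not part of RoboTwin or GO-1.

```python
from pathlib import Path


def find_unpaired_episodes(config_dir):
    """Return episode names that have an .hdf5 or a .json file, but not both.

    config_dir is expected to be data/<task_name>/<task_config>/, i.e. the
    folder that contains the data/ and instructions/ subfolders.
    """
    config_dir = Path(config_dir)
    hdf5_names = {p.stem for p in (config_dir / "data").glob("episode*.hdf5")}
    json_names = {p.stem for p in (config_dir / "instructions").glob("episode*.json")}
    # Symmetric difference: names present on exactly one side.
    return sorted(hdf5_names ^ json_names)
```

An empty list means every episode is paired; any names returned point to episodes that may need to be regenerated.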

4. Data Processing

4.1 Convert RoboTwin Data to HDF5

# Activate the RoboTwin environment
conda activate RoboTwin

bash robotwin2hdf5.sh <task_name> <task_config> <expert_data_num>

# Example:
bash robotwin2hdf5.sh beat_block_hammer demo_clean 50

This will create processed data in the processed_data/<task_name>-<task_config>-<expert_data_num> directory.

4.2 Convert HDF5 to LeRobot Dataset

# Activate the GO-1 environment
conda activate go1

# Optional: Change the LeRobot home directory
export HF_LEROBOT_HOME=/path/to/your/lerobot

bash hdf52lerobot.sh <hdf5_path> <repo_id>

# Example:
bash hdf52lerobot.sh processed_data/beat_block_hammer-demo_clean-50/ beat_block_hammer_repo

The LeRobot dataset will be saved in <HF_LEROBOT_HOME>/<repo_id>.
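To locate the converted dataset from a script, the output path can be resolved the same way. This is a sketch only: `lerobot_dataset_dir` is a hypothetical helper, and the fallback to `~/.cache/huggingface/lerobot` when `HF_LEROBOT_HOME` is unset is an assumption that should be checked against the installed LeRobot version.

```python
import os
from pathlib import Path


def lerobot_dataset_dir(repo_id, env=None):
    """Return <HF_LEROBOT_HOME>/<repo_id> as a Path."""
    env = os.environ if env is None else env
    # Assumption: this default applies when HF_LEROBOT_HOME is not set.
    default_home = Path.home() / ".cache" / "huggingface" / "lerobot"
    home = Path(env.get("HF_LEROBOT_HOME", default_home)).expanduser()
    return home / repo_id
```

For example, with `HF_LEROBOT_HOME=/path/to/your/lerobot`, calling `lerobot_dataset_dir("beat_block_hammer_repo")` resolves to `/path/to/your/lerobot/beat_block_hammer_repo`.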

5. Model Fine-tuning

Refer to the GO-1 repo for detailed instructions.

6. Evaluation

6.1 Start GO-1 Server

Start the GO-1 inference server using your fine-tuned model checkpoint and data statistics:

cd /path/to/AgiBot-World

conda activate go1

python evaluate/deploy.py --model_path /path/to/your/checkpoint --data_stats_path /path/to/your/dataset_stats.json --port <SERVER_PORT>

The server will listen on port <SERVER_PORT> and wait for observations.

6.2 Start RoboTwin Client

The client requires a separate terminal session. We strongly recommend using tmux or screen for this process, as evaluation can take several hours to complete.

First, configure the client in deploy_policy.yml:

host: Server IP address (default: 127.0.0.1)
port: Server port (default: 9000)
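Because the server and client run in separate sessions, it can help to confirm that the server is reachable at the configured host and port before starting a long evaluation. The sketch below only checks that a TCP connection can be opened; `wait_for_server` is a hypothetical helper, not part of GO-1 or RoboTwin, and it assumes the server tolerates a connection that is opened and closed immediately.

```python
import socket
import time


def wait_for_server(host="127.0.0.1", port=9000, timeout_s=60.0, poll_s=1.0):
    """Poll until a TCP connection to host:port succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close a connection; no data is sent.
            with socket.create_connection((host, port), timeout=poll_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_s)
```

The defaults mirror the host and port defaults listed above; pass the values from deploy_policy.yml if they differ.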

Then use the provided script to evaluate your model:

conda activate RoboTwin

bash eval.sh <task_name> <task_config> <ckpt_setting> <seed> <gpu_id>

# Example:
bash eval.sh beat_block_hammer demo_clean go1_demo 0 0

Arguments:

task_name: Name of the task (e.g., beat_block_hammer)
task_config: Task configuration (e.g., demo_randomized, demo_clean)
ckpt_setting: Checkpoint setting name (default: go1_demo)
seed: Random seed (default: 0)
gpu_id: GPU ID to use (default: 0)

Alternatively, you can set these values in deploy_policy.yml.

The evaluation results, including videos and metrics, will be saved in the eval_result/<task_name>/GO1/<task_config>/<ckpt_setting> directory under the project root.
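When evaluating several tasks in a row, the command line and the expected result directory can be built programmatically. This is a minimal sketch based only on the argument order and path pattern given in this section; `eval_command` and `eval_result_dir` are hypothetical helper names.

```python
from pathlib import Path


def eval_command(task_name, task_config, ckpt_setting="go1_demo", seed=0, gpu_id=0):
    """Build the argument list for eval.sh, using the defaults listed in this section."""
    return ["bash", "eval.sh", task_name, task_config, ckpt_setting, str(seed), str(gpu_id)]


def eval_result_dir(task_name, task_config, ckpt_setting="go1_demo", project_root="."):
    """Return eval_result/<task_name>/GO1/<task_config>/<ckpt_setting> under project_root."""
    return Path(project_root) / "eval_result" / task_name / "GO1" / task_config / ckpt_setting
```

The list returned by `eval_command` can be passed to `subprocess.run(..., check=True)` from the directory that contains eval.sh, with the RoboTwin environment active.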

7. Evaluation Results

Following the setup of the RoboTwin 2.0 benchmark, we report the performance of the GO-1 Air model and other baselines in the table below. All models are trained on the Aloha-AgileX embodiment using 50 demo_clean demonstrations for 3 selected tasks (grab_roller, handover_mic, lift_pot), and each is evaluated 100 times under the demo_clean (Easy) and demo_randomized (Hard) settings. Our models are fine-tuned for 10k steps.

Policy     Grab Roller (Easy)   Grab Roller (Hard)   Lift Pot (Easy)   Lift Pot (Hard)   Average
DP         98%                  0%                   39%               0%                34.25%
ACT        94%                  25%                  88%               0%                51.25%
RDT        74%                  43%                  72%               9%                49.5%
Pi0        96%                  80%                  84%               36%               74%
GO-1 Air   86%                  94%                  94%               33%               76.75%
GO-1       96%                  96%                  94%               35%               80.25%