GO-1 Fine-tuning and Evaluation

Contributed by GO-1 Team

This README provides instructions for fine-tuning and evaluating the GO-1 model, covering data generation, data processing, model training, and evaluation.

1. Table of Contents

GO-1 Fine-tuning and Evaluation
Table of Contents
Environment Setup
- 1. Install RoboTwin
- 2. Install GO-1
Data Generation
Data Processing
- 1. Convert RoboTwin Data to HDF5
- 2. Convert HDF5 to LeRobot Dataset
Model Fine-tuning
Evaluation
- Start GO-1 Server
- Start RoboTwin Client
Evaluation Results

2. Environment Setup

2.1 Install RoboTwin

Create the conda environment and install the dependencies for RoboTwin according to the RoboTwin docs.

Then install the extra dependencies:

cd policy/GO1

conda activate RoboTwin
pip install -r requirements.txt

2.2 Install GO-1

Follow the instructions in the GO-1 repo to set up a separate conda environment for GO-1.

3. Data Generation

Follow the RoboTwin docs to generate raw data in RoboTwin format.

Your raw data should be organized as follows:

data/
├── task_name/
│   ├── task_config/
│   │   ├── data/
│   │   │   ├── episode0.hdf5
│   │   │   ├── episode1.hdf5
│   │   │   └── ...
│   │   └── instructions/
│   │       ├── episode0.json
│   │       ├── episode1.json
│   │       └── ...
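Because each episode in the layout above has both a trajectory file and an instruction file, it can help to confirm the two folders line up before converting. The following is a minimal sketch, not part of RoboTwin or GO-1; the helper name `find_unpaired_episodes` is hypothetical, and it assumes files follow the `episode<N>.hdf5` / `episode<N>.json` naming shown above.

```python
from pathlib import Path


def find_unpaired_episodes(config_dir: Path) -> list[str]:
    """Return episode names that have an .hdf5 file or a .json file, but not both.

    `config_dir` is assumed to be data/<task_name>/<task_config>.
    """
    data_stems = {p.stem for p in (config_dir / "data").glob("episode*.hdf5")}
    instr_stems = {p.stem for p in (config_dir / "instructions").glob("episode*.json")}
    # Symmetric difference: names present in only one of the two folders.
    return sorted(data_stems ^ instr_stems)
```

For example, `find_unpaired_episodes(Path("data/beat_block_hammer/demo_clean"))` returns an empty list when every episode has both files.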

4. Data Processing

4.1 Convert RoboTwin Data to HDF5

# Activate the RoboTwin environment
conda activate RoboTwin

bash robotwin2hdf5.sh <task_name> <task_config> <expert_data_num>

# Example:
bash robotwin2hdf5.sh beat_block_hammer demo_clean 50

This will create processed data in the processed_data/<task_name>-<task_config>-<expert_data_num> directory.
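When scripting the pipeline, the output directory name can be derived from the same three arguments. This is a small sketch of the naming rule stated above; the function `processed_dir` and its `root` parameter are illustrative and not part of the provided scripts.

```python
from pathlib import Path


def processed_dir(task_name: str, task_config: str, expert_data_num: int,
                  root: Path = Path("processed_data")) -> Path:
    """Build processed_data/<task_name>-<task_config>-<expert_data_num>."""
    return root / f"{task_name}-{task_config}-{expert_data_num}"
```

With the example arguments, `processed_dir("beat_block_hammer", "demo_clean", 50)` gives `processed_data/beat_block_hammer-demo_clean-50`, which is the path passed to the next conversion step.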

4.2 Convert HDF5 to LeRobot Dataset

# Activate the GO-1 environment
conda activate go1

# Optional: Change the LeRobot home directory
export HF_LEROBOT_HOME=/path/to/your/lerobot

bash hdf52lerobot.sh <hdf5_path> <repo_id>

# Example:
bash hdf52lerobot.sh processed_data/beat_block_hammer-demo_clean-50/ beat_block_hammer_repo

The LeRobot dataset will be saved in <HF_LEROBOT_HOME>/<repo_id>.
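To locate the converted dataset from a script, the same rule can be applied to the environment variable. The sketch below is an illustration only: `lerobot_dataset_dir` is a hypothetical helper, and it deliberately raises when `HF_LEROBOT_HOME` is unset rather than guessing a default location.

```python
import os
from pathlib import Path


def lerobot_dataset_dir(repo_id: str, env=os.environ) -> Path:
    """Return <HF_LEROBOT_HOME>/<repo_id>, reading the variable from `env`."""
    home = env.get("HF_LEROBOT_HOME")
    if not home:
        # No default is assumed here; set the variable explicitly before use.
        raise RuntimeError("HF_LEROBOT_HOME is not set")
    return Path(home).expanduser() / repo_id
```

Passing a plain dictionary as `env` makes the helper easy to check without touching the real environment.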

5. Model Fine-tuning

Refer to the GO-1 repo for detailed instructions.

6. Evaluation

6.1 Start GO-1 Server

Start the GO-1 inference server using your fine-tuned model checkpoint and data statistics:

cd /path/to/AgiBot-World

conda activate go1

python evaluate/deploy.py --model_path /path/to/your/checkpoint --data_stats_path /path/to/your/dataset_stats.json --port <SERVER_PORT>

The server will listen on port SERVER_PORT and wait for observations.
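Loading a checkpoint can take a while, so it may be useful to confirm the server is accepting connections before launching the client. The following is a rough sketch using only the Python standard library; `wait_for_port` is a hypothetical helper, and it assumes the server accepts plain TCP connections on the chosen port.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0,
                  interval_s: float = 1.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close a connection; no data is sent.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For instance, `wait_for_port("127.0.0.1", 9000)` returns `True` once something is listening on that port, and `False` if nothing answers within a minute.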

6.2 Start RoboTwin Client

The client requires a separate terminal session. We strongly recommend using tmux or screen for this process, as evaluation can take several hours to complete.

First, configure the client in deploy_policy.yml:

host: Server IP address (default: 127.0.0.1)
port: Server port (default: 9000)

Then use the provided script to evaluate your model:

conda activate RoboTwin

bash eval.sh <task_name> <task_config> <ckpt_setting> <seed> <gpu_id>

# Example:
bash eval.sh beat_block_hammer demo_clean go1_demo 0 0

Arguments:
- task_name: Name of the task (e.g., beat_block_hammer)
- task_config: Task configuration (e.g., demo_randomized, demo_clean)
- ckpt_setting: Checkpoint setting name (default: go1_demo)
- seed: Random seed (default: 0)
- gpu_id: GPU ID to use (default: 0)

Alternatively, you can set these values in deploy_policy.yml.

The evaluation results, including videos and metrics, will be saved in the eval_result/<task_name>/GO1/<task_config>/<ckpt_setting> directory under the project root.
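When evaluating several tasks or settings in a row, the command line and the expected result directory can be generated from the same arguments. This is a minimal sketch under the argument order and output path described above; the helper names `eval_command` and `eval_result_dir` are hypothetical and not part of the repository.

```python
from pathlib import Path


def eval_command(task_name: str, task_config: str, ckpt_setting: str = "go1_demo",
                 seed: int = 0, gpu_id: int = 0) -> list[str]:
    """Build the argument list for: bash eval.sh <task_name> <task_config> <ckpt_setting> <seed> <gpu_id>."""
    return ["bash", "eval.sh", task_name, task_config, ckpt_setting, str(seed), str(gpu_id)]


def eval_result_dir(task_name: str, task_config: str, ckpt_setting: str = "go1_demo",
                    root: Path = Path("eval_result")) -> Path:
    """Build eval_result/<task_name>/GO1/<task_config>/<ckpt_setting>."""
    return root / task_name / "GO1" / task_config / ckpt_setting
```

Each command list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains eval.sh, with the RoboTwin environment active, to run the evaluations one after another.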

7. Evaluation Results

Following the setup of the RoboTwin2.0 Benchmark, we report the performance of the GO-1 Air model and other baselines in the table below. All models are trained on the Aloha-AgileX embodiment using 50 demo_clean demonstrations for 3 selected tasks (grab_roller, handover_mic, lift_pot), and each is evaluated 100 times under the demo_clean (Easy) and demo_randomized (Hard) settings. Our models are fine-tuned for 10k steps.

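| Policy | Grab Roller (Easy) | Grab Roller (Hard) | Lift Pot (Easy) | Lift Pot (Hard) | Average |
|---|---|---|---|---|---|
| DP | 98% | 0% | 39% | 0% | 34.25% |
| ACT | 94% | 25% | 88% | 0% | 51.25% |
| RDT | 74% | 43% | 72% | 9% | 49.5% |
| Pi0 | 96% | 80% | 84% | 36% | 74% |
| GO-1 Air | 86% | 94% | 94% | 33% | 76.75% |
| GO-1 | 96% | 96% | 94% | 35% | 80.25% |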
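The Average column appears to be the unweighted mean of the four success rates listed in a row. That reading is an assumption rather than something the table states; the short sketch below checks it against the DP and GO-1 rows only.

```python
def mean_success(rates):
    """Unweighted mean of a list of success rates given in percent."""
    return sum(rates) / len(rates)


# Success rates copied from the table: Grab Roller Easy/Hard, Lift Pot Easy/Hard.
dp_average = mean_success([98, 0, 39, 0])      # 34.25, matching the DP row
go1_average = mean_success([96, 96, 94, 35])   # 80.25, matching the GO-1 row
```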