robo-diffusion-policy

Other | cleandiffuser | rigorous codebase

Description

Robo-Diffusion: Policy Algorithm Design

Objective

Design policy algorithms for online decision-making and action generation in robot control that outperform the listed baselines.

What You Can Modify

  • Policy algorithm core logic
  • Q-function design (if used)
  • Action generation strategy
  • Training objective
  • Actor-critic architecture

What Is Fixed

  • Network architecture (task-specific standard architecture)
  • Diffusion model training
  • Evaluation environments

Evaluation

Evaluated on three D4RL MuJoCo environments:

  1. hopper-medium-v2
  2. walker2d-medium-v2
  3. halfcheetah-medium-v2

Metrics: normalized_score, episode_reward, training_time
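
The task description does not define normalized_score. Assuming it follows the usual D4RL convention (0 corresponds to a random policy, 100 to an expert policy), the metric can be sketched as below. The reference returns in the example are hypothetical placeholders, not the actual per-environment D4RL values.

```python
def normalized_score(episode_return, random_return, expert_return):
    """D4RL-style normalization: 0 = random policy, 100 = expert policy."""
    return 100.0 * (episode_return - random_return) / (expert_return - random_return)


# Hypothetical reference returns, for illustration only (assumption).
REF_RANDOM = 0.0
REF_EXPERT = 3000.0

# An episode return halfway between the two references.
score = normalized_score(1500.0, REF_RANDOM, REF_EXPERT)
print(score)
```

Under this convention, scores above 100 are possible when a policy exceeds the expert reference return.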

Baselines

dql

Diffusion Q-Learning - Combines diffusion policy with Q-learning
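
As a rough illustration of how a diffusion policy can be combined with Q-learning, the sketch below adds a Q-guidance term to a denoising (behavior cloning) loss, following the general shape of the published Diffusion Q-Learning objective. The function name, argument names, and the eta weight are assumptions for illustration; this is not taken from the task's codebase, and it uses NumPy arrays in place of a differentiable framework.

```python
import numpy as np


def dql_policy_loss(eps_pred, eps_true, q_values, eta=1.0):
    """Sketch of a combined policy loss (hypothetical helper).

    eps_pred, eps_true: predicted and true noise, shape (batch, action_dim)
    q_values: critic values Q(s, a) for actions sampled from the policy, shape (batch,)
    eta: weight of the Q-guidance term (assumed hyperparameter)
    """
    # Behavior cloning term: mean squared error on the predicted noise.
    bc_loss = np.mean((eps_pred - eps_true) ** 2)
    # Q-guidance term: favor actions with higher Q. Dividing by the mean |Q|
    # keeps the term on a comparable scale across tasks; in a gradient-based
    # implementation this scale would be treated as a constant.
    q_scale = np.mean(np.abs(q_values)) + 1e-8
    q_loss = -eta * np.mean(q_values) / q_scale
    return bc_loss + q_loss


# Toy batch: the noise prediction is off by 1 everywhere, all Q-values are 1.
loss = dql_policy_loss(np.zeros((4, 3)), np.ones((4, 3)), np.ones(4), eta=0.5)
print(loss)
```

The eta weight trades off staying close to the dataset actions against maximizing the critic's value.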

idql

Implicit Q-Learning with Diffusion - Combines an implicit Q-learning critic with a diffusion policy
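
Implicit Q-learning fits its value function with an asymmetric (expectile) regression loss instead of maximizing over actions. A minimal sketch of that loss follows; the function name and the default tau are assumptions for illustration, not the baseline's actual settings.

```python
import numpy as np


def expectile_loss(q_target, v_pred, tau=0.7):
    """Expectile regression loss used to fit V(s) toward Q(s, a).

    With tau > 0.5, errors where Q exceeds V are weighted more heavily,
    so V approaches an upper expectile of Q over dataset actions.
    """
    diff = q_target - v_pred
    # Weight tau where V underestimates Q, (1 - tau) where it overestimates.
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return np.mean(weight * diff ** 2)


q = np.array([2.0, -1.0])
v = np.zeros(2)
print(expectile_loss(q, v, tau=0.7))
```

With tau = 0.5 the loss reduces to half the ordinary mean squared error.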

diffusion_policy

Diffusion Policy - Pure behavior cloning with diffusion
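
Behavior cloning with diffusion typically means training a network to predict the noise that was added to dataset actions. The sketch below shows the forward noising step and the resulting loss under a standard DDPM-style linear noise schedule. The schedule values, step count, and function names are assumptions for illustration; the network itself is replaced by a placeholder prediction.

```python
import numpy as np


def make_alpha_bars(num_steps=10, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal-retention factors for a linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def noise_actions(actions, t, alpha_bars, rng):
    """Sample noisy actions at diffusion steps t; returns (noisy, eps)."""
    eps = rng.standard_normal(actions.shape)
    a_bar = alpha_bars[t][:, None]  # one factor per batch row
    noisy = np.sqrt(a_bar) * actions + np.sqrt(1.0 - a_bar) * eps
    return noisy, eps


def bc_denoising_loss(eps_pred, eps_true):
    """Mean squared error between predicted and true noise."""
    return np.mean((eps_pred - eps_true) ** 2)


rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars()
actions = rng.uniform(-1.0, 1.0, size=(4, 3))    # toy dataset actions
t = rng.integers(0, len(alpha_bars), size=4)     # random step per sample
noisy, eps = noise_actions(actions, t, alpha_bars, rng)

# Placeholder for a network output: an all-zero noise prediction.
loss = bc_denoising_loss(np.zeros_like(eps), eps)
```

In an actual policy the noise prediction would come from a network conditioned on the state, the noisy action, and the diffusion step.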

Code

Results

No results available yet.