robo-diffusion-policy
Other | cleandiffuser | rigorous codebase
Description
Robo-Diffusion: Policy Algorithm Design
Objective
Design policy algorithms that outperform the listed baselines at online decision-making and action generation in robot control.
What You Can Modify
- Policy algorithm core logic
- Q-function design (if used)
- Action generation strategy
- Training objective
- Actor-critic architecture
What Is Fixed
- Network architecture (task-specific standard architecture)
- Diffusion model training
- Evaluation environments
Evaluation
Evaluated on three D4RL MuJoCo environments:
- hopper-medium-v2
- walker2d-medium-v2
- halfcheetah-medium-v2
Metrics: normalized_score, episode_reward, training_time
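As a point of reference for the normalized_score metric, D4RL conventionally rescales a raw episode return so that a random policy maps to 0 and an expert policy maps to 100. The sketch below shows that rescaling only. The function name and the reference returns passed in the usage lines are hypothetical placeholders, not values taken from this task; the actual per-environment reference returns come from the D4RL package.

```python
def normalized_score(episode_return, random_score, expert_score):
    """Rescale a raw return so random -> 0 and expert -> 100 (D4RL convention).

    random_score and expert_score are per-environment reference returns.
    They are parameters here because this sketch does not assume their values.
    """
    if expert_score == random_score:
        raise ValueError("expert_score and random_score must differ")
    return 100.0 * (episode_return - random_score) / (expert_score - random_score)


# Hypothetical reference returns, chosen only to make the arithmetic easy to follow.
halfway = normalized_score(2000.0, random_score=0.0, expert_score=4000.0)
at_random = normalized_score(0.0, random_score=0.0, expert_score=4000.0)
```

A score above 100 is possible under this formula when a policy's return exceeds the expert reference.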
Baselines
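dql
Diffusion Q-Learning - Trains a diffusion policy with a behavior-cloning loss plus a Q-value maximization term
idql
Implicit Q-Learning with Diffusion - Learns the critic with implicit Q-learning and draws actions from a diffusion behavior policy
diffusion_policy
Diffusion Policy - Pure behavior cloning with a diffusion model; no Q-function is used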
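To make the difference between the baselines concrete, here is a minimal pure-Python sketch of the pieces that separate them: a DQL-style actor objective, the asymmetric (expectile) loss used by implicit Q-learning, and an IDQL-style selection among candidate actions. All function names, arguments, and default values are hypothetical and not taken from the task codebase. Scalars and lists stand in for tensors, and the softmax weighting in the selection step is a simplification rather than the exact weighting of any published method.

```python
import math
import random


def dql_actor_loss(bc_loss, q_values, eta=1.0):
    """DQL-style actor objective: behavior-cloning loss minus a scaled Q term.

    q_values are Q(s, a) for actions sampled from the current policy.
    The Q term is divided by mean |Q| so that eta is scale-independent.
    """
    mean_abs_q = sum(abs(q) for q in q_values) / len(q_values)
    mean_q = sum(q_values) / len(q_values)
    return bc_loss - eta * mean_q / max(mean_abs_q, 1e-8)


def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2 used by IQL."""
    weight = (1.0 - tau) if diff < 0 else tau
    return weight * diff * diff


def select_action(candidates, q_fn, temperature=None, rng=random):
    """Choose among candidates sampled from a behavior (diffusion) policy.

    With temperature=None the highest-Q candidate is returned; otherwise a
    candidate is sampled with softmax(Q / temperature) weights (assumed form).
    """
    qs = [q_fn(a) for a in candidates]
    if temperature is None:
        best = max(range(len(qs)), key=qs.__getitem__)
        return candidates[best]
    top = max(qs)  # subtract the max for numerical stability
    weights = [math.exp((q - top) / temperature) for q in qs]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Toy usage: a 1-D "action" and a Q-function peaked at 0.5.
toy_q = lambda a: -(a - 0.5) ** 2
greedy = select_action([0.1, 0.5, 0.9], toy_q)
```

Under this framing, pure behavior cloning corresponds to using bc_loss alone (eta = 0) and sampling a single action with no Q-based selection.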
Code
Results
No results available yet.