optimization-convex-concave
Tags: Optimization, RAIN, rigorous codebase
Description
RAIN Convex-Concave
Research Question
Can you improve gradient-norm convergence on the exact convex-concave benchmark instances used by the official RAIN repository's src/bilinear_func/exp_gnorm.m and src/delta_func/exp_gnorm.m scripts?
What You Can Modify
Edit only the scaffold file RAIN/optimization_convex_concave/custom_strategy.py inside the editable block containing:
- `init_state(problem, initial_z, seed, hyperparameters)`
- `step(state, oracle, problem, hyperparameters, max_sfo_calls)`
- `get_hyperparameters(problem_name, sigma)`
The benchmark harness, problem definitions, update-noise model, official iteration counts, initializations, and metric computation are fixed.
Fixed Setup
- Problems:
  - `bilinear`: the official scalar bilinear problem `f(x, y) = x y` with `n = 900`, `tau = 0.1`, `z0 = [10, 10]^T`, `sigma = 0.001`
  - `delta_nu`: the official `(delta, nu)` problem with `d = 100`, `delta = 1e-2`, `nu = 5e-5`, `n = 6000`, `tau = 1`, `sigma = 0.02`, and `z0 ~ N(0, I)` under the script's fixed RNG seed
- The harness mirrors the official scripts' additive Gaussian update noise, not the earlier generalized SFO sweep variant
- Evaluation uses the official per-problem iteration counts and the same gradient-norm quantities plotted by the scripts
- Main metric: `final_gradient_norm`, the mean of the two problems' official final gradient norms
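For intuition, the bilinear instance can be reproduced in a few lines: for `f(x, y) = x y` the saddle-point operator is `G(z) = (y, -x)`, and the reported quantity is the norm of `G` at the final iterate. Below is a minimal, noise-free extragradient sketch using the official bilinear constants from the setup above; it is an illustration of the dynamics, not the fixed harness (which adds Gaussian update noise with `sigma = 0.001`):

```python
import numpy as np

def bilinear_operator(z):
    """Saddle operator for f(x, y) = x*y: G(z) = (df/dx, -df/dy) = (y, -x)."""
    x, y = z
    return np.array([y, -x])

# Official bilinear constants from the task description.
tau, n = 0.1, 900
z = np.array([10.0, 10.0])  # z0 = [10, 10]^T

for _ in range(n):
    # Deterministic extragradient: extrapolate, then update from the midpoint.
    z_half = z - tau * bilinear_operator(z)
    z = z - tau * bilinear_operator(z_half)

print(np.linalg.norm(bilinear_operator(z)))  # ≈ 0.16 after n iterations
```

On the bilinear problem each extragradient step contracts `||z||^2` by the factor `(1 - tau^2)^2 + tau^2 = 0.9901`, so `||G(z)|| = ||z||` decays geometrically, which is consistent with the extragradient-style baselines in the results table below.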
Interface Notes
- `init_state(...)` must preserve the provided starting point in `state["z"]`
- `step(...)` should implement one official-style iteration of the chosen method
- The oracle exposes deterministic gradients and fixed-scale Gaussian update noise, so the update equations can match the MATLAB scripts directly
- `get_hyperparameters(...)` should return the per-problem constants used by the method
Metrics
- Lower is better
- The harness prints:
  - `STEP_METRICS problem=... iteration=... gradient_norm=...`
  - `RUN_METRICS problem=... final_gradient_norm=... auc_log_iteration_log_grad=...`
  - `FINAL_METRICS final_gradient_norm=...`
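Since each printed line is a tag followed by `key=value` pairs, a run log can be post-processed with a short parser. The line shapes follow the bullets above; the example values are purely illustrative (taken from the `rain` baseline row in the results table):

```python
import re

log = """STEP_METRICS problem=bilinear iteration=900 gradient_norm=0.021
RUN_METRICS problem=bilinear final_gradient_norm=0.021 auc_log_iteration_log_grad=-0.797
FINAL_METRICS final_gradient_norm=0.022"""

pattern = re.compile(r"^(\w+) (.*)$")  # metric tag, then key=value pairs
metrics = {}
for line in log.splitlines():
    tag, rest = pattern.match(line).groups()
    fields = dict(kv.split("=") for kv in rest.split())
    metrics.setdefault(tag, []).append(fields)

print(metrics["FINAL_METRICS"][0]["final_gradient_norm"])  # "0.022"
```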
Read-Only References
- `RAIN/README.md`
- `RAIN/src/bilinear_func/exp_gnorm.m`
- `RAIN/src/delta_func/exp_gnorm.m`
These are the primary references. The task now follows those scripts directly rather than the earlier MLS-Bench-specific generalized variant.
Code
custom_strategy.py
```python
"""Editable strategy scaffold for the optimization-convex-concave MLS-Bench task."""

from __future__ import annotations

from typing import Any

import numpy as np

from fixed_benchmark import (
    ProblemSpec,
    StepOutput,
    StochasticOracle,
    as_vector,
    make_step_output,
    run_cli,
```
Results
| Model | Type | final gradient norm default-noise ↓ | score default-noise ↑ | auc log iteration log grad default-noise ↓ | bilinear final gradient norm default-noise ↓ | delta nu final gradient norm default-noise ↓ | final gradient norm low-noise ↓ | score low-noise ↑ | auc log iteration log grad low-noise ↓ | bilinear final gradient norm low-noise ↓ | delta nu final gradient norm low-noise ↓ | final gradient norm high-noise ↓ | score high-noise ↑ | auc log iteration log grad high-noise ↓ | bilinear final gradient norm high-noise ↓ | delta nu final gradient norm high-noise ↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| r_seg | baseline | 0.751 | -0.751 | -0.669 | 1.410 | 0.093 | 0.750 | -0.750 | -0.676 | 1.407 | 0.093 | 0.775 | -0.775 | -0.516 | 1.421 | 0.129 |
| rain | baseline | 0.022 | -0.022 | -0.797 | 0.021 | 0.023 | 0.009 | -0.009 | -1.087 | 0.015 | 0.002 | 0.081 | -0.081 | -0.320 | 0.047 | 0.114 |
| seag | baseline | 0.135 | -0.135 | -1.107 | 0.161 | 0.110 | 0.083 | -0.083 | -1.547 | 0.159 | 0.007 | 0.381 | -0.381 | -0.638 | 0.179 | 0.583 |
| seg | baseline | 0.182 | -0.182 | -0.347 | 0.174 | 0.190 | 0.117 | -0.117 | -0.442 | 0.162 | 0.072 | 0.582 | -0.582 | 0.076 | 0.228 | 0.937 |
| anthropic/claude-opus-4.6 | vanilla | 0.022 | -0.022 | -0.796 | 0.021 | 0.023 | 0.009 | -0.009 | -1.086 | 0.015 | 0.002 | 0.081 | -0.081 | -0.320 | 0.047 | 0.114 |
| deepseek-reasoner | vanilla | 0.182 | -0.182 | -0.347 | 0.174 | 0.190 | 0.117 | -0.117 | -0.442 | 0.162 | 0.072 | 0.582 | -0.582 | 0.076 | 0.228 | 0.937 |
| google/gemini-3.1-pro-preview | vanilla | 0.081 | -0.081 | -1.515 | 0.157 | 0.005 | 0.080 | -0.080 | -1.604 | 0.157 | 0.003 | 0.089 | -0.089 | -1.149 | 0.159 | 0.020 |
| openai/gpt-5.4-pro | vanilla | 0.024 | -0.024 | -4.306 | 0.000 | 0.047 | 0.030 | -0.030 | -4.654 | 0.000 | 0.059 | 0.042 | -0.042 | -3.934 | 0.000 | 0.084 |
| qwen3.6-plus:free | vanilla | 0.187 | -0.187 | -0.332 | 0.165 | 0.209 | 0.132 | -0.132 | -0.418 | 0.164 | 0.099 | 0.563 | -0.563 | 0.081 | 0.186 | 0.939 |
| anthropic/claude-opus-4.6 | agent | 0.017 | -0.017 | -1.194 | 0.012 | 0.022 | 0.005 | -0.005 | -1.624 | 0.008 | 0.002 | 0.073 | -0.073 | -0.554 | 0.033 | 0.114 |
| anthropic/claude-opus-4.6 | agent | 0.022 | -0.022 | -0.796 | 0.021 | 0.023 | 0.009 | -0.009 | -1.086 | 0.015 | 0.002 | 0.081 | -0.081 | -0.320 | 0.047 | 0.114 |
| deepseek-reasoner | agent | 0.022 | -0.022 | -0.820 | 0.022 | 0.022 | 0.009 | -0.009 | -1.114 | 0.016 | 0.002 | 0.078 | -0.078 | -0.341 | 0.047 | 0.109 |
| google/gemini-3.1-pro-preview | agent | 0.076 | -0.076 | -0.827 | 0.141 | 0.012 | 0.072 | -0.072 | -1.089 | 0.141 | 0.002 | 0.099 | -0.099 | -0.434 | 0.138 | 0.060 |
| google/gemini-3.1-pro-preview | agent | 0.135 | -0.135 | -1.107 | 0.161 | 0.110 | 0.083 | -0.083 | -1.547 | 0.159 | 0.007 | 0.381 | -0.381 | -0.638 | 0.179 | 0.583 |
| google/gemini-3.1-pro-preview | agent | 0.051 | -0.051 | -1.565 | 0.098 | 0.005 | 0.050 | -0.050 | -1.668 | 0.097 | 0.002 | 0.060 | -0.060 | -1.189 | 0.100 | 0.020 |
| openai/gpt-5.4-pro | agent | 0.001 | -0.001 | -6.093 | 0.000 | 0.001 | 0.000 | -0.000 | -6.468 | 0.000 | 0.001 | 0.001 | -0.001 | -5.781 | 0.000 | 0.001 |
| qwen3.6-plus:free | agent | 0.106 | -0.106 | -0.599 | 0.149 | 0.062 | 0.100 | -0.100 | -0.596 | 0.137 | 0.062 | 0.135 | -0.135 | -0.575 | 0.203 | 0.067 |