optimization-convex-concave
Tags: Optimization, RAIN, rigorous codebase
Description
RAIN Convex-Concave
Research Question
Can you improve gradient-norm convergence on the exact convex-concave benchmark instances used by the official RAIN repository's src/bilinear_func/exp_gnorm.m and src/delta_func/exp_gnorm.m scripts?
What You Can Modify
Edit only the scaffold file RAIN/optimization_convex_concave/custom_strategy.py inside the editable block containing:
- `init_state(problem, initial_z, seed, hyperparameters)`
- `step(state, oracle, problem, hyperparameters, max_sfo_calls)`
- `get_hyperparameters(problem_name, sigma)`
The benchmark harness, problem definitions, update-noise model, official iteration counts, initializations, and metric computation are fixed.
Fixed Setup
- Problems:
  - `bilinear`: the official scalar bilinear problem `f(x, y) = x y` with `n = 900`, `tau = 0.1`, `z0 = [10, 10]^T`, `sigma = 0.001`
  - `delta_nu`: the official `(delta, nu)` problem with `d = 100`, `delta = 1e-2`, `nu = 5e-5`, `n = 6000`, `tau = 1`, `sigma = 0.02`, and `z0 ~ N(0, I)` under the script's fixed RNG seed
- The harness mirrors the official scripts' additive Gaussian update noise, not the earlier generalized SFO sweep variant
- Evaluation uses the official per-problem iteration counts and the same gradient-norm quantities plotted by the scripts
- Main metric: `final_gradient_norm`, the mean of the two problems' official final gradient norms
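For intuition, the bilinear instance can be reproduced in a few lines: for `f(x, y) = x y` the saddle-point operator is `G(z) = (y, -x)`, and the reported quantity is the norm of `G` at the final iterate. Below is a minimal, noise-free extragradient sketch using the official bilinear constants from the setup above; it is an illustration of the dynamics, not the fixed harness (which adds Gaussian update noise with `sigma = 0.001`):

```python
import numpy as np

def bilinear_operator(z):
    """Saddle operator for f(x, y) = x*y: G(z) = (df/dx, -df/dy) = (y, -x)."""
    x, y = z
    return np.array([y, -x])

# Official bilinear constants from the task description.
tau, n = 0.1, 900
z = np.array([10.0, 10.0])  # z0 = [10, 10]^T

for _ in range(n):
    # Deterministic extragradient: extrapolate, then update from the midpoint.
    z_half = z - tau * bilinear_operator(z)
    z = z - tau * bilinear_operator(z_half)

print(np.linalg.norm(bilinear_operator(z)))  # ≈ 0.16 after n iterations
```

On the bilinear problem each extragradient step contracts `||z||^2` by the factor `(1 - tau^2)^2 + tau^2 = 0.9901`, so `||G(z)|| = ||z||` decays geometrically, which is consistent with the extragradient-style baselines in the results table below.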
Interface Notes
- `init_state(...)` must preserve the provided starting point in `state["z"]`
- `step(...)` should implement one official-style iteration of the chosen method
- The oracle exposes deterministic gradients and fixed-scale Gaussian update noise, so the update equations can match the MATLAB scripts directly
- `get_hyperparameters(...)` should return the per-problem constants used by the method
Metrics
- Lower is better
- The harness prints:
  - `STEP_METRICS problem=... iteration=... gradient_norm=...`
  - `RUN_METRICS problem=... final_gradient_norm=... auc_log_iteration_log_grad=...`
  - `FINAL_METRICS final_gradient_norm=...`
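Since each printed line is a tag followed by `key=value` pairs, a run log can be post-processed with a short parser. The line shapes follow the bullets above; the example values are purely illustrative (taken from the `rain` baseline row in the results table):

```python
import re

log = """STEP_METRICS problem=bilinear iteration=900 gradient_norm=0.021
RUN_METRICS problem=bilinear final_gradient_norm=0.021 auc_log_iteration_log_grad=-0.797
FINAL_METRICS final_gradient_norm=0.022"""

pattern = re.compile(r"^(\w+) (.*)$")  # metric tag, then key=value pairs
metrics = {}
for line in log.splitlines():
    tag, rest = pattern.match(line).groups()
    fields = dict(kv.split("=") for kv in rest.split())
    metrics.setdefault(tag, []).append(fields)

print(metrics["FINAL_METRICS"][0]["final_gradient_norm"])  # "0.022"
```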
Read-Only References
- `RAIN/README.md`
- `RAIN/src/bilinear_func/exp_gnorm.m`
- `RAIN/src/delta_func/exp_gnorm.m`
These are the primary references. The task now follows those scripts directly rather than the earlier MLS-Bench-specific generalized variant.
Code
custom_strategy.py
```python
"""Editable strategy scaffold for the optimization-convex-concave MLS-Bench task."""

from __future__ import annotations

from typing import Any

import numpy as np

from fixed_benchmark import (
    ProblemSpec,
    StepOutput,
    StochasticOracle,
    as_vector,
    make_step_output,
    run_cli,
```
Results
| Model | Type | final gradient norm default-noise ↓ | score default-noise ↑ | auc log iteration log grad default-noise ↓ | bilinear final gradient norm default-noise ↓ | delta nu final gradient norm default-noise ↓ | final gradient norm low-noise ↓ | score low-noise ↑ | auc log iteration log grad low-noise ↓ | bilinear final gradient norm low-noise ↓ | delta nu final gradient norm low-noise ↓ | final gradient norm high-noise ↓ | score high-noise ↑ | auc log iteration log grad high-noise ↓ | bilinear final gradient norm high-noise ↓ | delta nu final gradient norm high-noise ↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| r_seg | baseline | 0.751 | -0.751 | -0.669 | 1.410 | 0.093 | 0.750 | -0.750 | -0.676 | 1.407 | 0.093 | 0.775 | -0.775 | -0.516 | 1.421 | 0.129 |
| rain | baseline | 0.022 | -0.022 | -0.797 | 0.021 | 0.023 | 0.009 | -0.009 | -1.087 | 0.015 | 0.002 | 0.081 | -0.081 | -0.320 | 0.047 | 0.114 |
| seag | baseline | 0.135 | -0.135 | -1.107 | 0.161 | 0.110 | 0.083 | -0.083 | -1.547 | 0.159 | 0.007 | 0.381 | -0.381 | -0.638 | 0.179 | 0.583 |
| seg | baseline | 0.182 | -0.182 | -0.347 | 0.174 | 0.190 | 0.117 | -0.117 | -0.442 | 0.162 | 0.072 | 0.582 | -0.582 | 0.076 | 0.228 | 0.937 |
| anthropic/claude-opus-4.6 | vanilla | 0.022 | -0.022 | -0.796 | 0.021 | 0.023 | 0.009 | -0.009 | -1.086 | 0.015 | 0.002 | 0.081 | -0.081 | -0.320 | 0.047 | 0.114 |
| deepseek-reasoner | vanilla | 0.182 | -0.182 | -0.347 | 0.174 | 0.190 | 0.117 | -0.117 | -0.442 | 0.162 | 0.072 | 0.582 | -0.582 | 0.076 | 0.228 | 0.937 |
| google/gemini-3.1-pro-preview | vanilla | 0.081 | -0.081 | -1.515 | 0.157 | 0.005 | 0.080 | -0.080 | -1.604 | 0.157 | 0.003 | 0.089 | -0.089 | -1.149 | 0.159 | 0.020 |
| openai/gpt-5.4-pro | vanilla | 0.024 | -0.024 | -4.306 | 0.000 | 0.047 | 0.030 | -0.030 | -4.654 | 0.000 | 0.059 | 0.042 | -0.042 | -3.934 | 0.000 | 0.084 |
| qwen3.6-plus:free | vanilla | 0.187 | -0.187 | -0.332 | 0.165 | 0.209 | 0.132 | -0.132 | -0.418 | 0.164 | 0.099 | 0.563 | -0.563 | 0.081 | 0.186 | 0.939 |
| anthropic/claude-opus-4.6 | agent | 0.017 | -0.017 | -1.194 | 0.012 | 0.022 | 0.005 | -0.005 | -1.624 | 0.008 | 0.002 | 0.073 | -0.073 | -0.554 | 0.033 | 0.114 |
| anthropic/claude-opus-4.6 | agent | 0.022 | -0.022 | -0.796 | 0.021 | 0.023 | 0.009 | -0.009 | -1.086 | 0.015 | 0.002 | 0.081 | -0.081 | -0.320 | 0.047 | 0.114 |
| deepseek-reasoner | agent | 0.022 | -0.022 | -0.820 | 0.022 | 0.022 | 0.009 | -0.009 | -1.114 | 0.016 | 0.002 | 0.078 | -0.078 | -0.341 | 0.047 | 0.109 |
| google/gemini-3.1-pro-preview | agent | 0.076 | -0.076 | -0.827 | 0.141 | 0.012 | 0.072 | -0.072 | -1.089 | 0.141 | 0.002 | 0.099 | -0.099 | -0.434 | 0.138 | 0.060 |
| google/gemini-3.1-pro-preview | agent | 0.135 | -0.135 | -1.107 | 0.161 | 0.110 | 0.083 | -0.083 | -1.547 | 0.159 | 0.007 | 0.381 | -0.381 | -0.638 | 0.179 | 0.583 |
| google/gemini-3.1-pro-preview | agent | 0.051 | -0.051 | -1.565 | 0.098 | 0.005 | 0.050 | -0.050 | -1.668 | 0.097 | 0.002 | 0.060 | -0.060 | -1.189 | 0.100 | 0.020 |
| openai/gpt-5.4-pro | agent | 0.001 | -0.001 | -6.093 | 0.000 | 0.001 | 0.000 | -0.000 | -6.468 | 0.000 | 0.001 | 0.001 | -0.001 | -5.781 | 0.000 | 0.001 |
| qwen3.6-plus:free | agent | 0.106 | -0.106 | -0.599 | 0.149 | 0.062 | 0.100 | -0.100 | -0.596 | 0.137 | 0.062 | 0.135 | -0.135 | -0.575 | 0.203 | 0.067 |