optimization-convex-concave

Optimization, RAIN, rigorous codebase

Description

RAIN Convex-Concave

Research Question

Can you improve gradient-norm convergence on the exact convex-concave benchmark instances used by the official RAIN repository for src/bilinear_func/exp_gnorm.m and src/delta_func/exp_gnorm.m?

What You Can Modify

Edit only the scaffold file RAIN/optimization_convex_concave/custom_strategy.py inside the editable block containing:

  1. init_state(problem, initial_z, seed, hyperparameters)
  2. step(state, oracle, problem, hyperparameters, max_sfo_calls)
  3. get_hyperparameters(problem_name, sigma)

The benchmark harness, problem definitions, update-noise model, official iteration counts, initializations, and metric computation are fixed.

Fixed Setup

  • Problems:
    • bilinear: the official scalar bilinear problem f(x, y) = x y with n = 900, tau = 0.1, z0 = [10, 10]^T, sigma = 0.001
    • delta_nu: the official (delta, nu) problem with d = 100, delta = 1e-2, nu = 5e-5, n = 6000, tau = 1, sigma = 0.02, and z0 ~ N(0, I) under the script's fixed RNG seed
  • The harness mirrors the official scripts' additive Gaussian update noise, not the earlier generalized SFO sweep variant
  • Evaluation uses the official per-problem iteration counts and the same gradient-norm quantities plotted by the scripts
  • Main metric: final_gradient_norm, the mean of the two official final gradient norms
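
The bilinear setup and the main metric can be illustrated with a minimal sketch. It assumes the gradient norm is the Euclidean norm of the saddle-point operator F(z) = (df/dx, -df/dy), which is one common convention and may not be exactly what the official scripts plot. The helper names `bilinear_operator` and `final_gradient_norm` are hypothetical and not part of the harness.

```python
import numpy as np


def bilinear_operator(z):
    """Saddle-point operator F(z) = (df/dx, -df/dy) for f(x, y) = x * y.

    Assumption: this is the quantity whose norm is tracked.
    """
    x, y = z
    return np.array([y, -x], dtype=float)


# Starting point of the bilinear problem, as listed in the fixed setup.
z0 = np.array([10.0, 10.0])
grad_norm_at_start = float(np.linalg.norm(bilinear_operator(z0)))


def final_gradient_norm(bilinear_norm, delta_nu_norm):
    """Main metric: mean of the two per-problem final gradient norms."""
    return 0.5 * (bilinear_norm + delta_nu_norm)


# Per-problem values taken from the default-noise 'rain' baseline row.
example_metric = final_gradient_norm(0.021, 0.023)
```

For f(x, y) = x y the operator norm equals the norm of z itself, so driving the gradient norm down on this problem is the same as approaching the saddle point at the origin.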

Interface Notes

  • init_state(...) must preserve the provided starting point in state["z"]
  • step(...) should implement one official-style iteration of the chosen method
  • The oracle exposes deterministic gradients and fixed-scale Gaussian update noise so the update equations can match the MATLAB scripts directly
  • get_hyperparameters(...) should return the per-problem constants used by the method
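
As a rough illustration of how the three functions fit together, the sketch below implements a plain extragradient-style step on the bilinear problem. `ToyOracle` and its `gradient` and `noise` methods are hypothetical stand-ins, not the real `StochasticOracle` API, and the real `step` presumably returns a `StepOutput` rather than a bare state dict. Treating `tau` as the step size, and adding the noise to each gradient evaluation, are also assumptions.

```python
import numpy as np


class ToyOracle:
    """Hypothetical stand-in for the harness oracle (not the real API)."""

    def __init__(self, sigma, seed):
        self.sigma = sigma
        self.rng = np.random.default_rng(seed)

    def gradient(self, z):
        # Deterministic operator for f(x, y) = x * y: (df/dx, -df/dy).
        x, y = z
        return np.array([y, -x], dtype=float)

    def noise(self, shape):
        # Fixed-scale additive Gaussian noise.
        return self.sigma * self.rng.standard_normal(shape)


def get_hyperparameters(problem_name, sigma):
    # Assumption: tau acts as the step size; 0.1 is the bilinear value.
    return {"tau": 0.1, "sigma": sigma}


def init_state(problem, initial_z, seed, hyperparameters):
    # The provided starting point must be preserved in state["z"].
    return {"z": np.array(initial_z, dtype=float), "iteration": 0}


def step(state, oracle, problem, hyperparameters, max_sfo_calls):
    # One extragradient-style iteration: extrapolate, then update from z.
    tau = hyperparameters["tau"]
    z = state["z"]
    z_half = z - tau * (oracle.gradient(z) + oracle.noise(z.shape))
    z_next = z - tau * (oracle.gradient(z_half) + oracle.noise(z.shape))
    return {"z": z_next, "iteration": state["iteration"] + 1}
```

A usage example under the same assumptions: `state = init_state(None, [10.0, 10.0], 0, hp)` followed by repeated calls to `step(state, ToyOracle(0.001, 0), None, hp, 2)` shrinks the norm of `state["z"]` slowly toward the saddle point when the noise is small.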

Metrics

  • Lower is better
  • The harness prints:
    • STEP_METRICS problem=... iteration=... gradient_norm=...
    • RUN_METRICS problem=... final_gradient_norm=... auc_log_iteration_log_grad=...
    • FINAL_METRICS final_gradient_norm=...
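
The printed lines follow a tag plus `key=value` layout, so they can be collected with a small parser. This is only a sketch: the helper `parse_metric_line` is hypothetical, the sample line uses illustrative values, and it assumes fields are separated by whitespace and contain no spaces.

```python
import re

# Matches key=value pairs such as final_gradient_norm=0.021.
_PAIR = re.compile(r"(\w+)=(\S+)")


def parse_metric_line(line):
    """Split a harness line into its tag and a dict of parsed fields.

    Numeric values become floats; anything else (e.g. the problem
    name) is kept as a string.
    """
    tag, _, rest = line.strip().partition(" ")
    fields = {}
    for key, raw in _PAIR.findall(rest):
        try:
            fields[key] = float(raw)
        except ValueError:
            fields[key] = raw
    return tag, fields


# Illustrative line in the RUN_METRICS layout (values are made up).
sample = "RUN_METRICS problem=bilinear final_gradient_norm=0.021 auc_log_iteration_log_grad=-0.797"
tag, fields = parse_metric_line(sample)
```

Since lower is better for `final_gradient_norm`, a comparison script could keep the run whose parsed `FINAL_METRICS` value is smallest.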

Read-Only References

  • RAIN/README.md
  • RAIN/src/bilinear_func/exp_gnorm.m
  • RAIN/src/delta_func/exp_gnorm.m

These are the primary references. The task now follows those scripts directly rather than the earlier MLS-Bench-specific generalized variant.

Code

custom_strategy.py
"""Editable strategy scaffold for the optimization-convex-concave MLS-Bench task."""

from __future__ import annotations

from typing import Any

import numpy as np

from fixed_benchmark import (
    ProblemSpec,
    StepOutput,
    StochasticOracle,
    as_vector,
    make_step_output,
    run_cli,

Results

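default-noise

| Model | Type | final gradient norm | score | auc log iteration log grad | bilinear final gradient norm | delta nu final gradient norm |
|---|---|---|---|---|---|---|
| r_seg | baseline | 0.751 | -0.751 | -0.669 | 1.410 | 0.093 |
| rain | baseline | 0.022 | -0.022 | -0.797 | 0.021 | 0.023 |
| seag | baseline | 0.135 | -0.135 | -1.107 | 0.161 | 0.110 |
| seg | baseline | 0.182 | -0.182 | -0.347 | 0.174 | 0.190 |
| anthropic/claude-opus-4.6 | vanilla | 0.022 | -0.022 | -0.796 | 0.021 | 0.023 |
| deepseek-reasoner | vanilla | 0.182 | -0.182 | -0.347 | 0.174 | 0.190 |
| google/gemini-3.1-pro-preview | vanilla | 0.081 | -0.081 | -1.515 | 0.157 | 0.005 |
| openai/gpt-5.4-pro | vanilla | 0.024 | -0.024 | -4.306 | 0.000 | 0.047 |
| qwen3.6-plus:free | vanilla | 0.187 | -0.187 | -0.332 | 0.165 | 0.209 |
| anthropic/claude-opus-4.6 | agent | 0.017 | -0.017 | -1.194 | 0.012 | 0.022 |
| anthropic/claude-opus-4.6 | agent | 0.022 | -0.022 | -0.796 | 0.021 | 0.023 |
| deepseek-reasoner | agent | 0.022 | -0.022 | -0.820 | 0.022 | 0.022 |
| google/gemini-3.1-pro-preview | agent | 0.076 | -0.076 | -0.827 | 0.141 | 0.012 |
| google/gemini-3.1-pro-preview | agent | 0.135 | -0.135 | -1.107 | 0.161 | 0.110 |
| google/gemini-3.1-pro-preview | agent | 0.051 | -0.051 | -1.565 | 0.098 | 0.005 |
| openai/gpt-5.4-pro | agent | 0.001 | -0.001 | -6.093 | 0.000 | 0.001 |
| qwen3.6-plus:free | agent | 0.106 | -0.106 | -0.599 | 0.149 | 0.062 |

low-noise

| Model | Type | final gradient norm | score | auc log iteration log grad | bilinear final gradient norm | delta nu final gradient norm |
|---|---|---|---|---|---|---|
| r_seg | baseline | 0.750 | -0.750 | -0.676 | 1.407 | 0.093 |
| rain | baseline | 0.009 | -0.009 | -1.087 | 0.015 | 0.002 |
| seag | baseline | 0.083 | -0.083 | -1.547 | 0.159 | 0.007 |
| seg | baseline | 0.117 | -0.117 | -0.442 | 0.162 | 0.072 |
| anthropic/claude-opus-4.6 | vanilla | 0.009 | -0.009 | -1.086 | 0.015 | 0.002 |
| deepseek-reasoner | vanilla | 0.117 | -0.117 | -0.442 | 0.162 | 0.072 |
| google/gemini-3.1-pro-preview | vanilla | 0.080 | -0.080 | -1.604 | 0.157 | 0.003 |
| openai/gpt-5.4-pro | vanilla | 0.030 | -0.030 | -4.654 | 0.000 | 0.059 |
| qwen3.6-plus:free | vanilla | 0.132 | -0.132 | -0.418 | 0.164 | 0.099 |
| anthropic/claude-opus-4.6 | agent | 0.005 | -0.005 | -1.624 | 0.008 | 0.002 |
| anthropic/claude-opus-4.6 | agent | 0.009 | -0.009 | -1.086 | 0.015 | 0.002 |
| deepseek-reasoner | agent | 0.009 | -0.009 | -1.114 | 0.016 | 0.002 |
| google/gemini-3.1-pro-preview | agent | 0.072 | -0.072 | -1.089 | 0.141 | 0.002 |
| google/gemini-3.1-pro-preview | agent | 0.083 | -0.083 | -1.547 | 0.159 | 0.007 |
| google/gemini-3.1-pro-preview | agent | 0.050 | -0.050 | -1.668 | 0.097 | 0.002 |
| openai/gpt-5.4-pro | agent | 0.000 | -0.000 | -6.468 | 0.000 | 0.001 |
| qwen3.6-plus:free | agent | 0.100 | -0.100 | -0.596 | 0.137 | 0.062 |

high-noise

| Model | Type | final gradient norm | score | auc log iteration log grad | bilinear final gradient norm | delta nu final gradient norm |
|---|---|---|---|---|---|---|
| r_seg | baseline | 0.775 | -0.775 | -0.516 | 1.421 | 0.129 |
| rain | baseline | 0.081 | -0.081 | -0.320 | 0.047 | 0.114 |
| seag | baseline | 0.381 | -0.381 | -0.638 | 0.179 | 0.583 |
| seg | baseline | 0.582 | -0.582 | 0.076 | 0.228 | 0.937 |
| anthropic/claude-opus-4.6 | vanilla | 0.081 | -0.081 | -0.320 | 0.047 | 0.114 |
| deepseek-reasoner | vanilla | 0.582 | -0.582 | 0.076 | 0.228 | 0.937 |
| google/gemini-3.1-pro-preview | vanilla | 0.089 | -0.089 | -1.149 | 0.159 | 0.020 |
| openai/gpt-5.4-pro | vanilla | 0.042 | -0.042 | -3.934 | 0.000 | 0.084 |
| qwen3.6-plus:free | vanilla | 0.563 | -0.563 | 0.081 | 0.186 | 0.939 |
| anthropic/claude-opus-4.6 | agent | 0.073 | -0.073 | -0.554 | 0.033 | 0.114 |
| anthropic/claude-opus-4.6 | agent | 0.081 | -0.081 | -0.320 | 0.047 | 0.114 |
| deepseek-reasoner | agent | 0.078 | -0.078 | -0.341 | 0.047 | 0.109 |
| google/gemini-3.1-pro-preview | agent | 0.099 | -0.099 | -0.434 | 0.138 | 0.060 |
| google/gemini-3.1-pro-preview | agent | 0.381 | -0.381 | -0.638 | 0.179 | 0.583 |
| google/gemini-3.1-pro-preview | agent | 0.060 | -0.060 | -1.189 | 0.100 | 0.020 |
| openai/gpt-5.4-pro | agent | 0.001 | -0.001 | -5.781 | 0.000 | 0.001 |
| qwen3.6-plus:free | agent | 0.135 | -0.135 | -0.575 | 0.203 | 0.067 |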

Agent Conversations