Agent Conversation: anthropic/claude-opus-4.6
Model: anthropic/claude-opus-4.6
Total steps: 6
Messages: 22
User Step 0
Task: quant-graph-stock
Graph-Based Multi-Stock Prediction on CSI300
Objective
Design and implement a graph-based stock prediction model that leverages inter-stock relationships through a stock-concept graph. Your code goes in custom_model.py. Three reference implementations (HIST, GATs, LightGBM) are provided as read-only.
Evaluation
Signal quality: IC (Information Coefficient), ICIR, and Rank IC. Portfolio backtest (TopkDropout strategy, top 50 holdings, drop 5 per rebalance): Annualized Return, Max Drawdown, Information Ratio. Evaluation runs automatically via qlib's workflow.
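For reference, these signal metrics come from grouping predictions against labels per trading day; a minimal sketch (the `score`/`label` column names and MultiIndex layout mirror qlib conventions but are assumptions here, not the benchmark's exact schema):

```python
import numpy as np
import pandas as pd

def signal_metrics(df):
    """df: MultiIndex (datetime, instrument) with 'score' and 'label' columns.
    IC is the per-day Pearson correlation, Rank IC the per-day Spearman;
    ICIR divides the mean daily IC by its standard deviation."""
    by_day = df.groupby(level=0)
    ic = by_day.apply(lambda g: g["score"].corr(g["label"]))
    ric = by_day.apply(lambda g: g["score"].corr(g["label"], method="spearman"))
    return {
        "IC": ic.mean(),
        "ICIR": ic.mean() / ic.std(),
        "Rank_IC": ric.mean(),
        "Rank_ICIR": ric.mean() / ric.std(),
    }
```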
Workflow Configuration
workflow_config.yaml lines 14-26 and 32-45 are editable. This covers the model plus dataset adapter/preprocessor configuration. Instruments, date ranges, train/valid/test splits, and evaluation settings are fixed.
qlib/custom_model.py [EDITABLE — lines 58–156 only]
1: # Custom graph-based stock prediction model for MLS-Bench
2: #
3: # EDITABLE section: CustomModel class with fit() and predict() methods.
4: # FIXED sections: imports and stock-concept graph loading below.
5: import os
6: import numpy as np
7: import pandas as pd
8: import torch
9: import torch.nn as nn
10: import torch.nn.functional as F
11: from qlib.model.base import Model
12: from qlib.data.dataset import DatasetH
13: from qlib.data.dataset.handler import DataHandlerLP
14:
15: DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
16:
17: # =====================================================================
18: # FIXED: Stock-concept graph data loading utilities
19: # =====================================================================
20: # Paths to pre-downloaded graph data
21: STOCK2CONCEPT_PATH = os.path.expanduser("~/.qlib/qlib_data/qlib_csi300_stock2concept.npy")
22: STOCK_INDEX_PATH = os.path.expanduser("~/.qlib/qlib_data/qlib_csi300_stock_index.npy")
23:
24: # Load the stock-concept mapping matrix and stock index
25: # stock2concept_matrix: shape (num_stocks, num_concepts), binary membership
26: # stock_index_dict: dict mapping instrument name -> integer index
27: _stock2concept_matrix = np.load(STOCK2CONCEPT_PATH)
28: _stock_index_dict = np.load(STOCK_INDEX_PATH, allow_pickle=True).item()
29:
30:
31: def get_stock_index(instruments, default_index=733):
32: """Map instrument names to integer indices for stock2concept lookup.
33:
34: Args:
35: instruments: array-like of instrument name strings
36: default_index: fallback index for unknown instruments (733 = padding)
37:
38: Returns:
39: np.ndarray of integer indices
40: """
41: indices = np.array([_stock_index_dict.get(inst, default_index)
42: for inst in instruments])
43: return indices.astype(int)
44:
45:
46: def get_concept_matrix(stock_indices):
47:     """Get the concept membership matrix for gi
Assistant Step 1: edit
Tool Input
1{
2 "op": "replace",
3 "filename": "qlib/custom_model.py",
4 "start_line": 58,
5 "end_line": 156,
6 "content": "import copy\nimport torch.optim as optim\n\nclass GRUEnc(nn.Module):\n def __init__(self, d=6, H=64, L=2, dr=0.0):\n super().__init__()\n self.rnn = nn.GRU(d, H, L, batch_first=True, dropout=dr)\n self.ln = nn.LayerNorm(H); self.d = d\n def forward(self, x):\n x = x.reshape(len(x), self.d, -1).permute(0, 2, 1)\n return self.ln(self.rnn(x)[0][:, -1, :])\n\nclass ConceptRankNet(nn.Module):\n \"\"\"Three-channel decomposition: explicit concepts + implicit clusters + individual.\"\"\"\n def __init__(self, d=6, H=64, L=2, dr=0.0):\n super().__init__()\n self.enc = GRUEnc(d, H, L, dr); self.H = H\n self.Wq = nn.Linear(H,H); self.Wk = nn.Linear(H,H)\n self.Wv = nn.Linear(H,H); self.Wi = nn.Linear(H,H)\n self.fc_e = nn.Linear(H,1); self.fc_i = nn.Linear(H,1)\n self.fc_r = nn.Linear(H,1); self.drop = nn.Dropout(dr)\n for m in [self.Wq,self.Wk,self.Wv,self.Wi]: nn.init.xavier_uniform_(m.weight)\n def forward(self, x, cm):\n h = self.enc(x)\n # Explicit: concept prototypes via membership-weighted avg, then cross-attention\n cn = cm / cm.sum(0, keepdim=True).clamp(min=1)\n protos = cn.t() @ h\n Q,K,V = self.Wq(h), self.Wk(protos), self.Wv(protos)\n att = (Q @ K.t()) / (self.H**0.5)\n att = att.masked_fill(cm==0, float('-inf'))\n hc = (cm.sum(1) > 0); aw = torch.zeros_like(att)\n if hc.any(): aw[hc] = F.softmax(att[hc], dim=1)\n he = self.drop(aw @ V); hr = h - he\n # Implicit: stock-stock cosine attention in residual space\n hi = torch.zeros_like(hr)\n if h.shape[0] > 1:\n hp = F.normalize(self.Wi(hr), dim=1); sim = hp @ hp.t()\n sim.fill_diagonal_(float('-inf'))\n hi = self.drop(F.softmax(sim * 10, dim=1) @ hr)\n hind = hr - hi\n pred = (self.fc_e(he)+self.fc_i(hi)+self.fc_r(hind)).squeeze(-1)\n orth = (he*hi).mean()**2 + (hi*hind).mean()**2\n return pred, orth\n\nclass CustomModel(Model):\n def __init__(self):\n super().__init__()\n self.fitted = False; self.net = ConceptRankNet(d=6,H=64,L=2,dr=0.0).to(DEVICE)\n self.n_ep=50; self.lr=5e-4; self.alpha=0.1; 
self.beta=0.01\n @staticmethod\n def _rl(p, y):\n \"\"\"Differentiable pairwise rank loss approximating negative rank IC.\"\"\"\n dp=p.unsqueeze(1)-p.unsqueeze(0); dy=y.unsqueeze(1)-y.unsqueeze(0)\n return -torch.mean(torch.tanh(dp*5)*torch.sign(dy))\n def _days(self, df):\n dc=df.groupby(level=0).size().values; di=np.roll(np.cumsum(dc),1); di[0]=0\n return di,dc,df[\"feature\"].values,df[\"label\"].values.ravel(),df[\"feature\"].index.get_level_values(\"instrument\")\n def fit(self, dataset: DatasetH):\n dft=dataset.prepare(\"train\",col_set=[\"feature\",\"label\"],data_key=DataHandlerLP.DK_L)\n dfv=dataset.prepare(\"valid\",col_set=[\"feature\",\"label\"],data_key=DataHandlerLP.DK_L)\n opt=optim.Adam(self.net.parameters(),lr=self.lr,weight_decay=1e-5)\n best_ic,best_st,wait=-1e9,None,0\n for ep in range(self.n_ep):\n self.net.train(); di,dc,ft,lb,ins=self._days(dft)\n perm=np.random.permutation(len(dc))\n for idx in perm:\n si,c=int(di[idx]),int(dc[idx]); sl=slice(si,si+c)\n f,l=ft[sl],lb[sl]; mk=~(np.isnan(f).any(1)|np.isnan(l))\n if mk.sum()<5: continue\n it=ins[sl][mk]; cm=torch.tensor(get_concept_matrix(get_stock_index(it)),device=DEVICE)\n x=torch.tensor(f[mk],dtype=torch.float32,device=DEVICE)\n y=torch.tensor(l[mk],dtype=torch.float32,device=DEVICE)\n pred,orth=self.net(x,cm)\n loss=F.mse_loss(pred,y)+self.alpha*self._rl(pred,y)+self.beta*orth\n opt.zero_grad(); loss.backward()\n torch.nn.utils.clip_grad_norm_(self.net.parameters(),1.0); opt.step()\n self.net.eval(); ps,ls=[],[]; di,dc,ft,lb,ins=self._days(dfv)\n with torch.no_grad():\n for si,c in zip(di,dc):\n sl=slice(si,si+c); f,l=ft[sl],lb[sl]; mk=~(np.isnan(f).any(1)|np.isnan(l))\n if mk.sum()<2: continue\n it=ins[sl][mk]; cm=torch.tensor(get_concept_matrix(get_stock_index(it)),device=DEVICE)\n x=torch.tensor(f[mk],dtype=torch.float32,device=DEVICE)\n ps.append(self.net(x,cm)[0].cpu().numpy()); ls.append(l[mk])\n ic=np.corrcoef(np.concatenate(ps),np.concatenate(ls))[0,1]\n if ic>best_ic: 
best_ic,best_st,wait=ic,copy.deepcopy(self.net.state_dict()),0\n else: wait+=1\n if wait>=10: break\n if best_st: self.net.load_state_dict(best_st)\n self.fitted=True\n def predict(self, dataset: DatasetH, segment=\"test\"):\n if not self.fitted: raise ValueError(\"Not fitted\")\n df=dataset.prepare(segment,col_set=[\"feature\",\"label\"],data_key=DataHandlerLP.DK_I)\n self.net.eval(); preds=[]; dc=df.groupby(level=0).size().values\n di=np.roll(np.cumsum(dc),1); di[0]=0\n ft=df[\"feature\"].values; ins=df[\"feature\"].index.get_level_values(\"instrument\")\n with torch.no_grad():\n for si,c in zip(di,dc):\n sl=slice(si,si+c); it=ins[sl]; cm=torch.tensor(get_concept_matrix(get_stock_index(it)),device=DEVICE)\n preds.append(self.net(torch.tensor(np.nan_to_num(ft[sl],nan=0.0),dtype=torch.float32,device=DEVICE),cm)[0].cpu().numpy())\n return pd.Series(np.concatenate(preds),index=df[\"feature\"].index,name=\"score\")"
7}
Tool Result Step 1
OK: Replaced lines 58..156 with 101 line(s) in qlib/custom_model.py. Editable range: 58–158.
[Current file: qlib/custom_model.py | editable: 58–158 | total: 158 lines]
55| return _stock2concept_matrix[stock_indices].astype(np.float32)
56|
57|
58 import copy
59 import torch.optim as optim
60
61 class GRUEnc(nn.Module):
62 def __init__(self, d=6, H=64, L=2, dr=0.0):
63 super().__init__()
64 self.rnn = nn.GRU(d, H, L, batch_first=True, dropout=dr)
65 self.ln = nn.LayerNorm(H); self.d = d
66 def forward(self, x):
67 x = x.reshape(len(x), self.d, -1).permute(0, 2, 1)
68 return self.ln(self.rnn(x)[0][:, -1, :])
69
70 class ConceptRankNet(nn.Module):
71 """Three-channel decomposition: explicit concepts + implicit clusters + individual."""
72 def __init__(self, d=6, H=64, L=2, dr=0.0):
73 super().__init__()
74 self.enc
Assistant Step 2: test
Tool Result Step 2
[Test #1] (2 tests remaining; call submit(n=1) to submit this result)
### csi300 (scripts/csi300.sh)
Signal analysis (csi300):
SIGNAL_METRIC IC=0.043513
SIGNAL_METRIC ICIR=0.296567
SIGNAL_METRIC Rank_IC=0.059742
SIGNAL_METRIC Rank_ICIR=0.418778
Portfolio analysis (csi300):
PORTFOLIO_METRIC annualized_return=0.044671
PORTFOLIO_METRIC max_drawdown=-0.091236
PORTFOLIO_METRIC information_ratio=0.585033
### csi100 (scripts/csi100.sh)
Signal analysis (csi100):
SIGNAL_METRIC IC=0.043679
SIGNAL_METRIC ICIR=0.222914
SIGNAL_METRIC Rank_IC=0.057033
SIGNAL_METRIC Rank_ICIR=0.300553
Portfolio analysis (csi100):
PORTFOLIO_METRIC annualized_return=-0.013817
PORTFOLIO_METRIC max_drawdown=-0.196462
PORTFOLIO_METRIC information_ratio=-0.228593
[Leaderboard] Results saved: {'ic_csi300': 0.043513, 'icir_csi300': 0.296567, 'rank_ic_csi300': 0.059742, 'rank_icir_csi300': 0.418778, 'annualized_return_csi300': 0.044671, 'max_drawdown_csi300': -0.091236, 'information_ratio_csi300': 0.585033, 'elapsed_csi300
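The `_rl` loss in the edit above is a smooth surrogate for (negative) rank correlation: each stock pair contributes according to whether the predicted ordering matches the label ordering. A numpy sketch of the same computation (the model uses the torch version so gradients flow):

```python
import numpy as np

def pairwise_rank_loss(p, y, temp=5.0):
    """Differentiable pairwise rank surrogate: tanh(temp * dp) stands in
    for sign(dp), so concordant pairs push the loss down."""
    dp = p[:, None] - p[None, :]   # pairwise prediction differences
    dy = y[:, None] - y[None, :]   # pairwise label differences
    return -np.mean(np.tanh(temp * dp) * np.sign(dy))
```

With perfectly concordant predictions the loss is negative; a reversed ranking flips its sign.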
Assistant Step 3: edit
Tool Input
1{
2 "op": "replace",
3 "filename": "qlib/custom_model.py",
4 "start_line": 58,
5 "end_line": 158,
6 "content": "import copy\nimport torch.optim as optim\n\nclass GRUEnc(nn.Module):\n def __init__(self, d=6, H=64, L=2, dr=0.0):\n super().__init__()\n self.rnn = nn.GRU(d, H, L, batch_first=True, dropout=dr)\n self.ln = nn.LayerNorm(H); self.d = d\n def forward(self, x):\n x = x.reshape(len(x), self.d, -1).permute(0, 2, 1)\n return self.ln(self.rnn(x)[0][:, -1, :])\n\nclass ConceptRankNet(nn.Module):\n \"\"\"Three-channel decomposition: explicit concepts + implicit clusters + individual.\n Novel: SiLU gating before output heads for smoother gradient flow.\"\"\"\n def __init__(self, d=6, H=64, L=2, dr=0.0):\n super().__init__()\n self.enc = GRUEnc(d, H, L, dr); self.H = H\n self.Wq = nn.Linear(H,H); self.Wk = nn.Linear(H,H)\n self.Wv = nn.Linear(H,H); self.Wi = nn.Linear(H,H)\n self.fc_e = nn.Linear(H,1); self.fc_i = nn.Linear(H,1)\n self.fc_r = nn.Linear(H,1); self.drop = nn.Dropout(dr)\n for m in [self.Wq,self.Wk,self.Wv,self.Wi]: nn.init.xavier_uniform_(m.weight)\n def forward(self, x, cm):\n h = self.enc(x)\n # Explicit: concept prototypes via membership-weighted avg, then cross-attention\n cn = cm / cm.sum(0, keepdim=True).clamp(min=1)\n protos = cn.t() @ h\n Q,K,V = self.Wq(h), self.Wk(protos), self.Wv(protos)\n att = (Q @ K.t()) / (self.H**0.5)\n att = att.masked_fill(cm==0, float('-inf'))\n hc = (cm.sum(1) > 0); aw = torch.zeros_like(att)\n if hc.any(): aw[hc] = F.softmax(att[hc], dim=1)\n he = self.drop(aw @ V); hr = h - he\n # Implicit: stock-stock cosine attention in residual space\n hi = torch.zeros_like(hr)\n if h.shape[0] > 1:\n hp = F.normalize(self.Wi(hr), dim=1); sim = hp @ hp.t()\n sim.fill_diagonal_(float('-inf'))\n hi = self.drop(F.softmax(sim * 5, dim=1) @ hr)\n hind = hr - hi\n pred = (self.fc_e(F.silu(he))+self.fc_i(F.silu(hi))+self.fc_r(F.silu(hind))).squeeze(-1)\n return pred\n\nclass CustomModel(Model):\n \"\"\"Concept-Residual Decomposition with Concept-Conditioned Rank Loss (CCRL).\n Novel: pairwise rank loss weighted by concept 
co-membership so that\n stocks sharing more concepts contribute more to the ranking objective.\"\"\"\n def __init__(self):\n super().__init__()\n self.fitted = False; self.net = ConceptRankNet(d=6,H=64,L=2,dr=0.0).to(DEVICE)\n self.n_ep=50; self.lr=5e-4; self.alpha=0.15\n @staticmethod\n def _ccrl(p, y, cm):\n \"\"\"Concept-Conditioned Rank Loss: pairwise rank weighted by concept co-membership.\"\"\"\n dp = p.unsqueeze(1)-p.unsqueeze(0); dy = y.unsqueeze(1)-y.unsqueeze(0)\n pairwise = torch.tanh(dp*3)*torch.sign(dy)\n w = cm @ cm.t(); w = w / (w.sum(1,keepdim=True)+1e-8)\n return -(pairwise * w).mean()\n def _days(self, df):\n dc=df.groupby(level=0).size().values; di=np.roll(np.cumsum(dc),1); di[0]=0\n return di,dc,df[\"feature\"].values,df[\"label\"].values.ravel(),df[\"feature\"].index.get_level_values(\"instrument\")\n def fit(self, dataset: DatasetH):\n dft=dataset.prepare(\"train\",col_set=[\"feature\",\"label\"],data_key=DataHandlerLP.DK_L)\n dfv=dataset.prepare(\"valid\",col_set=[\"feature\",\"label\"],data_key=DataHandlerLP.DK_L)\n opt=optim.Adam(self.net.parameters(),lr=self.lr,weight_decay=1e-5)\n best_ic,best_st,wait=-1e9,None,0\n for ep in range(self.n_ep):\n self.net.train(); di,dc,ft,lb,ins=self._days(dft)\n perm=np.random.permutation(len(dc))\n for idx in perm:\n si,c=int(di[idx]),int(dc[idx]); sl=slice(si,si+c)\n f,l=ft[sl],lb[sl]; mk=~(np.isnan(f).any(1)|np.isnan(l))\n if mk.sum()<5: continue\n it=ins[sl][mk]; cm=torch.tensor(get_concept_matrix(get_stock_index(it)),device=DEVICE)\n x=torch.tensor(f[mk],dtype=torch.float32,device=DEVICE)\n y=torch.tensor(l[mk],dtype=torch.float32,device=DEVICE)\n pred=self.net(x,cm)\n loss=F.mse_loss(pred,y)+self.alpha*self._ccrl(pred,y,cm)\n opt.zero_grad(); loss.backward()\n torch.nn.utils.clip_grad_norm_(self.net.parameters(),1.0); opt.step()\n self.net.eval(); ps,ls=[],[]; di,dc,ft,lb,ins=self._days(dfv)\n with torch.no_grad():\n for si,c in zip(di,dc):\n sl=slice(si,si+c); f,l=ft[sl],lb[sl]; 
mk=~(np.isnan(f).any(1)|np.isnan(l))\n if mk.sum()<2: continue\n it=ins[sl][mk]; cm=torch.tensor(get_concept_matrix(get_stock_index(it)),device=DEVICE)\n x=torch.tensor(f[mk],dtype=torch.float32,device=DEVICE)\n ps.append(self.net(x,cm).cpu().numpy()); ls.append(l[mk])\n ic=np.corrcoef(np.concatenate(ps),np.concatenate(ls))[0,1]\n if ic>best_ic: best_ic,best_st,wait=ic,copy.deepcopy(self.net.state_dict()),0\n else: wait+=1\n if wait>=10: break\n if best_st: self.net.load_state_dict(best_st)\n self.fitted=True\n def predict(self, dataset: DatasetH, segment=\"test\"):\n if not self.fitted: raise ValueError(\"Not fitted\")\n df=dataset.prepare(segment,col_set=[\"feature\",\"label\"],data_key=DataHandlerLP.DK_I)\n self.net.eval(); preds=[]; dc=df.groupby(level=0).size().values\n di=np.roll(np.cumsum(dc),1); di[0]=0\n ft=df[\"feature\"].values; ins=df[\"feature\"].index.get_level_values(\"instrument\")\n with torch.no_grad():\n for si,c in zip(di,dc):\n sl=slice(si,si+c); it=ins[sl]; cm=torch.tensor(get_concept_matrix(get_stock_index(it)),device=DEVICE)\n preds.append(self.net(torch.tensor(np.nan_to_num(ft[sl],nan=0.0),dtype=torch.float32,device=DEVICE),cm).cpu().numpy())\n return pd.Series(np.concatenate(preds),index=df[\"feature\"].index,name=\"score\")"
7}
Tool Result Step 3
OK: Replaced lines 58..158 with 106 line(s) in qlib/custom_model.py. Editable range: 58–163.
[Current file: qlib/custom_model.py | editable: 58–163 | total: 163 lines]
55| return _stock2concept_matrix[stock_indices].astype(np.float32)
56|
57|
58 import copy
59 import torch.optim as optim
60
61 class GRUEnc(nn.Module):
62 def __init__(self, d=6, H=64, L=2, dr=0.0):
63 super().__init__()
64 self.rnn = nn.GRU(d, H, L, batch_first=True, dropout=dr)
65 self.ln = nn.LayerNorm(H); self.d = d
66 def forward(self, x):
67 x = x.reshape(len(x), self.d, -1).permute(0, 2, 1)
68 return self.ln(self.rnn(x)[0][:, -1, :])
69
70 class ConceptRankNet(nn.Module):
71 """Three-channel decomposition: explicit concepts + implicit clusters + individual.
72 Novel: SiLU gating before output heads for smoother gradient flow."""
73 def __init__(self, d=6,
Assistant Step 4: test
Tool Result Step 4
[Test #2] (1 test remaining; call submit(n=2) to submit this result)
### csi300 (scripts/csi300.sh)
Signal analysis (csi300):
SIGNAL_METRIC IC=0.045688
SIGNAL_METRIC ICIR=0.319126
SIGNAL_METRIC Rank_IC=0.060312
SIGNAL_METRIC Rank_ICIR=0.421226
Portfolio analysis (csi300):
PORTFOLIO_METRIC annualized_return=0.063644
PORTFOLIO_METRIC max_drawdown=-0.075841
PORTFOLIO_METRIC information_ratio=0.850570
### csi100 (scripts/csi100.sh)
Signal analysis (csi100):
SIGNAL_METRIC IC=0.039548
SIGNAL_METRIC ICIR=0.196944
SIGNAL_METRIC Rank_IC=0.054978
SIGNAL_METRIC Rank_ICIR=0.282020
Portfolio analysis (csi100):
PORTFOLIO_METRIC annualized_return=-0.034753
PORTFOLIO_METRIC max_drawdown=-0.197888
PORTFOLIO_METRIC information_ratio=-0.592766
[Leaderboard] Results saved: {'ic_csi300': 0.045688, 'icir_csi300': 0.319126, 'rank_ic_csi300': 0.060312, 'rank_icir_csi300': 0.421226, 'annualized_return_csi300': 0.063644, 'max_drawdown_csi300': -0.075841, 'information_ratio_csi300': 0.85057, 'elapsed_csi300':
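The next edit's CMPRL weighting blends a concept co-membership matrix with a uniform matrix, so pairs of stocks sharing concepts count more in the rank loss while every pair still contributes. A numpy sketch of just the weight construction (the membership matrix below is a toy assumption):

```python
import numpy as np

def cmprl_weights(cm, mix=0.3):
    """cm: (N, C) binary stock-to-concept membership.
    Returns per-pair weights: mix * row-normalized co-membership
    plus (1 - mix) * uniform, as in the model's _rl."""
    w = cm @ cm.T                                   # co-membership counts
    w = w / (w.sum(axis=1, keepdims=True) + 1e-8)   # row-normalize
    n = cm.shape[0]
    uniform = np.full((n, n), 1.0 / n)
    return mix * w + (1 - mix) * uniform
```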
Assistant Step 5: edit
Tool Input
1{
2 "op": "replace",
3 "filename": "qlib/custom_model.py",
4 "start_line": 58,
5 "end_line": 163,
6 "content": "import copy\nimport torch.optim as optim\n\nclass GRUEnc(nn.Module):\n def __init__(self, d=6, H=64, L=2, dr=0.0):\n super().__init__()\n self.rnn = nn.GRU(d, H, L, batch_first=True, dropout=dr)\n self.ln = nn.LayerNorm(H); self.d = d\n def forward(self, x):\n x = x.reshape(len(x), self.d, -1).permute(0, 2, 1)\n return self.ln(self.rnn(x)[0][:, -1, :])\n\nclass ConceptRankNet(nn.Module):\n \"\"\"Three-channel decomposition: explicit concepts + implicit clusters + individual.\"\"\"\n def __init__(self, d=6, H=64, L=2, dr=0.0):\n super().__init__()\n self.enc = GRUEnc(d, H, L, dr); self.H = H\n self.Wq = nn.Linear(H,H); self.Wk = nn.Linear(H,H)\n self.Wv = nn.Linear(H,H); self.Wi = nn.Linear(H,H)\n self.fc_e = nn.Linear(H,1); self.fc_i = nn.Linear(H,1)\n self.fc_r = nn.Linear(H,1); self.drop = nn.Dropout(dr)\n for m in [self.Wq,self.Wk,self.Wv,self.Wi]: nn.init.xavier_uniform_(m.weight)\n def forward(self, x, cm):\n h = self.enc(x)\n # Explicit: concept prototypes via membership-weighted avg, then cross-attention\n cn = cm / cm.sum(0, keepdim=True).clamp(min=1)\n protos = cn.t() @ h\n Q,K,V = self.Wq(h), self.Wk(protos), self.Wv(protos)\n att = (Q @ K.t()) / (self.H**0.5)\n att = att.masked_fill(cm==0, float('-inf'))\n hc = (cm.sum(1) > 0); aw = torch.zeros_like(att)\n if hc.any(): aw[hc] = F.softmax(att[hc], dim=1)\n he = self.drop(aw @ V); hr = h - he\n # Implicit: stock-stock cosine attention in residual space\n hi = torch.zeros_like(hr)\n if h.shape[0] > 1:\n hp = F.normalize(self.Wi(hr), dim=1); sim = hp @ hp.t()\n sim.fill_diagonal_(float('-inf'))\n hi = self.drop(F.softmax(sim * 10, dim=1) @ hr)\n hind = hr - hi\n pred = (self.fc_e(he)+self.fc_i(hi)+self.fc_r(hind)).squeeze(-1)\n orth = (he*hi).mean()**2 + (hi*hind).mean()**2\n return pred, orth\n\nclass CustomModel(Model):\n \"\"\"Concept-Residual Decomposition with Concept-Modulated Pairwise Rank Loss.\n Novel: rank loss pairs weighted by concept co-membership, interpolating\n between global 
and local ranking for robust generalization.\"\"\"\n def __init__(self):\n super().__init__()\n self.fitted = False; self.net = ConceptRankNet(d=6,H=64,L=2,dr=0.0).to(DEVICE)\n self.n_ep=50; self.lr=5e-4; self.alpha=0.1; self.beta=0.01\n @staticmethod\n def _rl(p, y, cm):\n \"\"\"Concept-Modulated Pairwise Rank Loss (CMPRL): 70% global + 30% concept.\"\"\"\n dp=p.unsqueeze(1)-p.unsqueeze(0); dy=y.unsqueeze(1)-y.unsqueeze(0)\n pw=torch.tanh(dp*5)*torch.sign(dy)\n wc=cm@cm.t(); wc=wc/(wc.sum(1,keepdim=True)+1e-8)\n n=p.shape[0]; wu=torch.ones(n,n,device=p.device)/n\n return -(pw*(0.3*wc+0.7*wu)).mean()\n def _days(self, df):\n dc=df.groupby(level=0).size().values; di=np.roll(np.cumsum(dc),1); di[0]=0\n return di,dc,df[\"feature\"].values,df[\"label\"].values.ravel(),df[\"feature\"].index.get_level_values(\"instrument\")\n def fit(self, dataset: DatasetH):\n dft=dataset.prepare(\"train\",col_set=[\"feature\",\"label\"],data_key=DataHandlerLP.DK_L)\n dfv=dataset.prepare(\"valid\",col_set=[\"feature\",\"label\"],data_key=DataHandlerLP.DK_L)\n opt=optim.Adam(self.net.parameters(),lr=self.lr,weight_decay=1e-5)\n best_ic,best_st,wait=-1e9,None,0\n for ep in range(self.n_ep):\n self.net.train(); di,dc,ft,lb,ins=self._days(dft)\n perm=np.random.permutation(len(dc))\n for idx in perm:\n si,c=int(di[idx]),int(dc[idx]); sl=slice(si,si+c)\n f,l=ft[sl],lb[sl]; mk=~(np.isnan(f).any(1)|np.isnan(l))\n if mk.sum()<5: continue\n it=ins[sl][mk]; cm=torch.tensor(get_concept_matrix(get_stock_index(it)),device=DEVICE)\n x=torch.tensor(f[mk],dtype=torch.float32,device=DEVICE)\n y=torch.tensor(l[mk],dtype=torch.float32,device=DEVICE)\n pred,orth=self.net(x,cm)\n loss=F.mse_loss(pred,y)+self.alpha*self._rl(pred,y,cm)+self.beta*orth\n opt.zero_grad(); loss.backward()\n torch.nn.utils.clip_grad_norm_(self.net.parameters(),1.0); opt.step()\n self.net.eval(); ps,ls=[],[]; di,dc,ft,lb,ins=self._days(dfv)\n with torch.no_grad():\n for si,c in zip(di,dc):\n sl=slice(si,si+c); f,l=ft[sl],lb[sl]; 
mk=~(np.isnan(f).any(1)|np.isnan(l))\n if mk.sum()<2: continue\n it=ins[sl][mk]; cm=torch.tensor(get_concept_matrix(get_stock_index(it)),device=DEVICE)\n x=torch.tensor(f[mk],dtype=torch.float32,device=DEVICE)\n ps.append(self.net(x,cm)[0].cpu().numpy()); ls.append(l[mk])\n ic=np.corrcoef(np.concatenate(ps),np.concatenate(ls))[0,1]\n if ic>best_ic: best_ic,best_st,wait=ic,copy.deepcopy(self.net.state_dict()),0\n else: wait+=1\n if wait>=10: break\n if best_st: self.net.load_state_dict(best_st)\n self.fitted=True\n def predict(self, dataset: DatasetH, segment=\"test\"):\n if not self.fitted: raise ValueError(\"Not fitted\")\n df=dataset.prepare(segment,col_set=[\"feature\",\"label\"],data_key=DataHandlerLP.DK_I)\n self.net.eval(); preds=[]; dc=df.groupby(level=0).size().values\n di=np.roll(np.cumsum(dc),1); di[0]=0\n ft=df[\"feature\"].values; ins=df[\"feature\"].index.get_level_values(\"instrument\")\n with torch.no_grad():\n for si,c in zip(di,dc):\n sl=slice(si,si+c); it=ins[sl]; cm=torch.tensor(get_concept_matrix(get_stock_index(it)),device=DEVICE)\n preds.append(self.net(torch.tensor(np.nan_to_num(ft[sl],nan=0.0),dtype=torch.float32,device=DEVICE),cm)[0].cpu().numpy())\n return pd.Series(np.concatenate(preds),index=df[\"feature\"].index,name=\"score\")"
7}
Tool Result Step 5
OK: Replaced lines 58..163 with 107 line(s) in qlib/custom_model.py. Editable range: 58–164.
[Current file: qlib/custom_model.py | editable: 58–164 | total: 164 lines]
55| return _stock2concept_matrix[stock_indices].astype(np.float32)
56|
57|
58 import copy
59 import torch.optim as optim
60
61 class GRUEnc(nn.Module):
62 def __init__(self, d=6, H=64, L=2, dr=0.0):
63 super().__init__()
64 self.rnn = nn.GRU(d, H, L, batch_first=True, dropout=dr)
65 self.ln = nn.LayerNorm(H); self.d = d
66 def forward(self, x):
67 x = x.reshape(len(x), self.d, -1).permute(0, 2, 1)
68 return self.ln(self.rnn(x)[0][:, -1, :])
69
70 class ConceptRankNet(nn.Module):
71 """Three-channel decomposition: explicit concepts + implicit clusters + individual."""
72 def __init__(self, d=6, H=64, L=2, dr=0.0):
73 super().__init__()
74 self.enc
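The `_days` helper used throughout these edits turns per-day group sizes into start offsets, so each trading day can be sliced out of the flat, date-sorted feature array. A small sketch of that cumsum/roll trick:

```python
import numpy as np

def day_offsets(day_counts):
    """Given the number of rows per day, return each day's start index
    in the flat array (the np.roll-on-cumsum trick from _days)."""
    starts = np.roll(np.cumsum(day_counts), 1)
    starts[0] = 0   # roll wraps the grand total into slot 0; reset it
    return starts
```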
Assistant Step 6: test
Assistant Step 1: edit
Tool Input
1{
2 "op": "replace",
3 "filename": "qlib/custom_model.py",
4 "start_line": 58,
5 "end_line": 156,
6 "content": "# Concept-Guided Attention Network (CGAN) with orthogonal decomposition and IC loss\nimport copy\nimport torch.optim as optim\nclass GRUEnc(nn.Module):\n def __init__(self, d=6, h=64, nl=2, dr=0.0):\n super().__init__()\n self.rnn = nn.GRU(d, h, nl, batch_first=True, dropout=dr); self.d = d\n def forward(self, x):\n x = x.reshape(len(x), self.d, -1).permute(0, 2, 1)\n return self.rnn(x)[0][:, -1, :]\nclass CGAN(nn.Module):\n \"\"\"Concept-Guided Attention Net: concept-biased attention + orthogonal decomposition.\"\"\"\n def __init__(self, d=6, h=64, nl=2, dr=0.0):\n super().__init__()\n self.enc = GRUEnc(d, h, nl, dr); self.h = h\n self.Wq = nn.Linear(h, h); self.Wk = nn.Linear(h, h); self.Wv = nn.Linear(h, h)\n self.cb = nn.Parameter(torch.ones(1))\n self.s2c = nn.Linear(h, h); self.c2s = nn.Linear(h, h)\n self.fc_ind = nn.Linear(h, h); self.gate = nn.Linear(h * 3, h)\n self.out = nn.Linear(h, 1); self.ln = nn.LayerNorm(h)\n for m in [self.Wq, self.Wk, self.Wv, self.s2c, self.c2s, self.fc_ind, self.gate, self.out]:\n nn.init.xavier_uniform_(m.weight); nn.init.zeros_(m.bias)\n def forward(self, x, cm):\n h = self.enc(x)\n csim = F.normalize(cm, dim=1) @ F.normalize(cm, dim=1).t()\n attn = F.softmax(self.Wq(h) @ self.Wk(h).t() / self.h**0.5 + self.cb * csim, dim=1)\n h_a = attn @ self.Wv(h)\n cw = cm / (cm.sum(0, keepdim=True) + 1e-8)\n h_c = F.relu(self.c2s(cm @ F.relu(self.s2c(cw.t() @ h))))\n h_i = F.relu(self.fc_ind(h - h_c))\n g = torch.sigmoid(self.gate(torch.cat([h_i, h_c, h_a], 1)))\n return self.out(self.ln(g * h_i + (1 - g) * (h_c + h_a) * 0.5)).squeeze(-1), h_i, h_c\nclass CustomModel(Model):\n def __init__(self):\n super().__init__()\n self.net = CGAN().to(DEVICE); self.fitted = False\n def _iter(self, df):\n feat, lab = df[\"feature\"], df[\"label\"].values.ravel()\n dc = df.groupby(level=0).size().values\n di = np.roll(np.cumsum(dc), 1); di[0] = 0\n for i, c in zip(di, dc):\n s = slice(i, i + c)\n f = 
torch.nan_to_num(torch.tensor(feat.values[s], dtype=torch.float32, device=DEVICE))\n l = torch.tensor(lab[s], dtype=torch.float32, device=DEVICE)\n inst = feat.index.get_level_values(\"instrument\")[s]\n cm = torch.tensor(get_concept_matrix(get_stock_index(inst)), device=DEVICE)\n m = ~torch.isnan(l)\n if m.sum() < 3: continue\n yield f, l, cm, m\n def _ic(self, p, l):\n pp, ll = p - p.mean(), l - l.mean()\n return (pp * ll).sum() / (pp.norm() * ll.norm() + 1e-8)\n def fit(self, dataset):\n df_tr = dataset.prepare(\"train\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L)\n df_va = dataset.prepare(\"valid\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L)\n opt = optim.Adam(self.net.parameters(), lr=5e-4, weight_decay=1e-5)\n best_ic, best_sd, wait = -1e9, None, 0\n for ep in range(100):\n self.net.train()\n for f, l, cm, m in self._iter(df_tr):\n p, hi, hc = self.net(f, cm)\n loss = F.mse_loss(p[m], l[m])\n loss = loss + 0.1 * (F.normalize(hi[m], dim=1) * F.normalize(hc[m], dim=1)).sum(1).pow(2).mean()\n if m.sum() > 10: loss = loss - 0.5 * self._ic(p[m], l[m])\n opt.zero_grad(); loss.backward()\n nn.utils.clip_grad_norm_(self.net.parameters(), 1.0); opt.step()\n self.net.eval(); ics = []\n with torch.no_grad():\n for f, l, cm, m in self._iter(df_va):\n p = self.net(f, cm)[0]\n if m.sum() > 10:\n v = self._ic(p[m], l[m])\n if not torch.isnan(v): ics.append(v.item())\n vic = np.mean(ics) if ics else -1\n if vic > best_ic:\n best_ic, best_sd, wait = vic, copy.deepcopy(self.net.state_dict()), 0\n else:\n wait += 1\n if wait >= 15: break\n if best_sd: self.net.load_state_dict(best_sd)\n self.fitted = True\n def predict(self, dataset, segment=\"test\"):\n if not self.fitted: raise ValueError(\"Not fitted\")\n df = dataset.prepare(segment, col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_I)\n feat = df[\"feature\"]; self.net.eval(); preds = []\n dc = df.groupby(level=0).size().values\n di = np.roll(np.cumsum(dc), 1); di[0] = 0\n 
with torch.no_grad():\n for i, c in zip(di, dc):\n s = slice(i, i + c)\n f = torch.nan_to_num(torch.tensor(feat.values[s], dtype=torch.float32, device=DEVICE))\n inst = feat.index.get_level_values(\"instrument\")[s]\n cm = torch.tensor(get_concept_matrix(get_stock_index(inst)), device=DEVICE)\n preds.append(self.net(f, cm)[0].cpu().numpy())\n return pd.Series(np.concatenate(preds), index=feat.index, name=\"score\")\n"
7}
Tool Result Step 1
OK: Replaced lines 58..156 with 95 line(s) in qlib/custom_model.py. Editable range: 58–152.
[Current file: qlib/custom_model.py | editable: 58–152 | total: 152 lines]
55| return _stock2concept_matrix[stock_indices].astype(np.float32)
56|
57|
58 # Concept-Guided Attention Network (CGAN) with orthogonal decomposition and IC loss
59 import copy
60 import torch.optim as optim
61 class GRUEnc(nn.Module):
62 def __init__(self, d=6, h=64, nl=2, dr=0.0):
63 super().__init__()
64 self.rnn = nn.GRU(d, h, nl, batch_first=True, dropout=dr); self.d = d
65 def forward(self, x):
66 x = x.reshape(len(x), self.d, -1).permute(0, 2, 1)
67 return self.rnn(x)[0][:, -1, :]
68 class CGAN(nn.Module):
69 """Concept-Guided Attention Net: concept-biased attention + orthogonal decomposition."""
70 def __init__(self, d=6, h=64, nl=2, dr=0.0):
71 super().__init__()
72
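The `GRUEnc.forward` reshape above interprets each flat feature row as `d` fields stacked over `T` time steps (feature-major layout, as in qlib's Alpha-style handlers) and transposes it to the `(N, T, d)` shape the GRU expects. A numpy sketch of the layout assumption:

```python
import numpy as np

def to_sequences(x, d=6):
    """x: (N, d*T) rows laid out as [field0 over T steps, field1 over T steps, ...].
    Returns (N, T, d), matching x.reshape(N, d, -1).permute(0, 2, 1) in torch."""
    n = x.shape[0]
    return x.reshape(n, d, -1).transpose(0, 2, 1)
```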
Assistant Step 2: test
Tool Result Step 2
[Test #1] (2 tests remaining; call submit(n=1) to submit this result)
### csi300 (scripts/csi300.sh)
Signal analysis (csi300):
SIGNAL_METRIC IC=0.045467
SIGNAL_METRIC ICIR=0.348241
SIGNAL_METRIC Rank_IC=0.057171
SIGNAL_METRIC Rank_ICIR=0.413410
Portfolio analysis (csi300):
PORTFOLIO_METRIC annualized_return=0.037171
PORTFOLIO_METRIC max_drawdown=-0.098496
PORTFOLIO_METRIC information_ratio=0.478468
### csi100 (scripts/csi100.sh)
Signal analysis (csi100):
SIGNAL_METRIC IC=0.037466
SIGNAL_METRIC ICIR=0.210051
SIGNAL_METRIC Rank_IC=0.049303
SIGNAL_METRIC Rank_ICIR=0.259323
Portfolio analysis (csi100):
PORTFOLIO_METRIC annualized_return=0.002680
PORTFOLIO_METRIC max_drawdown=-0.209819
PORTFOLIO_METRIC information_ratio=0.042604
[Leaderboard] Results saved: {'ic_csi300': 0.045467, 'icir_csi300': 0.348241, 'rank_ic_csi300': 0.057171, 'rank_icir_csi300': 0.41341, 'annualized_return_csi300': 0.037171, 'max_drawdown_csi300': -0.098496, 'information_ratio_csi300': 0.478468, 'elapsed_csi300':
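The `_ic` helper used in these edits writes the Pearson correlation with only differentiable ops (centering, dot product, norms), so `-_ic(p, l)` can serve directly as a loss term. A numpy sketch confirming it matches the standard correlation:

```python
import numpy as np

def soft_ic(p, l, eps=1e-8):
    """Pearson correlation in differentiable form, as in the model's _ic:
    centered dot product over the product of centered norms."""
    pp, ll = p - p.mean(), l - l.mean()
    return (pp * ll).sum() / (np.linalg.norm(pp) * np.linalg.norm(ll) + eps)
```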
Assistant Step 3: edit
Tool Input
1{
2 "op": "replace",
3 "filename": "qlib/custom_model.py",
4 "start_line": 58,
5 "end_line": 152,
6 "content": "# Concept-Residual Decomposition Network with IC-augmented training\nimport copy\nimport torch.optim as optim\nclass CRDNet(nn.Module):\n \"\"\"Decomposes stock returns into explicit-concept, implicit-peer, and individual components.\n Novel: iterative residual decomposition + orthogonal regularization + IC loss.\"\"\"\n def __init__(self, d=6, h=64, nl=2):\n super().__init__()\n self.rnn = nn.GRU(d, h, nl, batch_first=True); self.d = d; self.h = h\n # Explicit concept: stock -> concept -> stock\n self.fc_s2c = nn.Linear(h, h); self.fc_c2s = nn.Linear(h, h)\n # Implicit peer: self-attention on residual with separate Q, K, V\n self.fc_q = nn.Linear(h, h); self.fc_k = nn.Linear(h, h); self.fc_v = nn.Linear(h, h)\n # Individual component\n self.fc_ind = nn.Linear(h, h)\n # Output prediction head\n self.fc_out = nn.Linear(h, 1)\n for m in self.modules():\n if isinstance(m, nn.Linear):\n nn.init.xavier_uniform_(m.weight); nn.init.zeros_(m.bias)\n def forward(self, x, cm):\n x = x.reshape(len(x), self.d, -1).permute(0, 2, 1)\n h = self.rnn(x)[0][:, -1, :] # [N, H]\n # Stage 1: Explicit concept component\n cw = cm / (cm.sum(0, keepdim=True) + 1e-8)\n h_es = F.relu(self.fc_c2s(cm @ F.relu(self.fc_s2c(cw.t() @ h))))\n # Stage 2: Implicit peer component via attention on residual\n res = h - h_es\n q, k = self.fc_q(res), self.fc_k(res)\n attn = F.softmax(q @ k.t() / self.h**0.5, dim=1)\n h_is = F.relu(self.fc_v(attn @ res))\n # Stage 3: Individual stock-specific residual\n h_ind = F.relu(self.fc_ind(h - h_es - h_is))\n pred = self.fc_out(h_es + h_is + h_ind).squeeze(-1)\n return pred, h_ind, h_es + h_is\nclass CustomModel(Model):\n def __init__(self):\n super().__init__()\n self.net = CRDNet().to(DEVICE); self.fitted = False\n def _iter(self, df):\n feat, lab = df[\"feature\"], df[\"label\"].values.ravel()\n dc = df.groupby(level=0).size().values\n di = np.roll(np.cumsum(dc), 1); di[0] = 0\n for i, c in zip(di, dc):\n s = slice(i, i + c)\n f = 
torch.nan_to_num(torch.tensor(feat.values[s], dtype=torch.float32, device=DEVICE))\n l = torch.tensor(lab[s], dtype=torch.float32, device=DEVICE)\n inst = feat.index.get_level_values(\"instrument\")[s]\n cm = torch.tensor(get_concept_matrix(get_stock_index(inst)), device=DEVICE)\n m = ~torch.isnan(l)\n if m.sum() < 5: continue\n yield f, l, cm, m\n def _ic(self, p, l):\n pp, ll = p - p.mean(), l - l.mean()\n return (pp * ll).sum() / (pp.norm() * ll.norm() + 1e-8)\n def fit(self, dataset):\n df_tr = dataset.prepare(\"train\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L)\n df_va = dataset.prepare(\"valid\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L)\n opt = optim.Adam(self.net.parameters(), lr=2e-4, weight_decay=1e-5)\n best_ic, best_sd, wait = -1e9, None, 0\n for ep in range(200):\n self.net.train()\n for f, l, cm, m in self._iter(df_tr):\n p, hi, hg = self.net(f, cm)\n loss = F.mse_loss(p[m], l[m])\n # Orthogonal regularization: decorrelate individual vs shared\n loss += 0.05 * (F.normalize(hi[m], dim=1) * F.normalize(hg[m], dim=1)).sum(1).pow(2).mean()\n # IC loss: direct ranking quality optimization\n if m.sum() > 20: loss -= 0.1 * self._ic(p[m], l[m])\n opt.zero_grad(); loss.backward()\n nn.utils.clip_grad_norm_(self.net.parameters(), 1.0); opt.step()\n self.net.eval(); ics = []\n with torch.no_grad():\n for f, l, cm, m in self._iter(df_va):\n p = self.net(f, cm)[0]\n if m.sum() > 20:\n v = self._ic(p[m], l[m])\n if not torch.isnan(v): ics.append(v.item())\n vic = np.mean(ics) if ics else -1\n if vic > best_ic:\n best_ic, best_sd, wait = vic, copy.deepcopy(self.net.state_dict()), 0\n else:\n wait += 1\n if wait >= 20: break\n if best_sd: self.net.load_state_dict(best_sd)\n self.fitted = True\n def predict(self, dataset, segment=\"test\"):\n if not self.fitted: raise ValueError(\"Not fitted\")\n df = dataset.prepare(segment, col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_I)\n feat = df[\"feature\"]; 
self.net.eval(); preds = []\n dc = df.groupby(level=0).size().values\n di = np.roll(np.cumsum(dc), 1); di[0] = 0\n with torch.no_grad():\n for i, c in zip(di, dc):\n s = slice(i, i + c)\n f = torch.nan_to_num(torch.tensor(feat.values[s], dtype=torch.float32, device=DEVICE))\n inst = feat.index.get_level_values(\"instrument\")[s]\n cm = torch.tensor(get_concept_matrix(get_stock_index(inst)), device=DEVICE)\n preds.append(self.net(f, cm)[0].cpu().numpy())\n return pd.Series(np.concatenate(preds), index=feat.index, name=\"score\")\n"
Tool Result (Step 3)
OK: Replaced lines 58..152 with 100 line(s) in qlib/custom_model.py. Editable range: 58–157.
[Current file: qlib/custom_model.py | editable: 58–157 | total: 157 lines]
55| return _stock2concept_matrix[stock_indices].astype(np.float32)
56|
57|
58| # Concept-Residual Decomposition Network with IC-augmented training
59| import copy
60| import torch.optim as optim
61| class CRDNet(nn.Module):
62|     """Decomposes stock returns into explicit-concept, implicit-peer, and individual components.
63|     Novel: iterative residual decomposition + orthogonal regularization + IC loss."""
64|     def __init__(self, d=6, h=64, nl=2):
65|         super().__init__()
66|         self.rnn = nn.GRU(d, h, nl, batch_first=True); self.d = d; self.h = h
67|         # Explicit concept: stock -> concept -> stock
68|         self.fc_s2c = nn.Linear(h, h); self.fc_c2s = nn.Linear(h, h)
69|         # Implicit peer: self-attention on residual with
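Both `_iter` and `predict` in the submitted model batch the flat (datetime, instrument) frame one trading day at a time via a shifted-cumsum of per-day row counts. That index arithmetic can be sanity-checked in isolation; a minimal sketch with a toy two-level index (day and stock labels are illustrative):

```python
import numpy as np
import pandas as pd

# Toy frame: 3 "days" with 2, 3, and 2 rows, sorted by day as qlib frames are.
idx = pd.MultiIndex.from_tuples(
    [("d1", "a"), ("d1", "b"),
     ("d2", "a"), ("d2", "b"), ("d2", "c"),
     ("d3", "a"), ("d3", "b")],
    names=["datetime", "instrument"],
)
df = pd.DataFrame({"x": np.arange(7)}, index=idx)

# Same arithmetic as _iter: per-day counts -> start offsets via a shifted cumsum.
dc = df.groupby(level=0).size().values     # rows per day: [2, 3, 2]
di = np.roll(np.cumsum(dc), 1); di[0] = 0  # start offsets: [0, 2, 5]

slices = [df["x"].values[i:i + c] for i, c in zip(di, dc)]
print([s.tolist() for s in slices])  # → [[0, 1], [2, 3, 4], [5, 6]]
```

Note this relies on each day's rows being contiguous in the frame; `np.roll` turns the cumulative end offsets into start offsets once the wrapped-around last element is overwritten with 0.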
Assistant (Step 4): test
Tool Result (Step 4)
[Test #2] (1 test remaining; call submit(n=2) to submit this result)
### csi300 (scripts/csi300.sh)
Signal analysis (csi300):
SIGNAL_METRIC IC=0.031812
SIGNAL_METRIC ICIR=0.205669
SIGNAL_METRIC Rank_IC=0.043688
SIGNAL_METRIC Rank_ICIR=0.266492
Portfolio analysis (csi300):
PORTFOLIO_METRIC annualized_return=0.005508
PORTFOLIO_METRIC max_drawdown=-0.136220
PORTFOLIO_METRIC information_ratio=0.055631
### csi100 (scripts/csi100.sh)
Signal analysis (csi100):
SIGNAL_METRIC IC=0.043438
SIGNAL_METRIC ICIR=0.222726
SIGNAL_METRIC Rank_IC=0.052947
SIGNAL_METRIC Rank_ICIR=0.269370
Portfolio analysis (csi100):
PORTFOLIO_METRIC annualized_return=-0.000013
PORTFOLIO_METRIC max_drawdown=-0.184543
PORTFOLIO_METRIC information_ratio=-0.000206
[Leaderboard] Results saved: {'ic_csi300': 0.031812, 'icir_csi300': 0.205669, 'rank_ic_csi300': 0.043688, 'rank_icir_csi300': 0.266492, 'annualized_return_csi300': 0.005508, 'max_drawdown_csi300': -0.13622, 'information_ratio_csi300': 0.055631, 'elapsed_csi300':
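The SIGNAL_METRIC values above follow qlib's standard definitions: IC is the mean daily cross-sectional Pearson correlation between score and label, Rank IC uses Spearman correlation, and ICIR is the mean daily IC divided by its standard deviation. A minimal sketch of that computation on synthetic data (the random series and index names here are illustrative, not qlib output):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
days = pd.date_range("2020-01-01", periods=60)
stocks = [f"s{i:02d}" for i in range(50)]
idx = pd.MultiIndex.from_product([days, stocks], names=["datetime", "instrument"])

label = pd.Series(rng.standard_normal(len(idx)), index=idx, name="label")
score = (0.1 * label + rng.standard_normal(len(idx))).rename("score")  # weak signal

df = pd.concat([score, label], axis=1)
# IC: per-day cross-sectional Pearson correlation; Rank IC: Spearman.
daily_ic = df.groupby(level="datetime").apply(lambda g: g["score"].corr(g["label"]))
daily_ric = df.groupby(level="datetime").apply(
    lambda g: g["score"].corr(g["label"], method="spearman"))

ic, icir, rank_ic = daily_ic.mean(), daily_ic.mean() / daily_ic.std(), daily_ric.mean()
print(f"IC={ic:.4f} ICIR={icir:.4f} Rank_IC={rank_ic:.4f}")
```

With a true correlation of roughly 0.1, the mean daily IC comes out positive, while the day-to-day spread of the 50-stock correlations is what drags ICIR below the raw IC scale, the same pattern visible in the csi300/csi100 results above.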