Agent Conversation: google/gemini-3.1-pro-preview
Model: google/gemini-3.1-pro-preview
Total steps: 6
Messages: 13
User (Step 0)
Task: quant-stock-prediction
Quantitative Stock Prediction on CSI300
Objective
Design and implement a stock prediction model that forecasts next-day returns for CSI300 stocks. Your code goes in custom_model.py. Three reference implementations (LightGBM, LSTM, Transformer) are provided as read-only.
Evaluation
Signal quality: IC, ICIR, Rank IC. Portfolio (TopkDropout, top 50, drop 5): Annualized Return, Max Drawdown, Information Ratio. Evaluation is automatic via qlib's workflow.
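The signal metrics above can be reproduced from any (datetime, instrument)-indexed prediction series; a minimal sketch with synthetic data (the index layout mimics qlib's prediction output, the numbers are made up) of how per-day IC, Rank IC, and ICIR are computed:

```python
import numpy as np
import pandas as pd

# Synthetic (datetime, instrument)-indexed scores and realized labels.
rng = np.random.default_rng(0)
idx = pd.MultiIndex.from_product(
    [pd.date_range("2020-01-01", periods=5), [f"SH{i:06d}" for i in range(50)]],
    names=["datetime", "instrument"],
)
label = pd.Series(rng.normal(0, 0.02, len(idx)), index=idx)
score = label * 0.3 + pd.Series(rng.normal(0, 0.02, len(idx)), index=idx)

# IC: per-day Pearson correlation between score and realized label;
# Rank IC uses Spearman (rank) correlation on the same cross-section.
by_day = pd.DataFrame({"score": score, "label": label}).groupby(level="datetime")
ic = by_day.apply(lambda d: d["score"].corr(d["label"]))
rank_ic = by_day.apply(lambda d: d["score"].corr(d["label"], method="spearman"))

# ICIR: mean daily IC scaled by its day-to-day volatility.
icir = ic.mean() / ic.std()
print(ic.mean(), rank_ic.mean(), icir)
```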
Workflow Configuration
workflow_config.yaml lines 13-25 and 31-44 are editable; these cover the model block and the input-adapter/preprocessor block. You may change the dataset class (e.g., to TSDatasetH) or the processors if your model needs a different input view. Instruments, date ranges, train/valid/test splits, and evaluation settings are fixed.
qlib/custom_model.py [EDITABLE — lines 16–103 only]
1: # Custom stock prediction model for MLS-Bench
2: #
3: # EDITABLE section: CustomModel class with fit() and predict() methods.
4: # FIXED sections: imports below.
5: import numpy as np
6: import pandas as pd
7: import torch
8: import torch.nn as nn
9: import torch.nn.functional as F
10: from qlib.model.base import Model
11: from qlib.data.dataset import DatasetH
12: from qlib.data.dataset.handler import DataHandlerLP
13:
14: DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
15:
16: # =====================================================================
17: # EDITABLE: CustomModel — implement your stock prediction model here
18: # =====================================================================
19: class CustomModel(Model):
20: """Custom stock prediction model.
21:
22: You must implement:
23: fit(dataset) — train the model on the training data
24: predict(dataset, segment="test") — return predictions as pd.Series
25:
26: The dataset is a qlib DatasetH with Alpha360 features (360 features per
27: stock per day). The 360 features come from 6 base features
28: (open/close/high/low/volume/vwap ratios) x 60 days of history.
29:
30: For temporal models, features can be reshaped:
31: x.reshape(N, 6, 60).permute(0, 2, 1) -> [N, 60, 6]
32: giving 60 time steps of 6 features each.
33:
34: Segments: "train", "valid", "test".
35:
36: Getting data from the dataset:
37: df_train = dataset.prepare("train", col_set=["feature", "label"],
38: data_key=DataHandlerLP.DK_L)
39: features = df_train["feature"] # DataFrame: (n_samples, 360)
40: labels = df_train["label"] # DataFrame: (n_samples, 1)
41:
42: The label is: Ref($close, -2) / Ref($close, -1) - 1
43: (i.e., the return from T+1 to T+2, predicted at time T)
44:
45: predic
Assistant (Step 1): edit
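The Alpha360 reshape described in the docstring can be checked with a toy tensor (a sketch; random data, shapes only):

```python
import torch

N = 4  # batch of stock-day samples
x = torch.randn(N, 360)  # 6 base features x 60 days, flattened

# Alpha360 stores each feature's 60-day history contiguously, so the flat
# vector unpacks as (feature, time) before permuting to (time, feature).
seq = x.reshape(N, 6, 60).permute(0, 2, 1)  # -> [N, 60, 6]
assert seq.shape == (4, 60, 6)

# Feature f at time step t corresponds to flat index f * 60 + t.
assert torch.equal(seq[0, 17, 2], x[0, 2 * 60 + 17])
```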
Tool Input
1{
2 "end_line": 103,
3 "start_line": 16,
4 "op": "replace",
5 "content": "# =====================================================================\n# EDITABLE: CustomModel — implement your stock prediction model here\n# =====================================================================\nimport torch.optim as optim\nfrom torch.utils.data import TensorDataset, DataLoader\n\nclass Attention(nn.Module):\n def __init__(self, hidden_size):\n super().__init__()\n self.w = nn.Parameter(torch.Tensor(hidden_size, 1))\n nn.init.xavier_uniform_(self.w)\n\n def forward(self, x):\n e = torch.matmul(torch.tanh(x), self.w).squeeze(-1)\n alpha = F.softmax(e, dim=-1).unsqueeze(-1)\n return (x * alpha).sum(dim=1)\n\nclass CustomModelNN(nn.Module):\n def __init__(self, d_feat=6, hidden_size=64, num_layers=2, dropout=0.2):\n super().__init__()\n self.d_feat = d_feat\n \n self.feature_layer = nn.Linear(d_feat, hidden_size)\n \n self.conv1 = nn.Conv1d(hidden_size, hidden_size, kernel_size=3, padding=1)\n self.conv2 = nn.Conv1d(hidden_size, hidden_size, kernel_size=3, padding=2, dilation=2)\n \n self.gru = nn.GRU(\n input_size=hidden_size,\n hidden_size=hidden_size,\n num_layers=num_layers,\n batch_first=True,\n dropout=dropout,\n )\n self.attn = Attention(hidden_size)\n \n self.fc = nn.Sequential(\n nn.Linear(hidden_size, hidden_size // 2),\n nn.LayerNorm(hidden_size // 2),\n nn.GELU(),\n nn.Dropout(dropout),\n nn.Linear(hidden_size // 2, 1)\n )\n\n def forward(self, x):\n x = x.reshape(len(x), self.d_feat, -1).permute(0, 2, 1)\n x = self.feature_layer(x)\n \n x_conv = x.permute(0, 2, 1)\n x_c1 = F.gelu(self.conv1(x_conv))\n x_c2 = F.gelu(self.conv2(x_c1))\n \n x_rnn_in = x_c2.permute(0, 2, 1) + x\n \n out, _ = self.gru(x_rnn_in)\n out_attn = self.attn(out)\n \n return self.fc(out_attn).squeeze(-1)\n\nclass CustomModel(Model):\n def __init__(self, d_feat=6, hidden_size=64, num_layers=2, dropout=0.2, n_epochs=100, lr=1e-3, batch_size=2048, early_stop=20):\n super().__init__()\n self.d_feat = d_feat\n self.hidden_size = hidden_size\n 
self.num_layers = num_layers\n self.dropout = dropout\n self.n_epochs = n_epochs\n self.lr = lr\n self.batch_size = batch_size\n self.early_stop = early_stop\n \n self.device = torch.device(DEVICE)\n self.model = CustomModelNN(d_feat, hidden_size, num_layers, dropout).to(self.device)\n self.optimizer = optim.Adam(self.model.parameters(), lr=self.lr)\n self.fitted = False\n\n def loss_fn(self, pred, label):\n mask = torch.isfinite(label)\n p = pred[mask]\n l = label[mask]\n if len(p) == 0:\n return torch.tensor(0.0).to(self.device)\n \n mse = F.mse_loss(p, l)\n \n # CCC loss\n p_mean = p.mean()\n l_mean = l.mean()\n p_var = p.var(unbiased=False)\n l_var = l.var(unbiased=False)\n cov = ((p - p_mean) * (l - l_mean)).mean()\n ccc = 2 * cov / (p_var + l_var + (p_mean - l_mean)**2 + 1e-8)\n ccc_loss = 1 - ccc\n \n # Pairwise loss directly trains ranking capability\n if len(p) > 1:\n idx = torch.randperm(len(p))\n target = torch.sign(l - l[idx])\n p_mask = target != 0\n if p_mask.any():\n pair_loss = F.relu(-target[p_mask] * (p[p_mask] - p[idx][p_mask])).mean()\n else:\n pair_loss = torch.tensor(0.0, device=self.device)\n else:\n pair_loss = torch.tensor(0.0, device=self.device)\n \n return mse + 0.1 * ccc_loss + 0.1 * pair_loss\n\n def train_epoch(self, dataloader):\n self.model.train()\n total_loss = 0\n for x, y in dataloader:\n x, y = x.to(self.device), y.to(self.device)\n self.optimizer.zero_grad()\n pred = self.model(x)\n loss = self.loss_fn(pred, y)\n if loss.requires_grad:\n loss.backward()\n torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=1.0)\n self.optimizer.step()\n total_loss += loss.item()\n return total_loss / len(dataloader)\n\n def eval_epoch(self, dataloader):\n self.model.eval()\n total_loss = 0\n with torch.no_grad():\n for x, y in dataloader:\n x, y = x.to(self.device), y.to(self.device)\n pred = self.model(x)\n loss = self.loss_fn(pred, y)\n total_loss += loss.item()\n return total_loss / len(dataloader)\n\n def fit(self, dataset: 
DatasetH):\n df_train = dataset.prepare(\n \"train\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L\n )\n df_valid = dataset.prepare(\n \"valid\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L\n )\n \n x_train, y_train = df_train[\"feature\"].values, np.squeeze(df_train[\"label\"].values)\n train_mask = ~(np.isnan(x_train).any(axis=1) | np.isnan(y_train))\n x_train, y_train = x_train[train_mask], y_train[train_mask]\n \n x_valid, y_valid = df_valid[\"feature\"].values, np.squeeze(df_valid[\"label\"].values)\n valid_mask = ~(np.isnan(x_valid).any(axis=1) | np.isnan(y_valid))\n x_valid, y_valid = x_valid[valid_mask], y_valid[valid_mask]\n \n train_dataset = TensorDataset(torch.tensor(x_train, dtype=torch.float32), torch.tensor(y_train, dtype=torch.float32))\n valid_dataset = TensorDataset(torch.tensor(x_valid, dtype=torch.float32), torch.tensor(y_valid, dtype=torch.float32))\n \n train_loader = DataLoader(train_dataset, batch_size=self.batch_size, shuffle=True)\n valid_loader = DataLoader(valid_dataset, batch_size=self.batch_size, shuffle=False)\n \n best_loss = np.inf\n wait = 0\n best_model_state = None\n \n for epoch in range(self.n_epochs):\n train_loss = self.train_epoch(train_loader)\n valid_loss = self.eval_epoch(valid_loader)\n \n if valid_loss < best_loss:\n best_loss = valid_loss\n best_model_state = {k: v.cpu() for k, v in self.model.state_dict().items()}\n wait = 0\n else:\n wait += 1\n \n if wait >= self.early_stop:\n break\n \n if best_model_state is not None:\n self.model.load_state_dict(best_model_state)\n \n self.fitted = True\n\n def predict(self, dataset: DatasetH, segment=\"test\"):\n if not self.fitted:\n raise ValueError(\"Model is not fitted yet!\")\n \n df_test = dataset.prepare(\n segment, col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_I\n )\n features = df_test[\"feature\"]\n index = features.index\n \n features_np = features.values\n features_np = np.nan_to_num(features_np, nan=0.0)\n \n 
self.model.eval()\n preds = []\n with torch.no_grad():\n for i in range(0, len(features_np), self.batch_size):\n batch_x = torch.tensor(features_np[i:i+self.batch_size], dtype=torch.float32).to(self.device)\n batch_pred = self.model(batch_x)\n preds.append(batch_pred.cpu().numpy())\n \n preds = np.concatenate(preds)\n return pd.Series(preds, index=index, name=\"score\")",
6 "filename": "qlib/custom_model.py"
7}
Tool Result (Step 1)
OK: Replaced lines 16..103 with 202 line(s) in qlib/custom_model.py. Editable range: 16–217.
[Current file: qlib/custom_model.py | editable: 16–217 | total: 217 lines]
13|
14| DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
15|
16 # =====================================================================
17 # EDITABLE: CustomModel — implement your stock prediction model here
18 # =====================================================================
19 import torch.optim as optim
20 from torch.utils.data import TensorDataset, DataLoader
21
22 class Attention(nn.Module):
23 def __init__(self, hidden_size):
24 super().__init__()
25 self.w = nn.Parameter(torch.Tensor(hidden_size, 1))
26 nn.init.xavier_uniform_(self.w)
27
28 def forward(self, x):
29 e = torch.matmul(torch.tanh(x), self.w).squeeze(-1)
30 alpha = F.softmax(e, dim=-1).unsqueeze(-1)
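The Attention module shown in the listing can be exercised on its own; a quick self-contained shape check (synthetic input, same code as the listing):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attention(nn.Module):
    """Additive attention pooling over the time axis, as in the listing."""
    def __init__(self, hidden_size):
        super().__init__()
        self.w = nn.Parameter(torch.Tensor(hidden_size, 1))
        nn.init.xavier_uniform_(self.w)

    def forward(self, x):                                    # x: [N, T, H]
        e = torch.matmul(torch.tanh(x), self.w).squeeze(-1)  # [N, T]
        alpha = F.softmax(e, dim=-1).unsqueeze(-1)           # [N, T, 1]
        return (x * alpha).sum(dim=1)                        # [N, H]

attn = Attention(64)
out = attn(torch.randn(8, 60, 64))
assert out.shape == (8, 64)  # one pooled vector per sample
```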
Assistant (Step 2): test
Tool Result (Step 2)
[Test #1] (2 tests remaining; call submit(n=1) to submit this result)
### csi300 (scripts/csi300.sh)
Signal analysis (csi300):
SIGNAL_METRIC IC=0.045733
SIGNAL_METRIC ICIR=0.335620
SIGNAL_METRIC Rank_IC=0.057307
SIGNAL_METRIC Rank_ICIR=0.441469
Portfolio analysis (csi300):
PORTFOLIO_METRIC annualized_return=0.083147
PORTFOLIO_METRIC max_drawdown=-0.057711
PORTFOLIO_METRIC information_ratio=1.266606
### csi100 (scripts/csi100.sh)
Signal analysis (csi100):
SIGNAL_METRIC IC=0.034219
SIGNAL_METRIC ICIR=0.182490
SIGNAL_METRIC Rank_IC=0.046379
SIGNAL_METRIC Rank_ICIR=0.261489
Portfolio analysis (csi100):
PORTFOLIO_METRIC annualized_return=-0.001599
PORTFOLIO_METRIC max_drawdown=-0.198085
PORTFOLIO_METRIC information_ratio=-0.028370
[Leaderboard] Results saved: {'ic_csi300': 0.045733, 'icir_csi300': 0.33562, 'rank_ic_csi300': 0.057307, 'rank_icir_csi300': 0.441469, 'annualized_return_csi300': 0.083147, 'max_drawdown_csi300': -0.057711, 'information_ratio_csi300': 1.266606, 'elapsed_csi300'
AssistantStep 3edit
Tool Input
1{
2 "op": "replace",
3 "content": "# =====================================================================\n# EDITABLE: CustomModel — implement your stock prediction model here\n# =====================================================================\nimport torch.optim as optim\nfrom torch.utils.data import Dataset as TorchDataset, DataLoader\n\nclass Attention(nn.Module):\n def __init__(self, hidden_size):\n super().__init__()\n self.w = nn.Parameter(torch.Tensor(hidden_size, 1))\n nn.init.xavier_uniform_(self.w)\n\n def forward(self, x):\n e = torch.matmul(torch.tanh(x), self.w).squeeze(-1)\n alpha = F.softmax(e, dim=-1).unsqueeze(-1)\n return (x * alpha).sum(dim=1)\n\nclass CustomModelNN(nn.Module):\n def __init__(self, d_feat=6, hidden_size=64, num_layers=2, dropout=0.2):\n super().__init__()\n self.d_feat = d_feat\n \n self.feature_layer = nn.Linear(d_feat, hidden_size)\n \n self.conv1 = nn.Conv1d(hidden_size, hidden_size, kernel_size=3, padding=1)\n self.conv2 = nn.Conv1d(hidden_size, hidden_size, kernel_size=3, padding=2, dilation=2)\n \n self.gru = nn.GRU(\n input_size=hidden_size,\n hidden_size=hidden_size,\n num_layers=num_layers,\n batch_first=True,\n dropout=dropout,\n )\n self.attn = Attention(hidden_size)\n \n for name, param in self.gru.named_parameters():\n if 'weight_ih' in name:\n nn.init.xavier_uniform_(param.data)\n elif 'weight_hh' in name:\n nn.init.orthogonal_(param.data)\n \n self.fc = nn.Sequential(\n nn.Linear(hidden_size, hidden_size // 2),\n nn.LayerNorm(hidden_size // 2),\n nn.GELU(),\n nn.Dropout(dropout),\n nn.Linear(hidden_size // 2, 1)\n )\n\n def forward(self, x):\n x = x.reshape(len(x), self.d_feat, -1).permute(0, 2, 1)\n x = self.feature_layer(x)\n \n x_conv = x.permute(0, 2, 1)\n x_c1 = F.gelu(self.conv1(x_conv))\n x_c2 = F.gelu(self.conv2(x_c1))\n \n x_rnn_in = x_c2.permute(0, 2, 1) + x\n \n out, _ = self.gru(x_rnn_in)\n out_attn = self.attn(out)\n \n return self.fc(out_attn).squeeze(-1)\n\nclass DailyDataset(TorchDataset):\n def __init__(self, 
df):\n self.days = []\n for date, group in df.groupby(level=\"datetime\"):\n if len(group) == 0:\n continue\n x = group[\"feature\"].values\n y = np.squeeze(group[\"label\"].values)\n if y.ndim == 0:\n y = np.expand_dims(y, 0)\n self.days.append((torch.tensor(x, dtype=torch.float32), torch.tensor(y, dtype=torch.float32)))\n \n def __len__(self):\n return len(self.days)\n \n def __getitem__(self, idx):\n return self.days[idx]\n\ndef collate_fn(batch):\n return batch\n\nclass CustomModel(Model):\n def __init__(self, d_feat=6, hidden_size=64, num_layers=2, dropout=0.2, n_epochs=100, lr=1e-3, batch_size=16, early_stop=15):\n super().__init__()\n self.d_feat = d_feat\n self.hidden_size = hidden_size\n self.num_layers = num_layers\n self.dropout = dropout\n self.n_epochs = n_epochs\n self.lr = lr\n self.batch_size = batch_size\n self.early_stop = early_stop\n \n self.device = torch.device(DEVICE)\n self.model = CustomModelNN(d_feat, hidden_size, num_layers, dropout).to(self.device)\n self.optimizer = optim.Adam(self.model.parameters(), lr=self.lr, weight_decay=1e-5)\n self.fitted = False\n\n def daily_loss_fn(self, p, l):\n if len(p) < 2:\n return torch.tensor(0.0).to(self.device)\n \n mse = F.mse_loss(p, l)\n \n p_mean = p.mean()\n l_mean = l.mean()\n p_norm = p - p_mean\n l_norm = l - l_mean\n cov = (p_norm * l_norm).mean()\n p_std = p.std(unbiased=False) + 1e-6\n l_std = l.std(unbiased=False) + 1e-6\n ic = cov / (p_std * l_std)\n ic_loss = 1.0 - ic\n \n idx = torch.randperm(len(p))\n target = torch.sign(l - l[idx])\n p_mask = target != 0\n if p_mask.any():\n pair_loss = F.softplus(-target[p_mask] * (p[p_mask] - p[idx][p_mask])).mean()\n else:\n pair_loss = torch.tensor(0.0, device=self.device)\n \n return mse + 0.5 * ic_loss + 0.5 * pair_loss\n\n def train_epoch(self, dataloader):\n self.model.train()\n total_loss = 0\n n_days = 0\n \n for batch in dataloader:\n xs, ys = zip(*batch)\n lengths = [len(y) for y in ys]\n x_cat = torch.cat(xs, dim=0).to(self.device)\n 
y_cat = torch.cat(ys, dim=0).to(self.device)\n \n self.optimizer.zero_grad()\n pred_cat = self.model(x_cat)\n \n preds = torch.split(pred_cat, lengths)\n labels = torch.split(y_cat, lengths)\n \n loss = 0\n valid_days = 0\n for p, l in zip(preds, labels):\n d_loss = self.daily_loss_fn(p, l)\n if d_loss.requires_grad:\n loss += d_loss\n valid_days += 1\n \n if valid_days > 0:\n loss = loss / valid_days\n loss.backward()\n torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=1.0)\n self.optimizer.step()\n total_loss += loss.item() * valid_days\n n_days += valid_days\n \n return total_loss / max(n_days, 1)\n\n def eval_epoch(self, dataloader):\n self.model.eval()\n total_loss = 0\n n_days = 0\n \n with torch.no_grad():\n for batch in dataloader:\n xs, ys = zip(*batch)\n lengths = [len(y) for y in ys]\n x_cat = torch.cat(xs, dim=0).to(self.device)\n y_cat = torch.cat(ys, dim=0).to(self.device)\n \n pred_cat = self.model(x_cat)\n preds = torch.split(pred_cat, lengths)\n labels = torch.split(y_cat, lengths)\n \n for p, l in zip(preds, labels):\n d_loss = self.daily_loss_fn(p, l)\n if d_loss.requires_grad or d_loss.item() > 0: \n total_loss += d_loss.item()\n n_days += 1\n \n return total_loss / max(n_days, 1)\n\n def fit(self, dataset: DatasetH):\n df_train = dataset.prepare(\n \"train\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L\n )\n df_valid = dataset.prepare(\n \"valid\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L\n )\n \n features_np = df_train[\"feature\"].values\n labels_np = df_train[\"label\"].values.reshape(len(df_train), -1)\n train_mask = ~(np.isnan(features_np).any(axis=1) | np.isnan(labels_np).any(axis=1))\n df_train = df_train[train_mask]\n \n features_np = df_valid[\"feature\"].values\n labels_np = df_valid[\"label\"].values.reshape(len(df_valid), -1)\n valid_mask = ~(np.isnan(features_np).any(axis=1) | np.isnan(labels_np).any(axis=1))\n df_valid = df_valid[valid_mask]\n \n train_dataset = 
DailyDataset(df_train)\n valid_dataset = DailyDataset(df_valid)\n \n train_loader = DataLoader(train_dataset, batch_size=self.batch_size, shuffle=True, collate_fn=collate_fn)\n valid_loader = DataLoader(valid_dataset, batch_size=self.batch_size, shuffle=False, collate_fn=collate_fn)\n \n best_loss = np.inf\n wait = 0\n best_model_state = None\n \n for epoch in range(self.n_epochs):\n train_loss = self.train_epoch(train_loader)\n valid_loss = self.eval_epoch(valid_loader)\n \n if valid_loss < best_loss:\n best_loss = valid_loss\n best_model_state = {k: v.cpu() for k, v in self.model.state_dict().items()}\n wait = 0\n else:\n wait += 1\n \n if wait >= self.early_stop:\n break\n \n if best_model_state is not None:\n self.model.load_state_dict(best_model_state)\n \n self.fitted = True\n\n def predict(self, dataset: DatasetH, segment=\"test\"):\n if not self.fitted:\n raise ValueError(\"Model is not fitted yet!\")\n \n df_test = dataset.prepare(\n segment, col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_I\n )\n features = df_test[\"feature\"]\n index = features.index\n \n features_np = features.values\n features_np = np.nan_to_num(features_np, nan=0.0)\n \n self.model.eval()\n preds = []\n batch_size = 2048\n with torch.no_grad():\n for i in range(0, len(features_np), batch_size):\n batch_x = torch.tensor(features_np[i:i+batch_size], dtype=torch.float32).to(self.device)\n batch_pred = self.model(batch_x)\n preds.append(batch_pred.cpu().numpy())\n \n preds = np.concatenate(preds)\n return pd.Series(preds, index=index, name=\"score\")",
4 "end_line": 217,
5 "filename": "qlib/custom_model.py",
6 "start_line": 16
7}
Tool Result (Step 3)
OK: Replaced lines 16..217 with 259 line(s) in qlib/custom_model.py. Editable range: 16–274.
[Current file: qlib/custom_model.py | editable: 16–274 | total: 274 lines]
13|
14| DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
15|
16 # =====================================================================
17 # EDITABLE: CustomModel — implement your stock prediction model here
18 # =====================================================================
19 import torch.optim as optim
20 from torch.utils.data import Dataset as TorchDataset, DataLoader
21
22 class Attention(nn.Module):
23 def __init__(self, hidden_size):
24 super().__init__()
25 self.w = nn.Parameter(torch.Tensor(hidden_size, 1))
26 nn.init.xavier_uniform_(self.w)
27
28 def forward(self, x):
29 e = torch.matmul(torch.tanh(x), self.w).squeeze(-1)
30 alpha = F.softmax(e, dim=-1).unsque
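Step 3's switch from flat minibatches to per-day groups, so that the correlation term is computed within each trading day's cross-section, can be sketched independently (hypothetical tensors; mirrors the IC part of daily_loss_fn and the averaging loop in train_epoch):

```python
import torch

def daily_ic_loss(p, l, eps=1e-6):
    """1 - Pearson correlation between predictions and labels for one day."""
    p_n = (p - p.mean()) / (p.std(unbiased=False) + eps)
    l_n = (l - l.mean()) / (l.std(unbiased=False) + eps)
    return 1.0 - (p_n * l_n).mean()

# Three synthetic "days", each with its own cross-section of stocks.
torch.manual_seed(0)
days = [(torch.randn(40), torch.randn(40)) for _ in range(3)]

# Average the per-day losses, as the training loop does over valid days.
loss = torch.stack([daily_ic_loss(p, l) for p, l in days]).mean()
assert 0.0 <= loss.item() <= 2.0  # 1 - IC lies in [0, 2]
```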
Assistant (Step 4): test
Tool Result (Step 4)
[Test #2] (1 test remaining; call submit(n=2) to submit this result)
### csi300 (scripts/csi300.sh)
Signal analysis (csi300):
SIGNAL_METRIC IC=0.046450
SIGNAL_METRIC ICIR=0.325133
SIGNAL_METRIC Rank_IC=0.059459
SIGNAL_METRIC Rank_ICIR=0.442866
Portfolio analysis (csi300):
PORTFOLIO_METRIC annualized_return=0.060986
PORTFOLIO_METRIC max_drawdown=-0.071555
PORTFOLIO_METRIC information_ratio=0.960617
### csi100 (scripts/csi100.sh)
Signal analysis (csi100):
SIGNAL_METRIC IC=0.039541
SIGNAL_METRIC ICIR=0.205669
SIGNAL_METRIC Rank_IC=0.052420
SIGNAL_METRIC Rank_ICIR=0.278416
Portfolio analysis (csi100):
PORTFOLIO_METRIC annualized_return=0.008189
PORTFOLIO_METRIC max_drawdown=-0.129454
PORTFOLIO_METRIC information_ratio=0.143560
[Leaderboard] Results saved: {'ic_csi300': 0.04645, 'icir_csi300': 0.325133, 'rank_ic_csi300': 0.059459, 'rank_icir_csi300': 0.442866, 'annualized_return_csi300': 0.060986, 'max_drawdown_csi300': -0.071555, 'information_ratio_csi300': 0.960617, 'elapsed_csi300': 1
Assistant (Step 5): edit
Tool Input
1{
2 "filename": "qlib/custom_model.py",
3 "end_line": 274,
4 "start_line": 16,
5 "op": "replace",
6 "content": "# =====================================================================\n# EDITABLE: CustomModel — implement your stock prediction model here\n# =====================================================================\nimport torch.optim as optim\nfrom torch.utils.data import Dataset as TorchDataset, DataLoader\n\nclass Attention(nn.Module):\n def __init__(self, hidden_size):\n super().__init__()\n self.w = nn.Parameter(torch.Tensor(hidden_size, 1))\n nn.init.xavier_uniform_(self.w)\n\n def forward(self, x):\n e = torch.matmul(torch.tanh(x), self.w).squeeze(-1)\n alpha = F.softmax(e, dim=-1).unsqueeze(-1)\n return (x * alpha).sum(dim=1)\n\nclass CustomModelNN(nn.Module):\n def __init__(self, d_feat=6, hidden_size=64, num_layers=2, dropout=0.2):\n super().__init__()\n self.d_feat = d_feat\n \n self.feature_layer = nn.Linear(d_feat, hidden_size)\n \n self.conv1 = nn.Conv1d(hidden_size, hidden_size, kernel_size=3, padding=1)\n self.conv2 = nn.Conv1d(hidden_size, hidden_size, kernel_size=3, padding=2, dilation=2)\n \n # Bidirectional GRU for fully exploiting the whole history window\n self.gru = nn.GRU(\n input_size=hidden_size,\n hidden_size=hidden_size // 2,\n num_layers=num_layers,\n batch_first=True,\n dropout=dropout,\n bidirectional=True\n )\n self.attn = Attention(hidden_size)\n \n for name, param in self.gru.named_parameters():\n if 'weight_ih' in name:\n nn.init.xavier_uniform_(param.data)\n elif 'weight_hh' in name:\n nn.init.orthogonal_(param.data)\n \n self.fc = nn.Sequential(\n nn.Linear(hidden_size, hidden_size // 2),\n nn.LayerNorm(hidden_size // 2),\n nn.GELU(),\n nn.Dropout(dropout),\n nn.Linear(hidden_size // 2, 1)\n )\n\n def forward(self, x):\n x = x.reshape(len(x), self.d_feat, -1).permute(0, 2, 1)\n x = self.feature_layer(x)\n \n x_conv = x.permute(0, 2, 1)\n x_c1 = F.gelu(self.conv1(x_conv))\n x_c2 = F.gelu(self.conv2(x_c1))\n \n x_rnn_in = x_c2.permute(0, 2, 1) + x\n \n out, _ = self.gru(x_rnn_in)\n out_attn = self.attn(out)\n \n 
return self.fc(out_attn).squeeze(-1)\n\nclass DailyDataset(TorchDataset):\n def __init__(self, df):\n self.days = []\n for date, group in df.groupby(level=\"datetime\"):\n if len(group) == 0:\n continue\n x = group[\"feature\"].values\n y = np.squeeze(group[\"label\"].values)\n if y.ndim == 0:\n y = np.expand_dims(y, 0)\n self.days.append((torch.tensor(x, dtype=torch.float32), torch.tensor(y, dtype=torch.float32)))\n \n def __len__(self):\n return len(self.days)\n \n def __getitem__(self, idx):\n return self.days[idx]\n\ndef collate_fn(batch):\n return batch\n\nclass CustomModel(Model):\n def __init__(self, d_feat=6, hidden_size=64, num_layers=2, dropout=0.2, n_epochs=100, lr=1e-3, batch_size=32, early_stop=15):\n super().__init__()\n self.d_feat = d_feat\n self.hidden_size = hidden_size\n self.num_layers = num_layers\n self.dropout = dropout\n self.n_epochs = n_epochs\n self.lr = lr\n self.batch_size = batch_size\n self.early_stop = early_stop\n \n self.device = torch.device(DEVICE)\n self.model = CustomModelNN(d_feat, hidden_size, num_layers, dropout).to(self.device)\n self.optimizer = optim.Adam(self.model.parameters(), lr=self.lr, weight_decay=1e-5)\n self.fitted = False\n\n def daily_loss_fn(self, p, l):\n if len(p) < 2:\n return torch.tensor(0.0).to(self.device)\n \n mse = F.mse_loss(p, l)\n \n p_mean = p.mean()\n l_mean = l.mean()\n p_std = p.std(unbiased=False) + 1e-6\n l_std = l.std(unbiased=False) + 1e-6\n \n p_norm = (p - p_mean) / p_std\n l_norm = (l - l_mean) / l_std\n \n cov = (p_norm * l_norm).mean()\n ic_loss = 1.0 - cov\n \n k = max(int(len(p) * 0.2), 1)\n if len(p) >= 2 * k:\n _, top_idx = torch.topk(l_norm, k)\n _, bot_idx = torch.topk(-l_norm, k)\n p_top = p_norm[top_idx]\n p_bot = p_norm[bot_idx]\n \n # Explicit margin objective: expected spread of normalized scores\n margin = (l_norm[top_idx].mean() - l_norm[bot_idx].mean()).detach()\n tb_loss = F.relu(margin - (p_top - p_bot)).mean()\n else:\n tb_loss = torch.tensor(0.0, device=self.device)\n 
\n return mse + 0.5 * ic_loss + 0.5 * tb_loss\n\n def train_epoch(self, dataloader):\n self.model.train()\n total_loss = 0\n n_days = 0\n \n for batch in dataloader:\n xs, ys = zip(*batch)\n lengths = [len(y) for y in ys]\n x_cat = torch.cat(xs, dim=0).to(self.device)\n y_cat = torch.cat(ys, dim=0).to(self.device)\n \n self.optimizer.zero_grad()\n pred_cat = self.model(x_cat)\n \n preds = torch.split(pred_cat, lengths)\n labels = torch.split(y_cat, lengths)\n \n loss = 0\n valid_days = 0\n for p, l in zip(preds, labels):\n d_loss = self.daily_loss_fn(p, l)\n if d_loss.requires_grad:\n loss += d_loss\n valid_days += 1\n \n if valid_days > 0:\n loss = loss / valid_days\n loss.backward()\n torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=1.0)\n self.optimizer.step()\n total_loss += loss.item() * valid_days\n n_days += valid_days\n \n return total_loss / max(n_days, 1)\n\n def eval_epoch(self, dataloader):\n self.model.eval()\n total_loss = 0\n n_days = 0\n \n with torch.no_grad():\n for batch in dataloader:\n xs, ys = zip(*batch)\n lengths = [len(y) for y in ys]\n x_cat = torch.cat(xs, dim=0).to(self.device)\n y_cat = torch.cat(ys, dim=0).to(self.device)\n \n pred_cat = self.model(x_cat)\n preds = torch.split(pred_cat, lengths)\n labels = torch.split(y_cat, lengths)\n \n for p, l in zip(preds, labels):\n d_loss = self.daily_loss_fn(p, l)\n if d_loss.requires_grad or d_loss.item() > 0: \n total_loss += d_loss.item()\n n_days += 1\n \n return total_loss / max(n_days, 1)\n\n def fit(self, dataset: DatasetH):\n df_train = dataset.prepare(\n \"train\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L\n )\n df_valid = dataset.prepare(\n \"valid\", col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_L\n )\n \n features_np = df_train[\"feature\"].values\n labels_np = df_train[\"label\"].values.reshape(len(df_train), -1)\n train_mask = ~(np.isnan(features_np).any(axis=1) | np.isnan(labels_np).any(axis=1))\n df_train = df_train[train_mask]\n 
\n features_np = df_valid[\"feature\"].values\n labels_np = df_valid[\"label\"].values.reshape(len(df_valid), -1)\n valid_mask = ~(np.isnan(features_np).any(axis=1) | np.isnan(labels_np).any(axis=1))\n df_valid = df_valid[valid_mask]\n \n train_dataset = DailyDataset(df_train)\n valid_dataset = DailyDataset(df_valid)\n \n train_loader = DataLoader(train_dataset, batch_size=self.batch_size, shuffle=True, collate_fn=collate_fn)\n valid_loader = DataLoader(valid_dataset, batch_size=self.batch_size, shuffle=False, collate_fn=collate_fn)\n \n best_loss = np.inf\n wait = 0\n best_model_state = None\n \n for epoch in range(self.n_epochs):\n train_loss = self.train_epoch(train_loader)\n valid_loss = self.eval_epoch(valid_loader)\n \n if valid_loss < best_loss:\n best_loss = valid_loss\n best_model_state = {k: v.cpu() for k, v in self.model.state_dict().items()}\n wait = 0\n else:\n wait += 1\n \n if wait >= self.early_stop:\n break\n \n if best_model_state is not None:\n self.model.load_state_dict(best_model_state)\n \n self.fitted = True\n\n def predict(self, dataset: DatasetH, segment=\"test\"):\n if not self.fitted:\n raise ValueError(\"Model is not fitted yet!\")\n \n df_test = dataset.prepare(\n segment, col_set=[\"feature\", \"label\"], data_key=DataHandlerLP.DK_I\n )\n features = df_test[\"feature\"]\n index = features.index\n \n features_np = features.values\n features_np = np.nan_to_num(features_np, nan=0.0)\n \n self.model.eval()\n preds = []\n batch_size = 4096\n with torch.no_grad():\n for i in range(0, len(features_np), batch_size):\n batch_x = torch.tensor(features_np[i:i+batch_size], dtype=torch.float32).to(self.device)\n batch_pred = self.model(batch_x)\n preds.append(batch_pred.cpu().numpy())\n \n preds = np.concatenate(preds)\n return pd.Series(preds, index=index, name=\"score\")"
7}
Tool Result (Step 5)
OK: Replaced lines 16..274 with 267 line(s) in qlib/custom_model.py. Editable range: 16–282.
[Current file: qlib/custom_model.py | editable: 16–282 | total: 282 lines]
13|
14| DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
15|
16 # =====================================================================
17 # EDITABLE: CustomModel — implement your stock prediction model here
18 # =====================================================================
19 import torch.optim as optim
20 from torch.utils.data import Dataset as TorchDataset, DataLoader
21
22 class Attention(nn.Module):
23 def __init__(self, hidden_size):
24 super().__init__()
25 self.w = nn.Parameter(torch.Tensor(hidden_size, 1))
26 nn.init.xavier_uniform_(self.w)
27
28 def forward(self, x):
29 e = torch.matmul(torch.tanh(x), self.w).squeeze(-1)
30 alpha = F.softmax(e, dim=-1).unsque
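The top-minus-bottom margin objective added in Step 5 can be illustrated in isolation (synthetic one-day cross-section; mirrors the tb_loss term in daily_loss_fn):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, k = 50, 10               # 50 stocks; top/bottom 20%
l = torch.randn(n)          # normalized labels for one day
p = torch.randn(n)          # normalized predictions

_, top_idx = torch.topk(l, k)   # best realized returns
_, bot_idx = torch.topk(-l, k)  # worst realized returns

# Penalize predictions whose top-vs-bottom spread falls short of the
# realized (detached) spread of the labels.
margin = (l[top_idx].mean() - l[bot_idx].mean()).detach()
tb_loss = F.relu(margin - (p[top_idx] - p[bot_idx])).mean()
assert tb_loss.item() >= 0.0  # hinge loss is non-negative
```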
Assistant (Step 6): test
Tool Result (Step 6)
[Test #3] (0 tests remaining)
### csi300 (scripts/csi300.sh)
Signal analysis (csi300):
SIGNAL_METRIC IC=0.048557
SIGNAL_METRIC ICIR=0.337149
SIGNAL_METRIC Rank_IC=0.064236
SIGNAL_METRIC Rank_ICIR=0.466806
Portfolio analysis (csi300):
PORTFOLIO_METRIC annualized_return=0.082884
PORTFOLIO_METRIC max_drawdown=-0.109733
PORTFOLIO_METRIC information_ratio=1.056978
### csi100 (scripts/csi100.sh)
Signal analysis (csi100):
SIGNAL_METRIC IC=0.047576
SIGNAL_METRIC ICIR=0.258331
SIGNAL_METRIC Rank_IC=0.058961
SIGNAL_METRIC Rank_ICIR=0.338480
Portfolio analysis (csi100):
PORTFOLIO_METRIC annualized_return=-0.000689
PORTFOLIO_METRIC max_drawdown=-0.115476
PORTFOLIO_METRIC information_ratio=-0.011614
### csi300_recent (scripts/csi300_recent.sh)
Signal analysis (csi300_recent):
SIGNAL_METRIC IC=0.029322
SIGNAL_METRIC ICIR=0.225186
SIGNAL_METRIC Rank_IC=0.047551
SIGNAL_METRIC Rank_ICIR=0.351356
Portfolio analysis (csi300_recent):
PORTFOLIO_METRIC annualized_return=0.097850
PORTFOLIO_METRIC max_draw