Agent Conversation: openai/gpt-5.4-pro
Model: openai/gpt-5.4-pro
Total steps: 5
Messages: 11
User (Step 0)
Task: ts-short-term-forecast
Short-Term Time Series Forecasting: Custom Model Design
Objective
Design and implement a custom deep learning model for univariate short-term time series forecasting on the M4 dataset. Your code goes in the Model class in models/Custom.py. Three reference implementations (DLinear, TimesNet, PatchTST) are provided as read-only.
Evaluation
Trained and evaluated on three M4 seasonal patterns:
- Monthly (pred_len=18, seq_len=104)
- Quarterly (pred_len=8, seq_len=52)
- Yearly (pred_len=6, seq_len=42)
All use enc_in=1, features=M, loss=SMAPE. Metric: SMAPE (lower is better).
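SMAPE on M4 follows the usual 0–200 convention (lower is better). A minimal reference implementation of the metric, as a sketch (assuming NumPy; not part of the task code):

```python
import numpy as np

def smape(y_true, y_pred, eps=1e-8):
    """Symmetric MAPE on the 0-200 scale used for M4 (lower is better)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # eps guards against division by zero when both values are 0
    denom = np.abs(y_true) + np.abs(y_pred) + eps
    return 200.0 * np.mean(np.abs(y_true - y_pred) / denom)
```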
Time-Series-Library/models/Custom.py [EDITABLE — entire file only]
1: import torch
2: import torch.nn as nn
3:
4:
5: class Model(nn.Module):
6: """
7: Custom model for short-term time series forecasting (M4 dataset).
8:
9: Forward signature: forward(x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None)
10: - x_enc: [batch, seq_len, enc_in] — input time series (enc_in=1 for M4)
11: - x_mark_enc: [batch, seq_len, time_features] — time feature encoding
12: - x_dec: [batch, label_len+pred_len, dec_in] — decoder input
13: - x_mark_dec: [batch, label_len+pred_len, time_features] — decoder time features
14:
15: Must return: [batch, pred_len, c_out] for forecasting
16: """
17:
18: def __init__(self, configs):
19: super(Model, self).__init__()
20: self.task_name = configs.task_name
21: self.seq_len = configs.seq_len
22: self.pred_len = configs.pred_len
23: self.enc_in = configs.enc_in
24: self.c_out = configs.c_out
25: # TODO: Define your model architecture here
26:
27: def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
28: """
29: Forecasting: given input sequence, predict future values.
30: Input: x_enc [batch, seq_len, 1]
31: Output: [batch, pred_len, 1]
32: """
33: # TODO: Implement your forecasting logic
34: batch_size = x_enc.shape[0]
35: return torch.zeros(batch_size, self.pred_len, self.c_out).to(x_enc.device)
36:
37: def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
38: if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
39: dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
40: return dec_out[:, -self.pred_len:, :]
41: return None
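The forward signature above can be smoke-tested with a minimal placeholder before any real architecture goes in. The sketch below (the class name `NaiveModel` is hypothetical, not part of the repository) repeats the last observed value across the horizon, which is the standard naive baseline on M4:

```python
import torch
import torch.nn as nn

class NaiveModel(nn.Module):
    """Hypothetical minimal baseline matching the required interface:
    repeat the last observed value across the forecast horizon."""
    def __init__(self, pred_len, c_out):
        super().__init__()
        self.pred_len = pred_len
        self.c_out = c_out

    def forward(self, x_enc, x_mark_enc=None, x_dec=None,
                x_mark_dec=None, mask=None):
        last = x_enc[:, -1:, :]                  # [B, 1, C], last time step
        return last.repeat(1, self.pred_len, 1)  # [B, pred_len, C]
```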
Time-Series-Library/models/DLinear.py [READ-ONLY — do not edit]
1: import torch
2: import torch.nn as nn
3: import torch.nn.functional as F
4: from layers.Autoformer_EncDec import series_decomp
5:
6:
7: class Model(nn.Module):
8: """
9: Paper link: https://a
Assistant (Step 1): edit
Tool Input
1{
2 "op": "replace",
3 "filename": "Time-Series-Library/models/Custom.py",
4 "content": "import math\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\nclass Model(nn.Module):\n \"\"\"\n Spectral-Phase Mixture forecaster.\n\n Core idea:\n 1) normalize around the last value to model short-term deltas,\n 2) extract a smooth trend with a learnable multi-kernel moving average,\n 3) model residual dynamics with a mixture of\n - local linear correction,\n - autocorrelation-weighted seasonal prototypes,\n - FFT-based spectral continuation,\n - phase retrieval from calendar/time features,\n 4) use sample-adaptive gates to combine the branches.\n \"\"\"\n\n def __init__(self, configs):\n super(Model, self).__init__()\n self.task_name = configs.task_name\n self.seq_len = configs.seq_len\n self.pred_len = configs.pred_len\n self.enc_in = configs.enc_in\n self.c_out = configs.c_out\n self.eps = 1e-5\n\n # Global and trend branches.\n self.direct_linear = nn.Linear(self.seq_len, self.pred_len)\n self.trend_linear = nn.Linear(self.seq_len, self.pred_len)\n\n # Short local correction branch on the residual.\n self.local_len = min(self.seq_len, max(12, 2 * self.pred_len))\n self.local_linear = nn.Linear(self.local_len, self.pred_len)\n\n # Multi-scale trend extraction.\n self.trend_kernels = self._build_trend_kernels(self.seq_len)\n self.trend_kernel_logits = nn.Parameter(torch.zeros(len(self.trend_kernels)))\n\n # Adaptive seasonal prototype branch.\n self.periods = self._build_periods(self.seq_len, self.pred_len)\n self.num_periods = len(self.periods)\n self.period_bias = nn.Parameter(torch.zeros(self.num_periods))\n self.period_decay = nn.Parameter(torch.full((self.num_periods,), -2.0))\n self.period_logit_scale = nn.Parameter(torch.tensor(1.0))\n\n # Spectral continuation branch.\n self.spectral_topk = min(6, max(2, self.seq_len // 16 + 2))\n self.spectral_rank = nn.Parameter(torch.zeros(self.spectral_topk))\n self.spectral_damping = nn.Parameter(torch.tensor(-1.5))\n\n # Trend extrapolation helpers.\n self.trend_window = 
min(self.seq_len, max(12, 2 * self.pred_len))\n trend_grid = torch.arange(self.trend_window).float()\n trend_grid = trend_grid - trend_grid.mean()\n self.register_buffer(\n 'trend_grid', trend_grid.view(1, self.trend_window, 1), persistent=False\n )\n self.register_buffer(\n 'trend_grid_denom', torch.tensor(float((trend_grid ** 2).sum().item() + 1e-6)), persistent=False\n )\n self.register_buffer(\n 'horizon', torch.arange(1, self.pred_len + 1).float().view(1, self.pred_len, 1), persistent=False\n )\n self.trend_slope_weight = nn.Parameter(torch.tensor(0.5))\n\n # Phase retrieval branch using raw time features.\n self.mark_temperature = nn.Parameter(torch.tensor(-1.0))\n self.mark_recency = nn.Parameter(torch.tensor(0.0))\n\n # Sample-adaptive gates.\n gate_dim = 6 + self.num_periods\n gate_hidden = max(16, min(64, getattr(configs, 'd_model', 32)))\n self.base_gate = nn.Sequential(\n nn.Linear(gate_dim, gate_hidden),\n nn.GELU(),\n nn.Linear(gate_hidden, 2)\n )\n self.resid_gate = nn.Sequential(\n nn.Linear(gate_dim, gate_hidden),\n nn.GELU(),\n nn.Linear(gate_hidden, 4)\n )\n self.residual_scale = nn.Parameter(torch.tensor(-1.0))\n\n # Optional output projection when c_out != enc_in.\n self.output_projection = nn.Identity()\n if self.c_out != self.enc_in:\n self.output_projection = nn.Linear(self.enc_in, self.c_out)\n\n self._reset_parameters()\n\n @staticmethod\n def _build_trend_kernels(seq_len):\n kernels = []\n for k in [3, 5, 7, 11, 17]:\n if k < seq_len:\n kernels.append(k if k % 2 == 1 else k + 1)\n if not kernels:\n kernels = [1]\n return kernels\n\n @staticmethod\n def _build_periods(seq_len, pred_len):\n candidates = {\n 1, 2, 3, 4, 6, 8, 12, 24,\n max(1, pred_len // 2),\n pred_len,\n min(seq_len - 1, pred_len * 2),\n }\n periods = []\n for p in sorted(candidates):\n if p <= 0 or p >= seq_len:\n continue\n if (seq_len // p) >= 2 or p <= pred_len:\n periods.append(p)\n return periods if periods else [1]\n\n def _reset_parameters(self):\n with 
torch.no_grad():\n self.direct_linear.weight.fill_(1.0 / max(self.seq_len, 1))\n self.direct_linear.bias.zero_()\n self.trend_linear.weight.fill_(1.0 / max(self.seq_len, 1))\n self.trend_linear.bias.zero_()\n self.local_linear.weight.zero_()\n self.local_linear.bias.zero_()\n\n nn.init.zeros_(self.base_gate[-1].weight)\n nn.init.zeros_(self.base_gate[-1].bias)\n nn.init.zeros_(self.resid_gate[-1].weight)\n nn.init.zeros_(self.resid_gate[-1].bias)\n\n if isinstance(self.output_projection, nn.Linear):\n nn.init.xavier_uniform_(self.output_projection.weight)\n if self.output_projection.bias is not None:\n self.output_projection.bias.zero_()\n\n def _moving_average(self, x, kernel_size):\n if kernel_size <= 1:\n return x\n pad = (kernel_size - 1) // 2\n x_t = x.transpose(1, 2)\n x_t = F.pad(x_t, (pad, pad), mode='replicate')\n x_t = F.avg_pool1d(x_t, kernel_size=kernel_size, stride=1)\n return x_t.transpose(1, 2)\n\n def _extract_trend(self, x):\n weights = torch.softmax(self.trend_kernel_logits, dim=0)\n trend = 0.0\n for w, k in zip(weights, self.trend_kernels):\n trend = trend + w * self._moving_average(x, k)\n return trend\n\n def _temporal_linear(self, x, linear):\n return linear(x.transpose(1, 2)).transpose(1, 2)\n\n def _periodic_residual(self, resid):\n bsz, seq_len, channels = resid.shape\n period_forecasts = []\n corr_scores = []\n\n for idx, p in enumerate(self.periods):\n n_cycles = max(1, seq_len // p)\n usable = n_cycles * p\n segment = resid[:, -usable:, :].contiguous()\n cycles = segment.reshape(bsz, n_cycles, p, channels)\n\n decay = F.softplus(self.period_decay[idx]) + 1e-4\n cycle_pos = torch.arange(n_cycles, device=resid.device, dtype=resid.dtype)\n cycle_w = torch.exp(-decay * (n_cycles - 1 - cycle_pos))\n cycle_w = cycle_w / cycle_w.sum().clamp_min(self.eps)\n template = (cycles * cycle_w.view(1, n_cycles, 1, 1)).sum(dim=1)\n\n repeat_factor = (self.pred_len + p - 1) // p + 1\n forecast = template.repeat(1, repeat_factor, 1)[:, :self.pred_len, 
:]\n period_forecasts.append(forecast)\n\n if p < seq_len:\n x1 = resid[:, p:, :]\n x2 = resid[:, :-p, :]\n numerator = (x1 * x2).mean(dim=(1, 2))\n denom = torch.sqrt(\n (x1.pow(2).mean(dim=(1, 2)) + self.eps) *\n (x2.pow(2).mean(dim=(1, 2)) + self.eps)\n )\n corr = numerator / denom.clamp_min(self.eps)\n else:\n corr = torch.zeros(bsz, device=resid.device, dtype=resid.dtype)\n corr_scores.append(corr)\n\n period_stack = torch.stack(period_forecasts, dim=1)\n corr_stack = torch.stack(corr_scores, dim=1)\n score_scale = F.softplus(self.period_logit_scale) + 1.0\n period_w = torch.softmax(score_scale * corr_stack + self.period_bias.view(1, -1), dim=1)\n mixed_period = (period_stack * period_w[:, :, None, None]).sum(dim=1)\n return mixed_period, corr_stack\n\n def _spectral_residual(self, resid):\n # resid: [B, T, C]\n signal = resid.transpose(1, 2) # [B, C, T]\n freq = torch.fft.rfft(signal, dim=-1)\n mag = freq.abs()\n if mag.shape[-1] <= 1:\n zeros = resid.new_zeros(resid.shape[0], self.pred_len, resid.shape[2])\n strength = resid.new_zeros(resid.shape[0], 1)\n return zeros, strength\n\n mag = mag.clone()\n mag[..., 0] = 0.0\n k = min(self.spectral_topk, mag.shape[-1] - 1)\n top_vals, top_idx = torch.topk(mag, k=k, dim=-1)\n selected = torch.gather(freq, dim=-1, index=top_idx)\n\n future_steps = torch.arange(\n self.seq_len, self.seq_len + self.pred_len,\n device=resid.device,\n dtype=resid.dtype\n ).view(1, 1, 1, self.pred_len)\n\n phase = 2.0 * math.pi * top_idx.to(resid.dtype).unsqueeze(-1) * future_steps / float(self.seq_len)\n basis = torch.complex(torch.cos(phase), torch.sin(phase))\n\n coeff_scale = torch.full_like(top_idx, 2.0, dtype=resid.dtype)\n if self.seq_len % 2 == 0:\n nyquist = (top_idx == (mag.shape[-1] - 1))\n coeff_scale = torch.where(nyquist, torch.ones_like(coeff_scale), coeff_scale)\n\n rank_scale = torch.sigmoid(self.spectral_rank[:k]).view(1, 1, k, 1)\n damp = 1.0 / (\n 1.0 + F.softplus(self.spectral_damping) * 
(top_idx.to(resid.dtype).unsqueeze(-1) / float(self.seq_len)).pow(2)\n )\n\n contribution = coeff_scale.unsqueeze(-1) * (selected.unsqueeze(-1) * basis).real / float(self.seq_len)\n spectral = (contribution * rank_scale * damp).sum(dim=-2) # [B, C, H]\n strength = (top_vals.sum(dim=-1) / mag.sum(dim=-1).clamp_min(self.eps)).mean(dim=1, keepdim=True)\n return spectral.transpose(1, 2), strength\n\n def _mark_residual(self, resid, x_mark_enc, x_mark_dec):\n if x_mark_enc is None or x_mark_dec is None:\n return resid.new_zeros(resid.shape[0], self.pred_len, resid.shape[2])\n if x_mark_enc.numel() == 0 or x_mark_dec.n\n... (truncated)",
5 "after_line": 0,
6 "start_line": 1,
7 "end_line": 41
8}
Tool Result (Step 1)
OK: Replaced lines 1..41 with 340 line(s) in Time-Series-Library/models/Custom.py. Editable range: entire file.
[Current file: Time-Series-Library/models/Custom.py | editable: 1–340 | total: 340 lines]
1 import math
2 import torch
3 import torch.nn as nn
4 import torch.nn.functional as F
5
6
7 class Model(nn.Module):
8 """
9 Spectral-Phase Mixture forecaster.
10
11 Core idea:
12 1) normalize around the last value to model short-term deltas,
13 2) extract a smooth trend with a learnable multi-kernel moving average,
14 3) model residual dynamics with a mixture of
15 - local linear correction,
16 - autocorrelation-weighted seasonal prototypes,
17 - FFT-based spectral continuation,
18 - phase retrieval from calendar/time features,
19 4) use sample-adaptive gates to combine the branches.
20 """
21
22 def _
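The multi-kernel trend extraction named in the docstring rests on a length-preserving centered moving average. A standalone sketch of that building block, mirroring what the listing above describes (assumptions: replicate padding, odd kernel sizes):

```python
import torch
import torch.nn.functional as F

def moving_average(x, kernel_size):
    """Centered moving average over time with replicate padding.
    x: [batch, time, channels]; an odd kernel_size preserves the length."""
    if kernel_size <= 1:
        return x
    pad = (kernel_size - 1) // 2
    x_t = x.transpose(1, 2)                         # [B, C, T] for pooling
    x_t = F.pad(x_t, (pad, pad), mode='replicate')  # extend edges, no zeros
    x_t = F.avg_pool1d(x_t, kernel_size=kernel_size, stride=1)
    return x_t.transpose(1, 2)                      # back to [B, T, C]
```

A multi-kernel trend is then just a softmax-weighted sum of `moving_average(x, k)` over several kernel sizes, as the docstring outlines.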
Assistant (Step 2): test
Tool Result (Step 2)
[Test #1] (2 tests remaining; call submit(n=1) to submit this result) ### m4_monthly (scripts/m4_monthly.sh) Training metrics (last epochs): TRAIN_METRICS epoch=5 train_loss=7.7243936 vali_loss=12.8945603 TRAIN_METRICS epoch=6 train_loss=7.6604371 vali_loss=12.8915765 TRAIN_METRICS epoch=7 train_loss=7.6509253 vali_loss=12.8983146 TRAIN_METRICS epoch=8 train_loss=7.6580415 vali_loss=12.8929502 TRAIN_METRICS epoch=9 train_loss=7.6745193 vali_loss=12.8924033 Test results (m4_monthly): SMAPE: 12.8916, MAPE: 12.8916, MAPE: 0.1526 ### m4_quarterly (scripts/m4_quarterly.sh) Training metrics (last epochs): TRAIN_METRICS epoch=6 train_loss=7.0763981 vali_loss=10.2148691 TRAIN_METRICS epoch=7 train_loss=7.0101592 vali_loss=10.2117707 TRAIN_METRICS epoch=8 train_loss=7.0858238 vali_loss=10.2093946 TRAIN_METRICS epoch=9 train_loss=7.1730375 vali_loss=10.2091106 TRAIN_METRICS epoch=10 train_loss=7.0757009 vali_loss=10.2084213 Test results (m4_quarterly): SMAPE: 10.2084, MAPE: 10.2084, MAPE:
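The autocorrelation scores that weight the seasonal prototypes can be sketched in isolation. `period_correlation` below is a hypothetical standalone version of that scoring, a normalized lag-`period` correlation of the residual series:

```python
import torch

def period_correlation(resid, period, eps=1e-5):
    """Normalized lag-`period` autocorrelation of a residual series.
    resid: [batch, time, channels] -> [batch] score in roughly [-1, 1]."""
    x1 = resid[:, period:, :]    # series shifted forward by `period`
    x2 = resid[:, :-period, :]   # aligned unshifted slice
    num = (x1 * x2).mean(dim=(1, 2))
    den = torch.sqrt((x1.pow(2).mean(dim=(1, 2)) + eps) *
                     (x2.pow(2).mean(dim=(1, 2)) + eps))
    return num / den.clamp_min(eps)
```

A score near 1 indicates strong evidence for that period; the model's softmax over these scores then decides how much each seasonal prototype contributes.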
Assistant (Step 3): edit
Tool Input
1{
2 "op": "replace",
3 "filename": "Time-Series-Library/models/Custom.py",
4 "content": "import math\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\nclass Model(nn.Module):\n \"\"\"\n Spectral-Phase Mixture forecaster.\n\n Core idea:\n 1) normalize around the last value to model short-term deltas,\n 2) extract a smooth trend with a learnable multi-kernel moving average,\n 3) model residual dynamics with a mixture of\n - local linear correction,\n - autocorrelation-weighted seasonal prototypes,\n - FFT-based spectral continuation,\n - phase retrieval from calendar/time features,\n 4) use sample-adaptive gates to combine the branches.\n\n The second version adds confidence-weighted residual activation:\n seasonal residual branches are only strongly activated when the input shows\n clear periodic evidence (autocorrelation or concentrated spectral power),\n which is especially important for weakly-seasonal yearly series.\n \"\"\"\n\n def __init__(self, configs):\n super(Model, self).__init__()\n self.task_name = configs.task_name\n self.seq_len = configs.seq_len\n self.pred_len = configs.pred_len\n self.enc_in = configs.enc_in\n self.c_out = configs.c_out\n self.eps = 1e-5\n\n # Global and trend branches.\n self.direct_linear = nn.Linear(self.seq_len, self.pred_len)\n self.trend_linear = nn.Linear(self.seq_len, self.pred_len)\n\n # Short local correction branch on the residual.\n self.local_len = min(self.seq_len, max(12, 2 * self.pred_len))\n self.local_linear = nn.Linear(self.local_len, self.pred_len)\n\n # Multi-scale trend extraction.\n self.trend_kernels = self._build_trend_kernels(self.seq_len)\n self.trend_kernel_logits = nn.Parameter(torch.zeros(len(self.trend_kernels)))\n\n # Adaptive seasonal prototype branch.\n self.periods = self._build_periods(self.seq_len, self.pred_len)\n self.num_periods = len(self.periods)\n self.period_bias = nn.Parameter(torch.zeros(self.num_periods))\n self.period_decay = nn.Parameter(torch.full((self.num_periods,), -2.0))\n self.period_logit_scale = 
nn.Parameter(torch.tensor(1.0))\n\n # Spectral continuation branch.\n self.spectral_topk = min(6, max(2, self.seq_len // 16 + 2))\n self.spectral_rank = nn.Parameter(torch.zeros(self.spectral_topk))\n self.spectral_damping = nn.Parameter(torch.tensor(-1.5))\n\n # Trend extrapolation helpers.\n self.trend_window = min(self.seq_len, max(12, 2 * self.pred_len))\n trend_grid = torch.arange(self.trend_window).float()\n trend_grid = trend_grid - trend_grid.mean()\n self.register_buffer(\n 'trend_grid', trend_grid.view(1, self.trend_window, 1), persistent=False\n )\n self.register_buffer(\n 'trend_grid_denom', torch.tensor(float((trend_grid ** 2).sum().item() + 1e-6)), persistent=False\n )\n self.register_buffer(\n 'horizon', torch.arange(1, self.pred_len + 1).float().view(1, self.pred_len, 1), persistent=False\n )\n self.trend_slope_weight = nn.Parameter(torch.tensor(0.5))\n\n # Phase retrieval branch using raw time features.\n self.mark_temperature = nn.Parameter(torch.tensor(-1.0))\n self.mark_recency = nn.Parameter(torch.tensor(0.0))\n\n # Sample-adaptive gates.\n gate_dim = 6 + self.num_periods\n gate_hidden = max(16, min(64, getattr(configs, 'd_model', 32)))\n self.base_gate = nn.Sequential(\n nn.Linear(gate_dim, gate_hidden),\n nn.GELU(),\n nn.Linear(gate_hidden, 2)\n )\n self.seasonal_gate = nn.Sequential(\n nn.Linear(gate_dim, gate_hidden),\n nn.GELU(),\n nn.Linear(gate_hidden, 3)\n )\n self.local_residual_scale = nn.Parameter(torch.tensor(-1.0))\n self.residual_scale = nn.Parameter(torch.tensor(-1.0))\n\n # Optional output projection when c_out != enc_in.\n self.output_projection = nn.Identity()\n if self.c_out != self.enc_in:\n self.output_projection = nn.Linear(self.enc_in, self.c_out)\n\n self._reset_parameters()\n\n @staticmethod\n def _build_trend_kernels(seq_len):\n kernels = []\n for k in [3, 5, 7, 11, 17]:\n if k < seq_len:\n kernels.append(k if k % 2 == 1 else k + 1)\n if not kernels:\n kernels = [1]\n return kernels\n\n @staticmethod\n def 
_build_periods(seq_len, pred_len):\n candidates = {\n 1, 2, 3, 4, 6, 8, 12, 24,\n max(1, pred_len // 2),\n pred_len,\n min(seq_len - 1, pred_len * 2),\n }\n periods = []\n for p in sorted(candidates):\n if p <= 0 or p >= seq_len:\n continue\n if (seq_len // p) >= 2 or p <= pred_len:\n periods.append(p)\n return periods if periods else [1]\n\n def _reset_parameters(self):\n with torch.no_grad():\n self.direct_linear.weight.fill_(1.0 / max(self.seq_len, 1))\n self.direct_linear.bias.zero_()\n self.trend_linear.weight.fill_(1.0 / max(self.seq_len, 1))\n self.trend_linear.bias.zero_()\n self.local_linear.weight.zero_()\n self.local_linear.bias.zero_()\n\n nn.init.zeros_(self.base_gate[-1].weight)\n self.base_gate[-1].bias.copy_(torch.tensor([-0.25, 0.25]))\n\n nn.init.zeros_(self.seasonal_gate[-1].weight)\n self.seasonal_gate[-1].bias.copy_(torch.tensor([0.4, 0.1, -0.3]))\n\n if isinstance(self.output_projection, nn.Linear):\n nn.init.xavier_uniform_(self.output_projection.weight)\n if self.output_projection.bias is not None:\n self.output_projection.bias.zero_()\n\n def _moving_average(self, x, kernel_size):\n if kernel_size <= 1:\n return x\n pad = (kernel_size - 1) // 2\n x_t = x.transpose(1, 2)\n x_t = F.pad(x_t, (pad, pad), mode='replicate')\n x_t = F.avg_pool1d(x_t, kernel_size=kernel_size, stride=1)\n return x_t.transpose(1, 2)\n\n def _extract_trend(self, x):\n weights = torch.softmax(self.trend_kernel_logits, dim=0)\n trend = 0.0\n for w, k in zip(weights, self.trend_kernels):\n trend = trend + w * self._moving_average(x, k)\n return trend\n\n def _temporal_linear(self, x, linear):\n return linear(x.transpose(1, 2)).transpose(1, 2)\n\n def _periodic_residual(self, resid):\n bsz, seq_len, channels = resid.shape\n period_forecasts = []\n corr_scores = []\n\n for idx, p in enumerate(self.periods):\n n_cycles = max(1, seq_len // p)\n usable = n_cycles * p\n segment = resid[:, -usable:, :].contiguous()\n cycles = segment.reshape(bsz, n_cycles, p, channels)\n\n 
decay = F.softplus(self.period_decay[idx]) + 1e-4\n cycle_pos = torch.arange(n_cycles, device=resid.device, dtype=resid.dtype)\n cycle_w = torch.exp(-decay * (n_cycles - 1 - cycle_pos))\n cycle_w = cycle_w / cycle_w.sum().clamp_min(self.eps)\n template = (cycles * cycle_w.view(1, n_cycles, 1, 1)).sum(dim=1)\n\n repeat_factor = (self.pred_len + p - 1) // p + 1\n forecast = template.repeat(1, repeat_factor, 1)[:, :self.pred_len, :]\n period_forecasts.append(forecast)\n\n if p < seq_len:\n x1 = resid[:, p:, :]\n x2 = resid[:, :-p, :]\n numerator = (x1 * x2).mean(dim=(1, 2))\n denom = torch.sqrt(\n (x1.pow(2).mean(dim=(1, 2)) + self.eps) *\n (x2.pow(2).mean(dim=(1, 2)) + self.eps)\n )\n corr = numerator / denom.clamp_min(self.eps)\n else:\n corr = torch.zeros(bsz, device=resid.device, dtype=resid.dtype)\n corr_scores.append(corr)\n\n period_stack = torch.stack(period_forecasts, dim=1)\n corr_stack = torch.stack(corr_scores, dim=1)\n score_scale = F.softplus(self.period_logit_scale) + 1.0\n period_w = torch.softmax(score_scale * corr_stack + self.period_bias.view(1, -1), dim=1)\n mixed_period = (period_stack * period_w[:, :, None, None]).sum(dim=1)\n return mixed_period, corr_stack\n\n def _spectral_residual(self, resid):\n # resid: [B, T, C]\n signal = resid.transpose(1, 2) # [B, C, T]\n freq = torch.fft.rfft(signal, dim=-1)\n mag = freq.abs()\n if mag.shape[-1] <= 1:\n zeros = resid.new_zeros(resid.shape[0], self.pred_len, resid.shape[2])\n strength = resid.new_zeros(resid.shape[0], 1)\n return zeros, strength\n\n mag = mag.clone()\n mag[..., 0] = 0.0\n k = min(self.spectral_topk, mag.shape[-1] - 1)\n top_vals, top_idx = torch.topk(mag, k=k, dim=-1)\n selected = torch.gather(freq, dim=-1, index=top_idx)\n\n future_steps = torch.arange(\n self.seq_len, self.seq_len + self.pred_len,\n device=resid.device,\n dtype=resid.dtype\n ).view(1, 1, 1, self.pred_len)\n\n phase = 2.0 * math.pi * top_idx.to(resid.dtype).unsqueeze(-1) * future_steps / float(self.seq_len)\n basis = 
torch.complex(torch.cos(phase), torch.sin(phase))\n\n coeff_scale = torch.full_like(top_idx, 2.0, dtype=resid.dtype)\n if self.seq_len % 2 == 0:\n nyquist = (top_idx == (mag.shape[-1] - 1))\n coeff_scale = torch.where(nyquist, torch.ones_like(coeff_scale), coeff_scale)\n\n rank_scale = torch.sigmoid(self.spectral_rank[:k]).view(1, 1, k, 1)\n damp = 1.0 / (\n 1.0 + F.softplus(self.spectral_damping) * (top_idx.to(resid.dtype).unsqueeze(-1) / float(self.seq_len)).pow(2)\n )\n\n contribution = coeff_scale.unsqueeze(-1) * (selected.unsqueeze(-1) * basis).real / float(self.seq_len)\n spectral = (contribution * rank_scale * damp).sum(dim=-2)\n... (truncated)",
5 "after_line": 0,
6 "start_line": 1,
7 "end_line": 340
8}
Tool Result (Step 3)
OK: Replaced lines 1..340 with 357 line(s) in Time-Series-Library/models/Custom.py. Editable range: entire file.
[Current file: Time-Series-Library/models/Custom.py | editable: 1–357 | total: 357 lines]
1 import math
2 import torch
3 import torch.nn as nn
4 import torch.nn.functional as F
5
6
7 class Model(nn.Module):
8 """
9 Spectral-Phase Mixture forecaster.
10
11 Core idea:
12 1) normalize around the last value to model short-term deltas,
13 2) extract a smooth trend with a learnable multi-kernel moving average,
14 3) model residual dynamics with a mixture of
15 - local linear correction,
16 - autocorrelation-weighted seasonal prototypes,
17 - FFT-based spectral continuation,
18 - phase retrieval from calendar/time features,
19 4) use sample-adaptive gates to combine the branches.
20
21 The second version a
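The FFT-based spectral continuation branch can likewise be sketched on its own: keep the top-k non-DC rFFT bins and evaluate their sinusoids past the input window. This is a simplified sketch; the Nyquist-bin scaling, learnable rank weights, and frequency damping from the full model are omitted:

```python
import math
import torch

def spectral_continuation(resid, pred_len, topk=3):
    """Extrapolate a residual series by continuing its dominant sinusoids.
    resid: [batch, time, channels] -> [batch, pred_len, channels]."""
    b, t, c = resid.shape
    freq = torch.fft.rfft(resid.transpose(1, 2), dim=-1)  # [B, C, T//2+1]
    mag = freq.abs()
    mag[..., 0] = 0.0                                     # drop the DC bin
    k = min(topk, mag.shape[-1] - 1)
    _, idx = torch.topk(mag, k=k, dim=-1)                 # dominant bins
    coeff = torch.gather(freq, -1, idx)                   # their coefficients
    steps = torch.arange(t, t + pred_len, dtype=resid.dtype,
                         device=resid.device)             # future time indices
    phase = 2.0 * math.pi * idx.to(resid.dtype).unsqueeze(-1) * steps / t
    # real-signal reconstruction: each kept bin contributes 2*Re(c*e^{i phi})/T
    comp = 2.0 * (coeff.unsqueeze(-1) * torch.exp(1j * phase)).real / t
    return comp.sum(dim=-2).transpose(1, 2)               # [B, pred_len, C]
```

On a pure sinusoid whose frequency falls on an FFT bin, this continuation is exact up to float precision, which is what makes it a useful prior for strongly periodic residuals.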
Assistant (Step 4): test
Tool Result (Step 4)
[Test #2] (1 test remaining; call submit(n=2) to submit this result) ### m4_monthly (scripts/m4_monthly.sh) Training metrics (last epochs): TRAIN_METRICS epoch=6 train_loss=7.7265321 vali_loss=12.9496999 TRAIN_METRICS epoch=7 train_loss=7.7105687 vali_loss=12.9217496 TRAIN_METRICS epoch=8 train_loss=7.7061868 vali_loss=12.9313660 TRAIN_METRICS epoch=9 train_loss=7.7482374 vali_loss=12.9236192 TRAIN_METRICS epoch=10 train_loss=7.7214334 vali_loss=12.9254383 Test results (m4_monthly): SMAPE: 12.9217, MAPE: 12.9217, MAPE: 0.1536 ### m4_quarterly (scripts/m4_quarterly.sh) Training metrics (last epochs): TRAIN_METRICS epoch=6 train_loss=7.0951350 vali_loss=10.2446976 TRAIN_METRICS epoch=7 train_loss=7.0474038 vali_loss=10.2277858 TRAIN_METRICS epoch=8 train_loss=7.1529060 vali_loss=10.2211119 TRAIN_METRICS epoch=9 train_loss=7.0549898 vali_loss=10.2208495 TRAIN_METRICS epoch=10 train_loss=7.1651919 vali_loss=10.2221787 Test results (m4_quarterly): SMAPE: 10.2209, MAPE: 10.2209, MAPE:
Assistant (Step 5): submit
Tool Input
1{
2 "n": 1
3}
Tool Result (Step 5)
[submit] Submitting result from test #1 as final. ### m4_monthly (scripts/m4_monthly.sh) Training metrics (last epochs): TRAIN_METRICS epoch=5 train_loss=7.7243936 vali_loss=12.8945603 TRAIN_METRICS epoch=6 train_loss=7.6604371 vali_loss=12.8915765 TRAIN_METRICS epoch=7 train_loss=7.6509253 vali_loss=12.8983146 TRAIN_METRICS epoch=8 train_loss=7.6580415 vali_loss=12.8929502 TRAIN_METRICS epoch=9 train_loss=7.6745193 vali_loss=12.8924033 Test results (m4_monthly): SMAPE: 12.8916, MAPE: 12.8916, MAPE: 0.1526 ### m4_quarterly (scripts/m4_quarterly.sh) Training metrics (last epochs): TRAIN_METRICS epoch=6 train_loss=7.0763981 vali_loss=10.2148691 TRAIN_METRICS epoch=7 train_loss=7.0101592 vali_loss=10.2117707 TRAIN_METRICS epoch=8 train_loss=7.0858238 vali_loss=10.2093946 TRAIN_METRICS epoch=9 train_loss=7.1730375 vali_loss=10.2091106 TRAIN_METRICS epoch=10 train_loss=7.0757009 vali_loss=10.2084213 Test results (m4_quarterly): SMAPE: 10.2084, MAPE: 10.2084, MAPE: 0.1176 ### m4_yearl