Agent Conversation: openai/gpt-5.4-pro
Model: openai/gpt-5.4-pro
Total steps: 5
Messages: 11
User (Step 0)
Task: ts-anomaly-detection
Time Series Anomaly Detection: Custom Model Design
Objective
Design and implement a custom deep learning model for unsupervised time series anomaly detection via reconstruction. Your code goes in the Model class in models/Custom.py. Three reference implementations (DLinear, TimesNet, PatchTST) are provided as read-only.
Evaluation
The model is trained and evaluated on three anomaly detection datasets:
- PSM (25 variables, server machine dataset)
- MSL (55 variables, Mars Science Laboratory)
- SMAP (25 variables, Soil Moisture Active Passive satellite)
All use seq_len=100, anomaly_ratio=1. Metric: F-score (higher is better).
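The harness computes per-timestamp reconstruction MSE and thresholds it to label anomalies; its exact scoring code is not shown in this transcript, but a minimal sketch of the usual convention, where `anomaly_ratio` picks the percentile cut, looks like this (all names are illustrative assumptions):

```python
import numpy as np

# Sketch of reconstruction-energy scoring under the usual convention
# (illustrative only; the harness's actual code is not shown above).
def score_anomalies(x, x_rec, anomaly_ratio=1.0):
    # x, x_rec: [num_windows, seq_len, channels]
    energy = np.mean((x - x_rec) ** 2, axis=-1).reshape(-1)  # per-timestep MSE
    threshold = np.percentile(energy, 100 - anomaly_ratio)   # e.g. 99th pct
    return (energy > threshold).astype(int), threshold
```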
Time-Series-Library/models/Custom.py [EDITABLE — entire file only]
1: import torch
2: import torch.nn as nn
3:
4:
5: class Model(nn.Module):
6: """
7: Custom model for time series anomaly detection.
8:
9: Forward signature: forward(x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None)
10: - x_enc: [batch, seq_len, enc_in] — input time series
11: - x_mark_enc: not used for anomaly detection (None)
12: - x_dec: not used for anomaly detection (None)
13: - x_mark_dec: not used for anomaly detection (None)
14:
15: Must return: [batch, seq_len, c_out] — reconstructed sequence
16: The framework computes MSE between input and output for anomaly scoring.
17: """
18:
19: def __init__(self, configs):
20: super(Model, self).__init__()
21: self.task_name = configs.task_name
22: self.seq_len = configs.seq_len
23: self.pred_len = configs.seq_len # For anomaly detection, pred_len = seq_len
24: self.enc_in = configs.enc_in
25: self.c_out = configs.c_out
26: # TODO: Define your model architecture here
27:
28: def anomaly_detection(self, x_enc):
29: """
30: Anomaly detection: reconstruct the input sequence.
31: Input: x_enc [batch, seq_len, enc_in]
32: Output: [batch, seq_len, c_out]
33: """
34: # TODO: Implement your reconstruction logic
35: return x_enc # Placeholder: identity reconstruction
36:
37: def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
38: if self.task_name == 'anomaly_detection':
39: dec_out = self.anomaly_detection(x_enc)
40: return dec_out
41: return None
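A quick smoke test of this forward contract (the `SimpleNamespace` stand-in for `configs` is an assumption for illustration; the framework passes its own config object with these fields):

```python
from types import SimpleNamespace

import torch

# Hypothetical smoke test of the skeleton above; PSM-like shapes assumed.
configs = SimpleNamespace(task_name='anomaly_detection', seq_len=100,
                          enc_in=25, c_out=25)
model = Model(configs)
x_enc = torch.randn(4, configs.seq_len, configs.enc_in)
out = model(x_enc, None, None, None)
assert out.shape == (4, configs.seq_len, configs.c_out)
```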
Time-Series-Library/models/DLinear.py [READ-ONLY — do not edit]
1: import torch
2: import torch.nn as nn
3: import torch.nn.functional as F
4: from layers.Autoformer_EncDec import series_decomp
5:
6:
7: class Model(nn.Module):
8: """
9: Paper link: https://arxiv.org/pdf/2205.13504.pdf
10: """
11:
12: def __init__(self, configs, iAssistantStep 1edit
Tool Input
1{
2 "op": "replace",
3 "filename": "Time-Series-Library/models/Custom.py",
4 "content": "import math\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\nclass MaskedDepthwiseConv1d(nn.Module):\n \"\"\"Depthwise convolution with the center weight removed (blind-spot conv).\"\"\"\n\n def __init__(self, channels, kernel_size):\n super().__init__()\n if kernel_size % 2 == 0:\n raise ValueError(\"kernel_size must be odd for blind-spot convolution\")\n self.channels = channels\n self.kernel_size = kernel_size\n self.weight = nn.Parameter(torch.empty(channels, 1, kernel_size))\n self.bias = nn.Parameter(torch.zeros(channels))\n\n mask = torch.ones(1, 1, kernel_size)\n mask[..., kernel_size // 2] = 0.0\n self.register_buffer('mask', mask)\n nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))\n\n def forward(self, x):\n # x: [B, C, L]\n pad = self.kernel_size // 2\n x = F.pad(x, (pad, pad), mode='replicate')\n weight = self.weight * self.mask\n return F.conv1d(x, weight, bias=self.bias, groups=self.channels)\n\n\nclass LeaveOneOutCrossBlock(nn.Module):\n \"\"\"Query attends to all timestamps except itself via a diagonal mask.\"\"\"\n\n def __init__(self, d_model, n_heads, d_ff, dropout):\n super().__init__()\n self.q_norm = nn.LayerNorm(d_model)\n self.kv_norm = nn.LayerNorm(d_model)\n self.attn = nn.MultiheadAttention(\n embed_dim=d_model,\n num_heads=n_heads,\n dropout=dropout,\n batch_first=True,\n )\n self.dropout = nn.Dropout(dropout)\n self.ff_norm = nn.LayerNorm(d_model)\n self.ff = nn.Sequential(\n nn.Linear(d_model, d_ff),\n nn.GELU(),\n nn.Dropout(dropout),\n nn.Linear(d_ff, d_model),\n nn.Dropout(dropout),\n )\n\n def forward(self, q, kv, attn_mask):\n q_norm = self.q_norm(q)\n kv_norm = self.kv_norm(kv)\n attn_out, _ = self.attn(\n q_norm,\n kv_norm,\n kv_norm,\n attn_mask=attn_mask,\n need_weights=False,\n )\n q = q + self.dropout(attn_out)\n q = q + self.ff(self.ff_norm(q))\n return q\n\n\nclass PrototypeMemory(nn.Module):\n \"\"\"Sharpened prototype projection to keep reconstructions on a normal-pattern manifold.\"\"\"\n\n def __init__(self, d_model, memory_size=32, dropout=0.1, temperature=0.2):\n super().__init__()\n self.temperature = temperature\n self.memory = nn.Parameter(torch.randn(memory_size, d_model) / math.sqrt(d_model))\n self.post = nn.Sequential(\n nn.Linear(2 * d_model, d_model),\n nn.GELU(),\n nn.Dropout(dropout),\n nn.Linear(d_model, d_model),\n )\n\n def forward(self, x):\n # x: [B, L, D]\n x_norm = F.normalize(x, dim=-1)\n mem_norm = F.normalize(self.memory, dim=-1)\n scores = torch.matmul(x_norm, mem_norm.transpose(0, 1)) / self.temperature\n weights = torch.softmax(scores, dim=-1)\n # self-sharpening without adding an auxiliary loss\n weights = weights.pow(2)\n weights = weights / (weights.sum(dim=-1, keepdim=True) + 1e-6)\n proto = torch.matmul(weights, self.memory)\n return self.post(torch.cat([x, proto], dim=-1))\n\n\nclass Model(nn.Module):\n \"\"\"\n Consensus-Routed Leave-One-Out Reconstructor (COLOR).\n\n Key ideas:\n 1) strict leave-one-out reconstruction: timestamp t is reconstructed without using x_t\n 2) consensus fusion across multiple blind-spot scales\n 3) prototype memory projection to bias reconstruction toward normal patterns\n \"\"\"\n\n def __init__(self, configs):\n super(Model, self).__init__()\n self.task_name = configs.task_name\n self.seq_len = configs.seq_len\n self.pred_len = configs.seq_len # For anomaly detection, pred_len = seq_len\n self.enc_in = configs.enc_in\n self.c_out = configs.c_out\n\n self.d_model = getattr(configs, 'd_model', 64)\n requested_heads = getattr(configs, 
'n_heads', 4)\n self.n_heads = self._valid_heads(self.d_model, requested_heads)\n self.d_ff = getattr(configs, 'd_ff', max(128, self.d_model * 2))\n self.e_layers = max(1, getattr(configs, 'e_layers', 2))\n self.dropout = getattr(configs, 'dropout', 0.1)\n\n self.trend_kernels = self._build_kernels([5, 11, 21])\n self.local_kernels = self._build_kernels([3, 7, 15])\n self.consensus_temperature = 0.2\n\n for kernel in self.trend_kernels:\n avg_weight = torch.ones(self.enc_in, 1, kernel) / float(kernel - 1)\n avg_weight[:, :, kernel // 2] = 0.0\n self.register_buffer(f'trend_avg_{kernel}', avg_weight)\n\n self.local_convs = nn.ModuleList(\n [MaskedDepthwiseConv1d(self.enc_in, kernel) for kernel in self.local_kernels]\n )\n\n trend_width = self.enc_in * (len(self.trend_kernels) + 1)\n local_width = self.enc_in * (len(self.local_kernels) + 1)\n\n self.trend_embed = nn.Sequential(\n nn.Linear(trend_width, self.d_model),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.LayerNorm(self.d_model),\n )\n self.trend_decoder = nn.Sequential(\n nn.Linear(self.d_model, self.d_ff),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.Linear(self.d_ff, self.c_out),\n )\n\n self.local_proj = nn.Conv1d(local_width, self.d_model, kernel_size=1)\n self.local_norm = nn.LayerNorm(self.d_model)\n self.kv_proj = nn.Linear(self.enc_in, self.d_model)\n self.input_dropout = nn.Dropout(self.dropout)\n\n self.pos_emb = nn.Parameter(torch.zeros(1, self.seq_len, self.d_model))\n nn.init.trunc_normal_(self.pos_emb, std=0.02)\n\n self.context_blocks = nn.ModuleList(\n [LeaveOneOutCrossBlock(self.d_model, self.n_heads, self.d_ff, self.dropout) for _ in range(self.e_layers)]\n )\n self.memory = PrototypeMemory(self.d_model, memory_size=32, dropout=self.dropout, temperature=0.2)\n\n self.context_proj = nn.Sequential(\n nn.Linear(3 * self.d_model, self.d_model),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.LayerNorm(self.d_model),\n )\n self.delta_decoder = nn.Sequential(\n nn.Linear(self.d_model, self.d_ff),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.Linear(self.d_ff, self.c_out),\n )\n self.gate_decoder = nn.Sequential(\n nn.Linear(3 * self.d_model, self.d_model),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.Linear(self.d_model, self.c_out),\n )\n\n nn.init.zeros_(self.delta_decoder[-1].weight)\n nn.init.zeros_(self.delta_decoder[-1].bias)\n nn.init.constant_(self.gate_decoder[-1].bias, -1.0)\n\n self.register_buffer('diag_mask', torch.eye(self.seq_len, dtype=torch.bool))\n\n def _valid_heads(self, d_model, requested_heads):\n requested_heads = max(1, min(d_model, requested_heads))\n for heads in range(requested_heads, 0, -1):\n if d_model % heads == 0:\n return heads\n return 1\n\n def _build_kernels(self, base_kernels):\n kernels = []\n max_kernel = self.seq_len if self.seq_len % 2 == 1 else self.seq_len - 1\n max_kernel = max(3, max_kernel)\n for kernel in base_kernels:\n kernel = min(kernel, max_kernel)\n if kernel % 2 == 0:\n kernel -= 1\n kernel = max(3, kernel)\n if kernel not in kernels:\n kernels.append(kernel)\n return kernels\n\n def _consensus_fuse(self, tensors):\n # tensors: list of [B, C, L]\n stack = torch.stack(tensors, dim=-1) # [B, C, L, S]\n consensus = stack.mean(dim=-1, keepdim=True)\n disagreement = (stack - consensus).pow(2).mean(dim=(1, 2))\n weights = torch.softmax(-disagreement / self.consensus_temperature, dim=-1)\n fused = torch.sum(stack * weights[:, None, None, :], dim=-1)\n return fused\n\n def _exclude_center_average(self, x_cf, kernel):\n # x_cf: [B, C, L]\n pad = kernel // 2\n weight = 
getattr(self, f'trend_avg_{kernel}').to(dtype=x_cf.dtype)\n x_pad = F.pad(x_cf, (pad, pad), mode='replicate')\n return F.conv1d(x_pad, weight, groups=self.enc_in)\n\n def _denorm(self, x, means, stdev):\n if means.size(-1) == x.size(-1):\n return x * stdev + means\n # Fallback for unexpected c_out != enc_in settings.\n if means.size(-1) > x.size(-1):\n return x * stdev[..., :x.size(-1)] + means[..., :x.size(-1)]\n mean = means.mean(dim=-1, keepdim=True).expand_as(x)\n std = stdev.mean(dim=-1, keepdim=True).expand_as(x)\n return x * std + mean\n\n def anomaly_detection(self, x_enc):\n \"\"\"\n Anomaly detection: reconstruct the input sequence.\n Input: x_enc [batch, seq_len, enc_in]\n Output: [batch, seq_len, c_out]\n \"\"\"\n means = x_enc.mean(dim=1, keepdim=True).detach()\n x = x_enc - means\n stdev = torch.sqrt(torch.var(x, dim=1, keepdim=True, unbiased=False) + 1e-5).detach()\n x = x / stdev\n\n bsz, length, _ = x.shape\n pos = self.pos_emb[:, :length, :]\n diag_mask = self.diag_mask[:length, :length].to(x.device)\n\n # ----- center-excluding multi-scale trend branch -----\n x_cf = x.transpose(1, 2) # [B, C, L]\n trend_candidates = [self._exclude_center_average(x_cf, kernel) for kernel in self.trend_kernels]\n trend_fused = self._consensus_fuse(trend_candidates)\n trend_features = torch.cat(trend_candidates + [trend_fused], dim=1).transpose(1, 2)\n trend_hidden = self.trend_embed(trend_features)\n trend_pred = self.trend_decoder(trend_hidden)\n\n # ----- learnable local blind-spot context -----\n local_candidates = [conv(x_cf) for conv in self.local_\n... (truncated)",
5 "after_line": 41,
6 "start_line": 1,
7 "end_line": 41
8}
Tool Result (Step 1)
OK: Replaced lines 1..41 with 284 line(s) in Time-Series-Library/models/Custom.py. Editable range: entire file.
[Current file: Time-Series-Library/models/Custom.py | editable: 1–284 | total: 284 lines]
1 import math
2 import torch
3 import torch.nn as nn
4 import torch.nn.functional as F
5
6
7 class MaskedDepthwiseConv1d(nn.Module):
8 """Depthwise convolution with the center weight removed (blind-spot conv)."""
9
10 def __init__(self, channels, kernel_size):
11 super().__init__()
12 if kernel_size % 2 == 0:
13 raise ValueError("kernel_size must be odd for blind-spot convolution")
14 self.channels = channels
15 self.kernel_size = kernel_size
16 self.weight = nn.Parameter(torch.empty(channels, 1, kernel_size))
17 self.bias = nn.Parameter(torch.zeros(channels))
18
19 mask = torch.ones(1, 1, kernel_size)
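The preview cuts off mid-class, but the key property of the blind-spot convolution is already visible: the center kernel tap is zeroed, so the output at timestamp t never sees x_t. A standalone check of that property (a sketch, independent of the class above):

```python
import torch
import torch.nn.functional as F

# Verify the blind-spot property: perturbing x at one timestamp must not
# change the output at that same timestamp, since the center tap is masked.
channels, kernel_size = 8, 7
weight = torch.randn(channels, 1, kernel_size)   # depthwise filters
mask = torch.ones(1, 1, kernel_size)
mask[..., kernel_size // 2] = 0.0                # zero the center tap

def blind_conv(x):
    pad = kernel_size // 2
    x = F.pad(x, (pad, pad), mode='replicate')
    return F.conv1d(x, weight * mask, groups=channels)

x = torch.randn(1, channels, 100)
x2 = x.clone()
x2[:, :, 50] += 10.0                             # perturb timestamp 50
assert torch.allclose(blind_conv(x)[:, :, 50], blind_conv(x2)[:, :, 50])
```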
Assistant (Step 2): test
Tool Result (Step 2)
[Test #1] (2 tests remaining; call submit(n=1) to submit this result)
## Seed 42
### PSM (scripts/psm.sh)
Training metrics (last epochs):
TRAIN_METRICS epoch=1 train_loss=0.0457638 vali_loss=0.0492263 test_loss=0.0506896
TRAIN_METRICS epoch=2 train_loss=0.0362675 vali_loss=0.0432038 test_loss=0.0407345
TRAIN_METRICS epoch=3 train_loss=0.0314626 vali_loss=0.0360171 test_loss=0.0349524
Test results (PSM):
F-score: 0.9679, Precision: 0.9859, Recall: 0.9506, Accuracy: 0.9825
### MSL (scripts/msl.sh)
Training metrics (last epochs):
TRAIN_METRICS epoch=1 train_loss=0.2824120 vali_loss=0.2324335 test_loss=1.4687328
TRAIN_METRICS epoch=2 train_loss=0.1555498 vali_loss=0.1592765 test_loss=1.4163752
TRAIN_METRICS epoch=3 train_loss=0.1103255 vali_loss=0.1044642 test_loss=1.4349834
Test results (MSL):
F-score: 0.8032, Precision: 0.8617, Recall: 0.7522, Accuracy: 0.9611
## Seed 123
### PSM (scripts/psm.sh)
Training metrics (last epochs):
TRAIN_METRICS epoch=1 train_loss=0.0448808 vali_l
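As a quick sanity check on the reported metrics, the F-score is the harmonic mean of precision and recall, and the PSM numbers above are self-consistent:

```python
# F = 2PR / (P + R); PSM seed-42 above reports P = 0.9859, R = 0.9506.
p, r = 0.9859, 0.9506
print(round(2 * p * r / (p + r), 4))  # 0.9679, matching the reported F-score
```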
Assistant (Step 3): edit
Tool Input
1{
2 "op": "replace",
3 "filename": "Time-Series-Library/models/Custom.py",
4 "content": "import math\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\nclass MaskedDepthwiseConv1d(nn.Module):\n \"\"\"Depthwise convolution with the center weight removed (blind-spot conv).\"\"\"\n\n def __init__(self, channels, kernel_size):\n super().__init__()\n if kernel_size % 2 == 0:\n raise ValueError(\"kernel_size must be odd for blind-spot convolution\")\n self.channels = channels\n self.kernel_size = kernel_size\n self.weight = nn.Parameter(torch.empty(channels, 1, kernel_size))\n self.bias = nn.Parameter(torch.zeros(channels))\n\n mask = torch.ones(1, 1, kernel_size)\n mask[..., kernel_size // 2] = 0.0\n self.register_buffer('mask', mask)\n nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))\n\n def forward(self, x):\n # x: [B, C, L]\n pad = self.kernel_size // 2\n x = F.pad(x, (pad, pad), mode='replicate')\n weight = self.weight * self.mask\n return F.conv1d(x, weight, bias=self.bias, groups=self.channels)\n\n\nclass LeaveOneOutCrossBlock(nn.Module):\n \"\"\"Query attends to all timestamps except itself via a diagonal mask.\"\"\"\n\n def __init__(self, d_model, n_heads, d_ff, dropout):\n super().__init__()\n self.q_norm = nn.LayerNorm(d_model)\n self.kv_norm = nn.LayerNorm(d_model)\n self.attn = nn.MultiheadAttention(\n embed_dim=d_model,\n num_heads=n_heads,\n dropout=dropout,\n batch_first=True,\n )\n self.dropout = nn.Dropout(dropout)\n self.ff_norm = nn.LayerNorm(d_model)\n self.ff = nn.Sequential(\n nn.Linear(d_model, d_ff),\n nn.GELU(),\n nn.Dropout(dropout),\n nn.Linear(d_ff, d_model),\n nn.Dropout(dropout),\n )\n\n def forward(self, q, kv, attn_mask):\n q_norm = self.q_norm(q)\n kv_norm = self.kv_norm(kv)\n attn_out, _ = self.attn(\n q_norm,\n kv_norm,\n kv_norm,\n attn_mask=attn_mask,\n need_weights=False,\n )\n q = q + self.dropout(attn_out)\n q = q + self.ff(self.ff_norm(q))\n return q\n\n\nclass PrototypeMemory(nn.Module):\n \"\"\"Sharpened prototype projection to keep reconstructions on a normal-pattern manifold.\"\"\"\n\n def __init__(self, d_model, memory_size=32, dropout=0.1, temperature=0.2):\n super().__init__()\n self.temperature = temperature\n self.memory = nn.Parameter(torch.randn(memory_size, d_model) / math.sqrt(d_model))\n self.post = nn.Sequential(\n nn.Linear(2 * d_model, d_model),\n nn.GELU(),\n nn.Dropout(dropout),\n nn.Linear(d_model, d_model),\n )\n\n def forward(self, x):\n # x: [B, L, D]\n x_norm = F.normalize(x, dim=-1)\n mem_norm = F.normalize(self.memory, dim=-1)\n scores = torch.matmul(x_norm, mem_norm.transpose(0, 1)) / self.temperature\n weights = torch.softmax(scores, dim=-1)\n # self-sharpening without adding an auxiliary loss\n weights = weights.pow(2)\n weights = weights / (weights.sum(dim=-1, keepdim=True) + 1e-6)\n proto_raw = torch.matmul(weights, self.memory)\n proto_refined = self.post(torch.cat([x, proto_raw], dim=-1))\n return proto_raw, proto_refined\n\n\nclass Model(nn.Module):\n \"\"\"\n Consensus-Routed Leave-One-Out Reconstructor (COLOR).\n\n Key ideas:\n 1) strict leave-one-out reconstruction: timestamp t is reconstructed without using x_t\n 2) consensus fusion across multiple blind-spot scales\n 3) prototype memory projection to bias reconstruction toward normal patterns\n 4) memory-disagreement confidence routing: large context-prototype mismatch suppresses refinement\n \"\"\"\n\n def __init__(self, configs):\n super(Model, self).__init__()\n self.task_name = configs.task_name\n self.seq_len = configs.seq_len\n self.pred_len = configs.seq_len # For anomaly detection, pred_len = seq_len\n 
self.enc_in = configs.enc_in\n self.c_out = configs.c_out\n\n self.d_model = getattr(configs, 'd_model', 64)\n requested_heads = getattr(configs, 'n_heads', 4)\n self.n_heads = self._valid_heads(self.d_model, requested_heads)\n self.d_ff = getattr(configs, 'd_ff', max(128, self.d_model * 2))\n self.e_layers = max(1, getattr(configs, 'e_layers', 2))\n self.dropout = getattr(configs, 'dropout', 0.1)\n\n self.trend_kernels = self._build_kernels([5, 11, 21])\n self.local_kernels = self._build_kernels([3, 7, 15])\n self.consensus_temperature = 0.2\n\n for kernel in self.trend_kernels:\n avg_weight = torch.ones(self.enc_in, 1, kernel) / float(kernel - 1)\n avg_weight[:, :, kernel // 2] = 0.0\n self.register_buffer(f'trend_avg_{kernel}', avg_weight)\n\n self.local_convs = nn.ModuleList(\n [MaskedDepthwiseConv1d(self.enc_in, kernel) for kernel in self.local_kernels]\n )\n\n trend_width = self.enc_in * (len(self.trend_kernels) + 1)\n local_width = self.enc_in * (len(self.local_kernels) + 1)\n\n self.trend_embed = nn.Sequential(\n nn.Linear(trend_width, self.d_model),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.LayerNorm(self.d_model),\n )\n self.trend_decoder = nn.Sequential(\n nn.Linear(self.d_model, self.d_ff),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.Linear(self.d_ff, self.c_out),\n )\n\n self.local_proj = nn.Conv1d(local_width, self.d_model, kernel_size=1)\n self.local_norm = nn.LayerNorm(self.d_model)\n self.kv_proj = nn.Linear(self.enc_in, self.d_model)\n self.input_dropout = nn.Dropout(self.dropout)\n\n self.pos_emb = nn.Parameter(torch.zeros(1, self.seq_len, self.d_model))\n nn.init.trunc_normal_(self.pos_emb, std=0.02)\n\n self.context_blocks = nn.ModuleList(\n [LeaveOneOutCrossBlock(self.d_model, self.n_heads, self.d_ff, self.dropout) for _ in range(self.e_layers)]\n )\n self.memory = PrototypeMemory(self.d_model, memory_size=32, dropout=self.dropout, temperature=0.2)\n\n self.context_proj = nn.Sequential(\n nn.Linear(3 * self.d_model, self.d_model),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.LayerNorm(self.d_model),\n )\n self.delta_decoder = nn.Sequential(\n nn.Linear(self.d_model, self.d_ff),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.Linear(self.d_ff, self.c_out),\n )\n self.gate_decoder = nn.Sequential(\n nn.Linear(3 * self.d_model, self.d_model),\n nn.GELU(),\n nn.Dropout(self.dropout),\n nn.Linear(self.d_model, self.c_out),\n )\n\n nn.init.constant_(self.gate_decoder[-1].bias, -0.5)\n\n self.register_buffer('diag_mask', torch.eye(self.seq_len, dtype=torch.bool))\n\n def _valid_heads(self, d_model, requested_heads):\n requested_heads = max(1, min(d_model, requested_heads))\n for heads in range(requested_heads, 0, -1):\n if d_model % heads == 0:\n return heads\n return 1\n\n def _build_kernels(self, base_kernels):\n kernels = []\n max_kernel = self.seq_len if self.seq_len % 2 == 1 else self.seq_len - 1\n max_kernel = max(3, max_kernel)\n for kernel in base_kernels:\n kernel = min(kernel, max_kernel)\n if kernel % 2 == 0:\n kernel -= 1\n kernel = max(3, kernel)\n if kernel not in kernels:\n kernels.append(kernel)\n return kernels\n\n def _consensus_fuse(self, tensors):\n # tensors: list of [B, C, L]\n stack = torch.stack(tensors, dim=-1) # [B, C, L, S]\n consensus = stack.mean(dim=-1, keepdim=True)\n disagreement = (stack - consensus).pow(2).mean(dim=(1, 2))\n weights = torch.softmax(-disagreement / self.consensus_temperature, dim=-1)\n fused = torch.sum(stack * weights[:, None, None, :], dim=-1)\n return fused\n\n def _exclude_center_average(self, x_cf, kernel):\n # 
x_cf: [B, C, L]\n pad = kernel // 2\n weight = getattr(self, f'trend_avg_{kernel}').to(dtype=x_cf.dtype)\n x_pad = F.pad(x_cf, (pad, pad), mode='replicate')\n return F.conv1d(x_pad, weight, groups=self.enc_in)\n\n def _denorm(self, x, means, stdev):\n if means.size(-1) == x.size(-1):\n return x * stdev + means\n # Fallback for unexpected c_out != enc_in settings.\n if means.size(-1) > x.size(-1):\n return x * stdev[..., :x.size(-1)] + means[..., :x.size(-1)]\n mean = means.mean(dim=-1, keepdim=True).expand_as(x)\n std = stdev.mean(dim=-1, keepdim=True).expand_as(x)\n return x * std + mean\n\n def anomaly_detection(self, x_enc):\n \"\"\"\n Anomaly detection: reconstruct the input sequence.\n Input: x_enc [batch, seq_len, enc_in]\n Output: [batch, seq_len, c_out]\n \"\"\"\n means = x_enc.mean(dim=1, keepdim=True).detach()\n x = x_enc - means\n stdev = torch.sqrt(torch.var(x, dim=1, keepdim=True, unbiased=False) + 1e-5).detach()\n x = x / stdev\n\n _, length, _ = x.shape\n pos = self.pos_emb[:, :length, :]\n diag_mask = self.diag_mask[:length, :length].to(x.device)\n\n # ----- center-excluding multi-scale trend branch -----\n x_cf = x.transpose(1, 2) # [B, C, L]\n trend_candidates = [self._exclude_center_average(x_cf, kernel) for kernel in self.trend_kernels]\n trend_fused = self._consensus_fuse(trend_candidates)\n trend_features = torch.cat(trend_candidates + [trend_fused], dim=1).transpose(1, 2)\n trend_hidden = self.trend_embed(trend_features)\n trend_pred = self.trend_decoder(trend_hidden)\n\n # ----- learnable local blind-spot context -----\n loc\n... (truncated)",
5 "after_line": 284,
6 "start_line": 1,
7 "end_line": 284
8}
Tool Result (Step 3)
OK: Replaced lines 1..284 with 286 line(s) in Time-Series-Library/models/Custom.py. Editable range: entire file.
[Current file: Time-Series-Library/models/Custom.py | editable: 1–286 | total: 286 lines]
1 import math
2 import torch
3 import torch.nn as nn
4 import torch.nn.functional as F
5
6
7 class MaskedDepthwiseConv1d(nn.Module):
8 """Depthwise convolution with the center weight removed (blind-spot conv)."""
9
10 def __init__(self, channels, kernel_size):
11 super().__init__()
12 if kernel_size % 2 == 0:
13 raise ValueError("kernel_size must be odd for blind-spot convolution")
14 self.channels = channels
15 self.kernel_size = kernel_size
16 self.weight = nn.Parameter(torch.empty(channels, 1, kernel_size))
17 self.bias = nn.Parameter(torch.zeros(channels))
18
19 mask = torch.ones(1, 1, kernel_size)
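Alongside the blind-spot convolutions, the `LeaveOneOutCrossBlock` defined in the tool input enforces the same constraint in attention via a boolean diagonal mask. A minimal sketch of that mechanism with `nn.MultiheadAttention` (shapes assumed):

```python
import torch
import torch.nn as nn

# Leave-one-out attention: a True entry in attn_mask disallows attention,
# so masking the diagonal stops each timestamp from attending to itself.
L, D = 100, 64
attn = nn.MultiheadAttention(embed_dim=D, num_heads=4, batch_first=True)
x = torch.randn(2, L, D)                     # [batch, seq_len, d_model]
diag_mask = torch.eye(L, dtype=torch.bool)   # block each self position
out, _ = attn(x, x, x, attn_mask=diag_mask, need_weights=False)
print(out.shape)                             # torch.Size([2, 100, 64])
```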
Assistant (Step 4): test
Tool Result (Step 4)
[Test #2] (1 test remaining; call submit(n=2) to submit this result)
### PSM (scripts/psm.sh)
Training metrics (last epochs):
TRAIN_METRICS epoch=1 train_loss=0.0461480 vali_loss=0.0482353 test_loss=0.0476402
TRAIN_METRICS epoch=2 train_loss=0.0351643 vali_loss=0.0398742 test_loss=0.0378076
TRAIN_METRICS epoch=3 train_loss=0.0285561 vali_loss=0.0344858 test_loss=0.0366046
Test results (PSM):
F-score: 0.9682, Precision: 0.9840, Recall: 0.9528, Accuracy: 0.9826
### MSL (scripts/msl.sh)
Training metrics (last epochs):
TRAIN_METRICS epoch=1 train_loss=0.2794474 vali_loss=0.2029450 test_loss=1.2659231
TRAIN_METRICS epoch=2 train_loss=0.1415169 vali_loss=0.1111713 test_loss=1.3518597
TRAIN_METRICS epoch=3 train_loss=0.0923944 vali_loss=0.0741441 test_loss=1.3613930
Test results (MSL):
F-score: 0.7900, Precision: 0.8551, Recall: 0.7340, Accuracy: 0.9588
[Leaderboard] Results saved: {'f_score_PSM': 0.9682, 'precision_PSM': 0.984, 'recall_PSM': 0.9528, 'elapsed_PSM': 663, 'f_score_MSL':
Assistant (Step 5): submit
Tool Input
1{
2 "n": 1
3}
Tool Result (Step 5)
[submit] Submitting result from test #1 as final.
## Seed 42
### PSM (scripts/psm.sh)
Training metrics (last epochs):
TRAIN_METRICS epoch=1 train_loss=0.0457638 vali_loss=0.0492263 test_loss=0.0506896
TRAIN_METRICS epoch=2 train_loss=0.0362675 vali_loss=0.0432038 test_loss=0.0407345
TRAIN_METRICS epoch=3 train_loss=0.0314626 vali_loss=0.0360171 test_loss=0.0349524
Test results (PSM):
F-score: 0.9679, Precision: 0.9859, Recall: 0.9506, Accuracy: 0.9825
### MSL (scripts/msl.sh)
Training metrics (last epochs):
TRAIN_METRICS epoch=1 train_loss=0.2824120 vali_loss=0.2324335 test_loss=1.4687328
TRAIN_METRICS epoch=2 train_loss=0.1555498 vali_loss=0.1592765 test_loss=1.4163752
TRAIN_METRICS epoch=3 train_loss=0.1103255 vali_loss=0.1044642 test_loss=1.4349834
Test results (MSL):
F-score: 0.8032, Precision: 0.8617, Recall: 0.7522, Accuracy: 0.9611
### SMAP (scripts/smap.sh)
Training metrics (last epochs):
TRAIN_METRICS epoch=1 train_loss=0.1714113 vali_loss=0.0400381 test_loss=0.542471