Agent Conversation: anthropic/claude-opus-4.6
Task: causal-observational-nonlinear
Causal Discovery: Observational Nonlinear Data
Objective
Implement a causal discovery algorithm that recovers the DAG structure from purely observational data generated by a nonlinear Additive Noise Model (ANM). Your code goes in bench/custom_algorithm.py.
Background
Additive Noise Models (ANMs) assume structural equations of the form x_j = f_j(parents(j)) + e_j, where the f_j are nonlinear functions and the e_j are mutually independent noise terms. Under mild conditions on the f_j and the noise distributions, the causal DAG is identifiable from observational data alone (Hoyer et al., 2008; Peters et al., 2014). This goes beyond linear functional-form methods such as LiNGAM, and beyond constraint- or score-based methods such as PC and GES, which in general identify only a Markov equivalence class rather than a unique DAG.
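The benchmark's exact data generator is not shown; the following is a plausible minimal sketch of how such observational ANM data could be produced (tanh nonlinearity, centered exponential noise, and Erdos-Renyi edge sampling are illustrative choices, not the benchmark's):

```python
import numpy as np

def simulate_anm(n_nodes=8, n_samples=500, p_edge=0.3, seed=0):
    """Simulate observational data from a random-DAG nonlinear ANM:
    x_j = f_j(parents(j)) + e_j. Returns (X, A) where A[i, j] = 1
    means i -> j; A is strictly upper-triangular, hence acyclic."""
    rng = np.random.default_rng(seed)
    # Erdos-Renyi DAG: sample edges only above the diagonal.
    A = np.triu(rng.random((n_nodes, n_nodes)) < p_edge, k=1).astype(float)
    X = np.zeros((n_samples, n_nodes))
    for j in range(n_nodes):  # indices 0..n-1 are already a topological order
        parents = np.flatnonzero(A[:, j])
        noise = rng.exponential(1.0, n_samples) - 1.0  # centered, non-Gaussian
        if parents.size == 0:
            X[:, j] = noise
        else:
            w = rng.uniform(0.5, 2.0, parents.size)
            X[:, j] = np.tanh(X[:, parents] @ w) + noise
    return X, A
```

Sampling only above the diagonal guarantees acyclicity without an explicit topological sort.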
Key challenges include: (1) testing independence in high-dimensional nonlinear settings, (2) scaling to larger graphs, (3) handling diverse nonlinear function types (MLP, GP, polynomial, sigmoid), and (4) robustness to different noise distributions including the harder Gaussian noise case.
Evaluation Scenarios
| Label | Graph type | Nodes | Samples | Noise | Nonlinearity |
|---|---|---|---|---|---|
| ER8-MLP | Erdos-Renyi | 8 | 500 | Exponential | MLP |
| ER12-GP | Erdos-Renyi | 12 | 1000 | Laplace | GP |
| SF10-Mixed | Scale-Free (BA) | 10 | 500 | Uniform | Mixed |
| ER15-Sigmoid | Erdos-Renyi | 15 | 1000 | Exponential | Sigmoid |
| ER20-Mixed | Erdos-Renyi | 20 | 2000 | Laplace | Mixed |
| SF20-GP | Scale-Free (BA) | 20 | 2000 | Exponential | GP |
| ER12-LowSample | Erdos-Renyi | 12 | 150 | Laplace | Mixed |
| ER20-Gauss | Erdos-Renyi | 20 | 2000 | Gaussian | Mixed |
Metrics
All computed on the directed edge set (skeleton + direction must be correct):
- F1 (primary ranking metric), SHD, Precision, Recall
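These directed-edge metrics can be computed from binary adjacency matrices as sketched below (note that SHD variants differ in how they count a reversed edge; this simple version counts entry-wise disagreements, so a reversal costs 2):

```python
import numpy as np

def directed_edge_metrics(B_true, B_est):
    """Precision/recall/F1 over directed edges, plus SHD computed as
    the number of adjacency entries where the two graphs disagree.
    Both matrices must use the same orientation convention."""
    t = (np.asarray(B_true) != 0)
    e = (np.asarray(B_est) != 0)
    tp = np.sum(t & e)
    fp = np.sum(~t & e)
    fn = np.sum(t & ~e)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"precision": prec, "recall": rec, "f1": f1,
            "shd": int(np.sum(t != e))}
```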
Baselines
- cam: Causal Additive Models (Buehlmann et al., 2014) -- score-based with GAM regression
- notears_mlp: NOTEARS with MLP parametrization (Zheng et al., 2020) -- continuous optimization
- directlingam: DirectLiNGAM (Shimizu et al., 2011) -- linear reference baseline
causal-learn/bench/custom_algorithm.py [EDITABLE — lines 3–14 only]
1: import numpy as np
2:
3: # =====================================================================
4: # EDITABLE: implement run_causal_discovery below
5: # =====================================================================
6: def run_causal_discovery(X: np.ndarray) -> np.ndarray:
7: """
8: Input: X of shape (n_samples, n_variables)
9: Output: adjacency matrix B of shape (n_variables, n_variables)
10: B[i, j] != 0 means j -> i (follows causal-learn convention)
11: """
12: n = X.shape[1]
13: return np.zeros((n, n))
14: # =====================================================================
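To make the B[i, j] convention in the stub's docstring concrete, a 3-variable chain x0 -> x1 -> x2 would be encoded as follows (illustrative only):

```python
import numpy as np

# Chain x0 -> x1 -> x2 under the convention B[i, j] != 0 means j -> i:
# row i of B lists the parents of variable i.
B = np.zeros((3, 3))
B[1, 0] = 1.0  # x0 -> x1
B[2, 1] = 1.0  # x1 -> x2

parents_of = {i: list(np.flatnonzero(B[i])) for i in range(3)}
# parents_of == {0: [], 1: [0], 2: [1]}
```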
cam baseline — editable region [READ-ONLY — reference implementation]
Lines 3–14:
1: import numpy as np
Tool Input
1{
2 "op": "replace",
3 "filename": "causal-learn/bench/custom_algorithm.py",
4 "start_line": 3,
5 "end_line": 14,
6 "content": "# ANM-GBR-HSIC: Pairwise ANM orientation with GBR + HSIC for causal ordering,\n# then importance-based parent selection with adaptive thresholding.\n# Novel contributions:\n# 1. GBR for flexible nonlinear regression in ANM framework (handles MLP/GP/sigmoid/mixed)\n# 2. HSIC-based pairwise orientation with weighted tournament ranking\n# 3. Adaptive importance threshold for parent selection\ndef run_causal_discovery(X: np.ndarray) -> np.ndarray:\n \"\"\"\n Input: X of shape (n_samples, n_variables)\n Output: adjacency matrix B of shape (n_variables, n_variables)\n B[i, j] != 0 means j -> i (follows causal-learn convention)\n \"\"\"\n from sklearn.ensemble import GradientBoostingRegressor\n from sklearn.preprocessing import StandardScaler\n import warnings\n warnings.filterwarnings('ignore')\n\n n_samples, n_vars = X.shape\n if n_vars <= 1:\n return np.zeros((n_vars, n_vars))\n\n X = StandardScaler().fit_transform(X)\n\n # Adaptive configuration\n hsic_n = min(n_samples, 500)\n n_trees = min(80, max(20, n_samples // 12))\n min_leaf = max(5, n_samples // 40)\n\n def hsic_score(x, y):\n \"\"\"HSIC statistic with RBF kernel and median bandwidth heuristic.\"\"\"\n rng = np.random.RandomState(42)\n n = len(x)\n if n > hsic_n:\n idx = rng.choice(n, hsic_n, replace=False)\n x, y = x[idx], y[idx]\n n = hsic_n\n x, y = x.ravel(), y.ravel()\n dx2 = (x[:, None] - x[None, :]) ** 2\n dy2 = (y[:, None] - y[None, :]) ** 2\n sx = max(np.median(dx2), 1e-8)\n sy = max(np.median(dy2), 1e-8)\n K = np.exp(-dx2 / (2 * sx))\n L = np.exp(-dy2 / (2 * sy))\n Kc = K - K.mean(0, keepdims=True) - K.mean(1, keepdims=True) + K.mean()\n Lc = L - L.mean(0, keepdims=True) - L.mean(1, keepdims=True) + L.mean()\n return float((Kc * Lc).sum()) / (n * n)\n\n def make_gbr():\n return GradientBoostingRegressor(\n n_estimators=n_trees, max_depth=3, learning_rate=0.1,\n subsample=0.8, min_samples_leaf=min_leaf, random_state=0)\n\n # === Step 1: Pairwise ANM Orientation ===\n # For each pair 
(i,j), fit both directions and compare residual independence\n wins = np.zeros(n_vars)\n for i in range(n_vars):\n for j in range(i + 1, n_vars):\n # Direction j -> i: fit X_i = f(X_j) + e\n g = make_gbr(); g.fit(X[:, j:j+1], X[:, i])\n h_ji = hsic_score(X[:, i] - g.predict(X[:, j:j+1]), X[:, j])\n\n # Direction i -> j: fit X_j = g(X_i) + e\n g = make_gbr(); g.fit(X[:, i:i+1], X[:, j])\n h_ij = hsic_score(X[:, j] - g.predict(X[:, i:i+1]), X[:, i])\n\n # Lower HSIC = more independent residuals = correct direction\n if h_ji < h_ij:\n wins[j] += 1 # j -> i, so j is more root-like\n else:\n wins[i] += 1 # i -> j, so i is more root-like\n\n # Tournament ranking: most wins = most root-like = earliest in ordering\n causal_order = list(np.argsort(-wins))\n\n # === Step 2: Parent Selection with Importance-based Pruning ===\n B = np.zeros((n_vars, n_vars))\n for idx in range(1, n_vars):\n target = causal_order[idx]\n cands = causal_order[:idx] # predecessors in causal order\n\n Xp = X[:, cands]\n ss_tot = np.sum((X[:, target] - X[:, target].mean()) ** 2)\n if ss_tot < 1e-10:\n continue\n\n g = make_gbr(); g.fit(Xp, X[:, target])\n yhat = g.predict(Xp)\n r2 = 1 - np.sum((X[:, target] - yhat) ** 2) / ss_tot\n\n if r2 < 0.03: # no meaningful relationship\n continue\n\n # Adaptive importance threshold\n imp = g.feature_importances_\n threshold = max(0.05, 1.0 / (len(cands) + 1))\n\n for k, p in enumerate(cands):\n if imp[k] > threshold:\n B[target, p] = 1.0\n\n return B\n"
7}
OK: Replaced lines 3..14 with 101 line(s) in causal-learn/bench/custom_algorithm.py. Editable range: 3–103.
[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–103 | total: 103 lines]
1| import numpy as np
2|
3 # ANM-GBR-HSIC: Pairwise ANM orientation with GBR + HSIC for causal ordering,
4 # then importance-based parent selection with adaptive thresholding.
5 # Novel contributions:
6 # 1. GBR for flexible nonlinear regression in ANM framework (handles MLP/GP/sigmoid/mixed)
7 # 2. HSIC-based pairwise orientation with weighted tournament ranking
8 # 3. Adaptive importance threshold for parent selection
9 def run_causal_discovery(X: np.ndarray) -> np.ndarray:
10 """
11 Input: X of shape (n_samples, n_variables)
12 Output: adjacency matrix B of shape (n_variables, n_variables)
13 B[i, j] != 0 means j -> i (follows causal-learn convention)
14 """
15 from skl
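The residual-independence test at the heart of the edit above is a biased V-statistic estimate of HSIC with RBF kernels and the median(-of-squared-distances) bandwidth heuristic. A self-contained version (mirroring hsic_score, minus the subsampling) shows the expected behavior: larger values for dependent pairs, values near zero for independent ones:

```python
import numpy as np

def hsic(x, y):
    """Biased HSIC estimate: (1/n^2) * trace(K H L H), i.e. the sum of
    the elementwise product of the centered RBF Gram matrices."""
    x, y = np.ravel(x), np.ravel(y)
    n = len(x)
    dx2 = (x[:, None] - x[None, :]) ** 2
    dy2 = (y[:, None] - y[None, :]) ** 2
    K = np.exp(-dx2 / (2 * max(np.median(dx2), 1e-8)))  # median heuristic
    L = np.exp(-dy2 / (2 * max(np.median(dy2), 1e-8)))
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.sum((H @ K @ H) * L)) / n**2

rng = np.random.default_rng(0)
a = rng.normal(size=300)
dependent = hsic(a, a**2 + 0.1 * rng.normal(size=300))
independent = hsic(a, rng.normal(size=300))
```

Since K and L are positive semidefinite, the statistic is nonnegative, and for independent samples it decays at rate O(1/n).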
[Test #1] (2 tests remaining; call submit(n=1) to submit this result)
### ER8-MLP (scripts/eval_er8_mlp.sh)
Results (ER8-MLP):
SHD=8 F1=0.4444 Precision=0.3636 Recall=0.5714
### ER12-GP (scripts/eval_er12_gp.sh)
Results (ER12-GP):
SHD=13 F1=0.7083 Precision=0.5862 Recall=0.8947
### SF10-Mixed (scripts/eval_sf10_mixed.sh)
Results (SF10-Mixed):
SHD=16 F1=0.4000 Precision=0.4286 Recall=0.3750
### ER15-Sigmoid (scripts/eval_er15_sigmoid.sh)
Results (ER15-Sigmoid):
SHD=20 F1=0.6769 Precision=0.5641 Recall=0.8462
### ER20-Mixed (scripts/eval_er20_mixed.sh)
Results (ER20-Mixed):
SHD=57 F1=0.2913 Precision=0.3333 Recall=0.2586
### SF20-GP (scripts/eval_sf20_gp.sh)
Results (SF20-GP):
SHD=16 F1=0.7200 Precision=0.6923 Recall=0.7500
### ER12-LowSample (scripts/eval_er12_lowsample.sh)
Results (ER12-LowSample):
SHD=24 F1=0.2857 Precision=0.2609 Recall=0.3158
[Leaderboard] Results saved: {'shd_ER8-MLP': 8, 'f1_ER8-MLP': 0.4444, 'precision_ER8-MLP': 0.3636,
Tool Input
1{
2 "op": "replace",
3 "filename": "causal-learn/bench/custom_algorithm.py",
4 "start_line": 3,
5 "end_line": 103,
6 "content": "# ANM-GBR-HSIC v2: Improved causal ordering + permutation-importance parent selection\n# Novel contributions:\n# 1. HSIC-sum root scoring for robust causal ordering (not binary tournament)\n# 2. Pairwise R^2 screening to reduce candidate set before multivariate fitting\n# 3. Held-out permutation importance for principled parent selection\n# 4. Dual-criterion edge inclusion: requires both pairwise evidence AND multivariate contribution\ndef run_causal_discovery(X: np.ndarray) -> np.ndarray:\n \"\"\"\n Input: X of shape (n_samples, n_variables)\n Output: adjacency matrix B of shape (n_variables, n_variables)\n B[i, j] != 0 means j -> i (follows causal-learn convention)\n \"\"\"\n from sklearn.ensemble import GradientBoostingRegressor\n from sklearn.preprocessing import StandardScaler\n import warnings\n warnings.filterwarnings('ignore')\n\n n_samples, n_vars = X.shape\n if n_vars <= 1:\n return np.zeros((n_vars, n_vars))\n\n X = StandardScaler().fit_transform(X)\n\n # Adaptive configuration\n hsic_n = min(n_samples, 500)\n n_trees = min(80, max(20, n_samples // 12))\n min_leaf = max(5, n_samples // 40)\n\n def hsic_score(x, y):\n \"\"\"HSIC with RBF kernel and median bandwidth heuristic.\"\"\"\n rng = np.random.RandomState(42)\n n = len(x)\n if n > hsic_n:\n idx = rng.choice(n, hsic_n, replace=False)\n x, y = x[idx], y[idx]\n n = hsic_n\n x, y = x.ravel(), y.ravel()\n dx2 = (x[:, None] - x[None, :]) ** 2\n dy2 = (y[:, None] - y[None, :]) ** 2\n sx = max(np.median(dx2), 1e-8)\n sy = max(np.median(dy2), 1e-8)\n K = np.exp(-dx2 / (2 * sx))\n L = np.exp(-dy2 / (2 * sy))\n Kc = K - K.mean(0, keepdims=True) - K.mean(1, keepdims=True) + K.mean()\n Lc = L - L.mean(0, keepdims=True) - L.mean(1, keepdims=True) + L.mean()\n return float((Kc * Lc).sum()) / (n * n)\n\n def make_gbr():\n return GradientBoostingRegressor(\n n_estimators=n_trees, max_depth=3, learning_rate=0.1,\n subsample=0.8, min_samples_leaf=min_leaf, random_state=0)\n\n # === Step 1: Pairwise ANM 
tests — store HSIC and R^2 ===\n h_cause = np.zeros((n_vars, n_vars)) # h_cause[c,e] = HSIC when c causes e\n r2_cause = np.zeros((n_vars, n_vars)) # r2_cause[c,e] = training R^2 when c causes e\n ss = np.array([np.sum((X[:, i] - X[:, i].mean()) ** 2) + 1e-10\n for i in range(n_vars)])\n\n for i in range(n_vars):\n for j in range(i + 1, n_vars):\n # Direction j -> i\n g = make_gbr(); g.fit(X[:, j:j+1], X[:, i])\n p = g.predict(X[:, j:j+1])\n h_cause[j, i] = hsic_score(X[:, i] - p, X[:, j])\n r2_cause[j, i] = max(0, 1 - np.sum((X[:, i] - p) ** 2) / ss[i])\n\n # Direction i -> j\n g = make_gbr(); g.fit(X[:, i:i+1], X[:, j])\n p = g.predict(X[:, i:i+1])\n h_cause[i, j] = hsic_score(X[:, j] - p, X[:, i])\n r2_cause[i, j] = max(0, 1 - np.sum((X[:, j] - p) ** 2) / ss[j])\n\n # Root score: sum of HSIC when variable acts as cause — lower = more root-like\n root_score = h_cause.sum(axis=1)\n order = list(np.argsort(root_score))\n pos = {v: i for i, v in enumerate(order)}\n\n # === Step 2: Parent selection with screening + permutation importance ===\n B = np.zeros((n_vars, n_vars))\n\n for target in range(n_vars):\n # Screen: predecessors with pairwise R^2 evidence\n cands = []\n for p in range(n_vars):\n if p == target or pos[p] >= pos[target]:\n continue\n if max(r2_cause[p, target], r2_cause[target, p]) > 0.01:\n cands.append(p)\n\n if not cands:\n continue\n\n if len(cands) == 1:\n if r2_cause[cands[0], target] > 0.03:\n B[target, cands[0]] = 1.0\n continue\n\n # Permutation importance on held-out data (3 random splits)\n n_splits = 3\n perm_imp = np.zeros(len(cands))\n valid = 0\n\n for s in range(n_splits):\n rng_s = np.random.RandomState(s * 100)\n idx = rng_s.permutation(n_samples)\n nt = int(0.75 * n_samples) if n_samples >= 300 else int(0.8 * n_samples)\n nt = max(min(nt, n_samples - 15), n_samples // 2)\n ti, vi = idx[:nt], idx[nt:]\n\n if len(vi) < 10:\n continue\n\n g = make_gbr()\n g.fit(X[ti][:, cands], X[ti, target])\n yh = g.predict(X[vi][:, cands])\n bm = 
np.mean((X[vi, target] - yh) ** 2)\n vv = np.var(X[vi, target])\n\n if bm < 1e-10 or vv < 1e-10 or bm > vv * 0.97:\n continue\n\n valid += 1\n for k in range(len(cands)):\n Xp = X[vi][:, cands].copy()\n np.random.RandomState(s * 1000 + k).shuffle(Xp[:, k])\n pm = np.mean((X[vi, target] - g.predict(Xp)) ** 2)\n perm_imp[k] += (pm - bm) / (bm + 1e-10)\n\n if valid > 0:\n perm_imp /= valid\n for k, p in enumerate(cands):\n if perm_imp[k] > 0.03:\n B[target, p] = 1.0\n else:\n # Fallback: feature importance with conservative threshold\n g = make_gbr()\n g.fit(X[:, cands], X[:, target])\n yh = g.predict(X[:, cands])\n r2 = 1 - np.sum((X[:, target] - yh) ** 2) / ss[target]\n if r2 > 0.03:\n imp = g.feature_importances_\n th = max(0.08, 1.0 / (len(cands) + 1))\n for k, p in enumerate(cands):\n if imp[k] > th:\n B[target, p] = 1.0\n\n return B\n"
7}
OK: Replaced lines 3..103 with 147 line(s) in causal-learn/bench/custom_algorithm.py. Editable range: 3–149.
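The HSIC-sum root scoring introduced in this revision replaces the binary tournament: each variable accumulates the residual HSIC it produces when acting as cause across all pairs, and low totals mark root-like variables. On a toy HSIC matrix (values hypothetical):

```python
import numpy as np

# h_cause[c, e]: residual HSIC when fitting e = f(c) + resid.
# Small entries mean c -> e looks like a valid ANM direction.
h_cause = np.array([
    [0.00, 0.01, 0.02],   # variable 0 explains the others well: root-like
    [0.08, 0.00, 0.03],
    [0.09, 0.07, 0.00],
])
root_score = h_cause.sum(axis=1)      # low total = more root-like
order = list(np.argsort(root_score))  # candidate causal ordering
```

Unlike win-counting, this keeps the magnitude of the independence evidence, so one ambiguous pair cannot flip the ranking.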
[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–149 | total: 149 lines]
1| import numpy as np
2|
3 # ANM-GBR-HSIC v2: Improved causal ordering + permutation-importance parent selection
4 # Novel contributions:
5 # 1. HSIC-sum root scoring for robust causal ordering (not binary tournament)
6 # 2. Pairwise R^2 screening to reduce candidate set before multivariate fitting
7 # 3. Held-out permutation importance for principled parent selection
8 # 4. Dual-criterion edge inclusion: requires both pairwise evidence AND multivariate contribution
9 def run_causal_discovery(X: np.ndarray) -> np.ndarray:
10 """
11 Input: X of shape (n_samples, n_variables)
12 Output: adjacency matrix B of shape (n_variables, n_variables)
13 B[i, j] != 0 means j -> i (follows causal-learn
[Test #2] (1 test remaining; call submit(n=2) to submit this result)
### ER8-MLP (scripts/eval_er8_mlp.sh)
Results (ER8-MLP):
SHD=6 F1=0.7000 Precision=0.5385 Recall=1.0000
### ER12-GP (scripts/eval_er12_gp.sh)
Results (ER12-GP):
SHD=21 F1=0.4898 Precision=0.4000 Recall=0.6316
### SF10-Mixed (scripts/eval_sf10_mixed.sh)
Results (SF10-Mixed):
SHD=18 F1=0.4000 Precision=0.3333 Recall=0.5000
### ER15-Sigmoid (scripts/eval_er15_sigmoid.sh)
Results (ER15-Sigmoid):
SHD=25 F1=0.6216 Precision=0.4792 Recall=0.8846
### ER20-Mixed (scripts/eval_er20_mixed.sh)
Results (ER20-Mixed):
SHD=64 F1=0.3520 Precision=0.3284 Recall=0.3793
### SF20-GP (scripts/eval_sf20_gp.sh)
Results (SF20-GP):
SHD=20 F1=0.6329 Precision=0.5814 Recall=0.6944
### ER12-LowSample (scripts/eval_er12_lowsample.sh)
Results (ER12-LowSample):
SHD=33 F1=0.3509 Precision=0.2632 Recall=0.5263
[Leaderboard] Results saved: {'shd_ER8-MLP': 6, 'f1_ER8-MLP': 0.7, 'precision_ER8-MLP': 0.5385, 'rec
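v2's parent-selection step rests on held-out permutation importance: a candidate parent is kept only if shuffling its column degrades held-out MSE. A minimal numpy-only sketch of the idea follows, using an ordinary-least-squares fit as a stand-in predictor (the submission uses GradientBoostingRegressor):

```python
import numpy as np

def permutation_importance(fit, predict, X, y, seed=0):
    """Relative increase in held-out MSE when each feature column is
    shuffled; large values mean the feature genuinely contributes."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_tr = int(0.75 * len(y))
    tr, va = idx[:n_tr], idx[n_tr:]
    model = fit(X[tr], y[tr])
    base = np.mean((y[va] - predict(model, X[va])) ** 2)
    scores = np.zeros(X.shape[1])
    for k in range(X.shape[1]):
        Xp = X[va].copy()
        rng.shuffle(Xp[:, k])  # break only feature k's link to y
        scores[k] = (np.mean((y[va] - predict(model, Xp)) ** 2) - base) / (base + 1e-10)
    return scores

# Stand-in predictor: ordinary least squares.
fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
predict = lambda w, X: X @ w

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=400)  # only feature 0 matters
imp = permutation_importance(fit, predict, X, y)
```

Because the model is refit on a training split and scored on a held-out split, spurious features that merely overfit the training data receive scores near zero.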
Tool Input
1{
2 "op": "replace",
3 "filename": "causal-learn/bench/custom_algorithm.py",
4 "start_line": 3,
5 "end_line": 149,
6 "content": "# ANM-GBR-HSIC v3: Binary tournament ordering + R^2-screened parent selection\n# Novel contributions:\n# 1. Pairwise ANM tests with GBR + HSIC for flexible nonlinear causal orientation\n# 2. Pairwise R^2 pre-screening to focus multivariate fits on plausible parents\n# 3. Adaptive importance thresholding with screening-aware calibration\n# 4. Dual-use of pairwise fits: both orientation (HSIC) and screening (R^2)\ndef run_causal_discovery(X: np.ndarray) -> np.ndarray:\n \"\"\"\n Input: X of shape (n_samples, n_variables)\n Output: adjacency matrix B of shape (n_variables, n_variables)\n B[i, j] != 0 means j -> i (follows causal-learn convention)\n \"\"\"\n from sklearn.ensemble import GradientBoostingRegressor\n from sklearn.preprocessing import StandardScaler\n import warnings\n warnings.filterwarnings('ignore')\n\n n_samples, n_vars = X.shape\n if n_vars <= 1:\n return np.zeros((n_vars, n_vars))\n\n X = StandardScaler().fit_transform(X)\n\n # Adaptive configuration\n hsic_n = min(n_samples, 500)\n n_trees = min(80, max(20, n_samples // 12))\n min_leaf = max(5, n_samples // 40)\n\n def hsic_score(x, y):\n \"\"\"HSIC with RBF kernel and median bandwidth heuristic.\"\"\"\n rng = np.random.RandomState(42)\n n = len(x)\n if n > hsic_n:\n idx = rng.choice(n, hsic_n, replace=False)\n x, y = x[idx], y[idx]\n n = hsic_n\n x, y = x.ravel(), y.ravel()\n dx2 = (x[:, None] - x[None, :]) ** 2\n dy2 = (y[:, None] - y[None, :]) ** 2\n sx = max(np.median(dx2), 1e-8)\n sy = max(np.median(dy2), 1e-8)\n K = np.exp(-dx2 / (2 * sx))\n L = np.exp(-dy2 / (2 * sy))\n Kc = K - K.mean(0, keepdims=True) - K.mean(1, keepdims=True) + K.mean()\n Lc = L - L.mean(0, keepdims=True) - L.mean(1, keepdims=True) + L.mean()\n return float((Kc * Lc).sum()) / (n * n)\n\n def make_gbr():\n return GradientBoostingRegressor(\n n_estimators=n_trees, max_depth=3, learning_rate=0.1,\n subsample=0.8, min_samples_leaf=min_leaf, random_state=0)\n\n # === Step 1: Pairwise ANM tests — orientation + R^2 
for screening ===\n wins = np.zeros(n_vars)\n r2_pair = np.zeros((n_vars, n_vars)) # r2_pair[c,e] = R^2 when c causes e\n ss = np.array([np.sum((X[:, i] - X[:, i].mean()) ** 2) + 1e-10\n for i in range(n_vars)])\n\n for i in range(n_vars):\n for j in range(i + 1, n_vars):\n # Direction j -> i\n g = make_gbr(); g.fit(X[:, j:j+1], X[:, i])\n pred = g.predict(X[:, j:j+1])\n h_ji = hsic_score(X[:, i] - pred, X[:, j])\n r2_pair[j, i] = max(0, 1 - np.sum((X[:, i] - pred) ** 2) / ss[i])\n\n # Direction i -> j\n g = make_gbr(); g.fit(X[:, i:i+1], X[:, j])\n pred = g.predict(X[:, i:i+1])\n h_ij = hsic_score(X[:, j] - pred, X[:, i])\n r2_pair[i, j] = max(0, 1 - np.sum((X[:, j] - pred) ** 2) / ss[j])\n\n # Lower HSIC = more independent residuals = correct direction\n if h_ji < h_ij:\n wins[j] += 1 # j -> i\n else:\n wins[i] += 1 # i -> j\n\n # Binary tournament ranking (most wins = most root-like = first)\n causal_order = list(np.argsort(-wins))\n\n # === Step 2: Screened parent selection with importance-based pruning ===\n B = np.zeros((n_vars, n_vars))\n order_pos = {v: i for i, v in enumerate(causal_order)}\n\n for idx in range(1, n_vars):\n target = causal_order[idx]\n all_preds = causal_order[:idx]\n\n # Screen: only predecessors with pairwise R^2 evidence\n cands = []\n for p in all_preds:\n if max(r2_pair[p, target], r2_pair[target, p]) > 0.01:\n cands.append(p)\n\n if not cands:\n continue\n\n if len(cands) == 1:\n # Single candidate: require moderate R^2\n if r2_pair[cands[0], target] > 0.04:\n B[target, cands[0]] = 1.0\n continue\n\n # Fit GBR with screened candidates\n Xp = X[:, cands]\n g = make_gbr(); g.fit(Xp, X[:, target])\n yhat = g.predict(Xp)\n r2 = 1 - np.sum((X[:, target] - yhat) ** 2) / ss[target]\n\n if r2 < 0.03:\n continue\n\n # Feature importance with screening-aware adaptive threshold\n imp = g.feature_importances_\n threshold = max(0.07, 1.2 / (len(cands) + 1))\n\n for k, p in enumerate(cands):\n if imp[k] > threshold:\n B[target, p] = 1.0\n\n 
return B\n"
7}
OK: Replaced lines 3..149 with 122 line(s) in causal-learn/bench/custom_algorithm.py. Editable range: 3–124.
[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–124 | total: 124 lines]
1| import numpy as np
2|
3 # ANM-GBR-HSIC v3: Binary tournament ordering + R^2-screened parent selection
4 # Novel contributions:
5 # 1. Pairwise ANM tests with GBR + HSIC for flexible nonlinear causal orientation
6 # 2. Pairwise R^2 pre-screening to focus multivariate fits on plausible parents
7 # 3. Adaptive importance thresholding with screening-aware calibration
8 # 4. Dual-use of pairwise fits: both orientation (HSIC) and screening (R^2)
9 def run_causal_discovery(X: np.ndarray) -> np.ndarray:
10 """
11 Input: X of shape (n_samples, n_variables)
12 Output: adjacency matrix B of shape (n_variables, n_variables)
13 B[i, j] != 0 means j -> i (follows causal-learn convention)
14
[Test #3] (0 tests remaining)
### ER8-MLP (scripts/eval_er8_mlp.sh)
Results (ER8-MLP):
SHD=9 F1=0.3529 Precision=0.3000 Recall=0.4286
### ER12-GP (scripts/eval_er12_gp.sh)
Results (ER12-GP):
SHD=9 F1=0.7500 Precision=0.7143 Recall=0.7895
### SF10-Mixed (scripts/eval_sf10_mixed.sh)
Results (SF10-Mixed):
SHD=16 F1=0.3571 Precision=0.4167 Recall=0.3125
### ER15-Sigmoid (scripts/eval_er15_sigmoid.sh)
Results (ER15-Sigmoid):
SHD=20 F1=0.6441 Precision=0.5758 Recall=0.7308
### ER20-Mixed (scripts/eval_er20_mixed.sh)
Results (ER20-Mixed):
SHD=58 F1=0.2474 Precision=0.3077 Recall=0.2069
### SF20-GP (scripts/eval_sf20_gp.sh)
Results (SF20-GP):
SHD=13 F1=0.7429 Precision=0.7647 Recall=0.7222
### ER12-LowSample (scripts/eval_er12_lowsample.sh)
Results (ER12-LowSample):
SHD=24 F1=0.2162 Precision=0.2222 Recall=0.2105
### ER20-Gauss (scripts/eval_er20_gauss.sh)
Results (ER20-Gauss):
SHD=50 F1=0.3600 Precision=0.4286 Recall=0.3103
[Leaderboard] Result