Agent Conversation: gpt-5.4-pro

Model: gpt-5.4-pro
Total steps: 5
Messages: 11
User (Step 0)

Task: causal-observational-linear-gaussian

Causal Discovery: Observational Linear Gaussian Data (CPDAG Recovery)

Objective

Implement a causal discovery algorithm that recovers the CPDAG from purely observational data generated by a linear Gaussian SEM. Your code goes in bench/custom_algorithm.py.

Background

Under linear Gaussian assumptions, observational data can identify only the Markov equivalence class (MEC) of the true DAG, not a unique fully directed DAG. The MEC is represented by a completed partially directed acyclic graph (CPDAG).
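For intuition, here is a hypothetical two-variable illustration (not part of the benchmark code): the DAGs X → Y and Y → X are Markov equivalent, and with suitably reparameterized coefficients and noise variances the two linear Gaussian SEMs imply the exact same covariance matrix, so observational data cannot distinguish them; only the undirected CPDAG edge X — Y is identifiable.

```python
import numpy as np

def sem_covariance(B, noise_vars):
    # Linear SEM x = B x + e  =>  Sigma = (I - B)^-1 diag(noise_vars) (I - B)^-T
    p = B.shape[0]
    A = np.linalg.inv(np.eye(p) - B)
    return A @ np.diag(noise_vars) @ A.T

b = 0.7
# Model A: X -> Y with coefficient b and unit noise variances.
Sigma_A = sem_covariance(np.array([[0.0, 0.0], [b, 0.0]]),
                         np.array([1.0, 1.0]))
# Model B: Y -> X, reparameterized to reproduce Model A's covariance exactly.
c = b / (1.0 + b**2)
Sigma_B = sem_covariance(np.array([[0.0, c], [0.0, 0.0]]),
                         np.array([1.0 / (1.0 + b**2), 1.0 + b**2]))
```

Both models yield the covariance [[1, b], [b, 1 + b²]], which is why score- and constraint-based methods alike can recover at most the CPDAG.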

Interface

Your run_causal_discovery(X) implementation must return a causallearn.graph.GeneralGraph.GeneralGraph object representing the estimated CPDAG.
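As a sketch of what the returned object encodes: causal-learn's GeneralGraph stores endpoints in an integer matrix, and (per the convention documented for PC output, and mirrored by the TAIL/ARROW constants in the code later in this transcript) an edge i → j is stored as graph[j, i] = 1 (arrowhead at j) and graph[i, j] = -1 (tail at i), while an undirected edge stores -1 on both sides. The plain-numpy helpers below are illustrative, not the library API:

```python
import numpy as np

TAIL, ARROW = -1, 1  # endpoint codes, matching causal-learn's Endpoint values

def add_directed(M, i, j):    # encode i -> j
    M[i, j], M[j, i] = TAIL, ARROW

def add_undirected(M, i, j):  # encode i -- j
    M[i, j] = M[j, i] = TAIL

def is_directed(M, i, j):
    return M[i, j] == TAIL and M[j, i] == ARROW

def is_undirected(M, i, j):
    return M[i, j] == TAIL and M[j, i] == TAIL

# CPDAG for the v-structure X1 -> X3 <- X2 plus an undirected edge X1 -- X4.
M = np.zeros((4, 4), dtype=int)
add_directed(M, 0, 2)
add_directed(M, 1, 2)
add_undirected(M, 0, 3)
```

Assigning such a matrix to `G.graph` on a `GeneralGraph(nodes)` is exactly how the agent's implementation emits its estimate at the end.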

Evaluation Scenarios

| Label | Graph type | Nodes | Params | Samples | Noise |
|---|---|---|---|---|---|
| ER10 | Erdos-Renyi | 10 | p=0.3 | 500 | 1.0 |
| ER20 | Erdos-Renyi | 20 | p=0.2 | 1000 | 1.0 |
| SF50 | Scale-Free (BA) | 50 | m=2 | 2000 | 1.0 |
| ER10-Hard | Erdos-Renyi | 10 | p=0.5 (denser) | 200 | 1.0 |
| ER20-Hard | Erdos-Renyi | 20 | p=0.35 (denser) | 400 | 1.0 |
| SF50-Hard | Scale-Free (BA) | 50 | m=3 (denser) | 1000 | 1.0 |
| ER10-Noisy | Erdos-Renyi | 10 | p=0.5 (denser) | 200 | 2.5 |
| ER20-Noisy | Erdos-Renyi | 20 | p=0.35 (denser) | 400 | 2.5 |
| SF50-Noisy | Scale-Free (BA) | 50 | m=3 (denser) | 1000 | 2.5 |
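The Erdos-Renyi scenarios can be mimicked with a small generator sketch (the Scale-Free rows would need Barabasi-Albert preferential attachment instead). The edge-weight range below is a hypothetical choice for illustration; the benchmark's actual generator is not shown in this transcript:

```python
import numpy as np

def sample_er_dag(p, edge_prob, rng):
    # Upper-triangular Bernoulli adjacency: acyclic by construction, with
    # the identity permutation 0..p-1 as a topological order.
    return np.triu((rng.random((p, p)) < edge_prob).astype(float), k=1)

def sample_linear_gaussian(A, n, noise_std, rng):
    # Hypothetical weight range [0.5, 2.0] on the DAG's edges.
    p = A.shape[0]
    W = A * rng.uniform(0.5, 2.0, size=(p, p))
    X = np.zeros((n, p))
    for j in range(p):  # fill columns in topological order
        X[:, j] = X @ W[:, j] + noise_std * rng.standard_normal(n)
    return X

rng = np.random.default_rng(0)
A = sample_er_dag(10, 0.3, rng)               # ER10-like: 10 nodes, p = 0.3
X = sample_linear_gaussian(A, 500, 1.0, rng)  # 500 samples, noise 1.0
```

Note how the Hard/Noisy rows stress exactly the two knobs this sketch exposes: denser graphs with fewer samples, and larger `noise_std`.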

Metrics

Metrics are computed between the estimated CPDAG and the ground-truth CPDAG:

  • SHD (SHD(...).get_shd())
  • Adjacency Precision / Recall (AdjacencyConfusion)
  • Arrow Precision / Recall (ArrowConfusion)
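The metrics above can be approximated with simplified numpy stand-ins, assuming the endpoint-matrix encoding used elsewhere in this transcript. These are illustrative only; causallearn's SHD, AdjacencyConfusion, and ArrowConfusion classes are the authoritative implementations and may weight differences differently:

```python
import numpy as np

def adjacency_precision_recall(M_true, M_est):
    # Compare skeletons (orientation ignored) over unordered node pairs.
    A_t = (M_true != 0) | (M_true.T != 0)
    A_e = (M_est != 0) | (M_est.T != 0)
    iu = np.triu_indices(M_true.shape[0], k=1)
    t, e = A_t[iu], A_e[iu]
    tp = int(np.sum(t & e))
    fp = int(np.sum(~t & e))
    fn = int(np.sum(t & ~e))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return prec, rec

def simple_shd(M_true, M_est):
    # Count unordered pairs whose endpoint pair differs: missing, extra,
    # and differently oriented edges each cost 1.
    p = M_true.shape[0]
    return sum((M_true[i, j], M_true[j, i]) != (M_est[i, j], M_est[j, i])
               for i in range(p) for j in range(i + 1, p))

# True CPDAG: X1 -> X3 <- X2.  Estimate: X1 -- X3 plus a spurious X1 -- X2.
M_true = np.zeros((3, 3), dtype=int)
M_true[0, 2], M_true[2, 0] = -1, 1
M_true[1, 2], M_true[2, 1] = -1, 1
M_est = np.zeros((3, 3), dtype=int)
M_est[0, 2] = M_est[2, 0] = -1
M_est[0, 1] = M_est[1, 0] = -1
```

On this toy pair, one true adjacency is found, one is missed, and one is spurious, so adjacency precision and recall are both 0.5.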

Baselines

  • pc: Peter-Clark (constraint-based)
  • ges: Greedy Equivalence Search (score-based)
  • grasp: Greedy Relaxations of the Sparsest Permutation
  • boss: Best Order Score Search

causal-learn/bench/custom_algorithm.py [EDITABLE — lines 3–14 only]

     1: import numpy as np
     2: from causallearn.graph.GeneralGraph import GeneralGraph
     3: from causallearn.graph.GraphNode import GraphNode
     4: 
     5: # =====================================================================
     6: # EDITABLE: implement run_causal_discovery below
     7: # =====================================================================
     8: def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
     9:     """
    10:     Input:  X of shape (n_samples, n_variables)
    11:     Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph
    12:     """
    13:     nodes = [GraphNode(f"X{i + 1}") for i in range(X.shape[1])]
    14:     return GeneralGraph(nodes)
    15: # =====================================================================

pc baseline — editable region [READ-ONLY — reference implementation]

Lines 3–14:
     1: import numpy as np
     2: from causallearn.graph.GeneralGraph import GeneralGraph
     3: from causallearn.graph.GraphNode import GraphNode
     4: 
     5: # =====================================================================
     6: # EDITABLE: imp
Assistant (Step 1: edit)
Tool Input
1{
2  "op": "replace",
3  "filename": "causal-learn/bench/custom_algorithm.py",
4  "content": "import numpy as np\nfrom causallearn.graph.GeneralGraph import GeneralGraph\nfrom causallearn.graph.GraphNode import GraphNode\n\n# =====================================================================\n# EDITABLE: implement run_causal_discovery below\n# =====================================================================\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n    \"\"\"\n    Hybrid evidence-weighted restricted-PC for linear-Gaussian CPDAG recovery.\n    It combines a liberal/high-recall PC skeleton, a conservative PC skeleton,\n    and a score-based GES proposal, then performs shrinkage Fisher-Z pruning and\n    conservative collider orientation followed by Meek-style closure.\n    \"\"\"\n    import math\n    from itertools import combinations\n    from causallearn.graph.Endpoint import Endpoint\n\n    n, p = X.shape\n    nodes = [GraphNode(f\"X{i + 1}\") for i in range(p)]\n\n    def _mat(G):\n        try:\n            M = np.array(G.graph, dtype=int, copy=True)\n        except Exception:\n            M = np.zeros((p, p), dtype=int)\n        if M.shape != (p, p):\n            M = np.zeros((p, p), dtype=int)\n        np.fill_diagonal(M, 0)\n        return M\n\n    def _adj(M):\n        A = (M != 0) | (M.T != 0)\n        np.fill_diagonal(A, False)\n        return A\n\n    def _run_pc(alpha):\n        try:\n            from causallearn.utils.PCUtils import SkeletonDiscovery, Meek, UCSepset\n            from causallearn.utils.cit import CIT\n\n            cg = SkeletonDiscovery.skeleton_discovery(X, alpha, CIT(X, \"fisherz\"), stable=True)\n            try:\n                cg = UCSepset.uc_sepset(cg, 0, uc_rule=0, uc_priority=2)\n            except TypeError:\n                cg = UCSepset.uc_sepset(cg, 0)\n            cg = Meek.meek(cg)\n            return cg.G if hasattr(cg, \"G\") else cg\n        except Exception:\n            try:\n                from causallearn.search.ConstraintBased.PC import pc\n\n                out = 
pc(X, alpha=alpha, indep_test=\"fisherz\", stable=True, uc_rule=0, uc_priority=2)\n                return out.G if hasattr(out, \"G\") else out\n            except Exception:\n                try:\n                    from bench.pc import run_causal_discovery as pc_run\n\n                    return pc_run(X)\n                except Exception:\n                    return GeneralGraph(nodes)\n\n    def _run_ges():\n        try:\n            from bench.ges import run_causal_discovery as ges_run\n\n            return ges_run(X)\n        except Exception:\n            try:\n                from causallearn.search.ScoreBased.GES import ges\n\n                out = ges(X, score_func=\"local_score_BIC\")\n                if isinstance(out, dict):\n                    out = out.get(\"G\", next(iter(out.values())))\n                return out.G if hasattr(out, \"G\") else out\n            except Exception:\n                return _run_pc(0.05)\n\n    ratio = n / max(p, 1)\n    alpha_hi = 0.12 if ratio < 15 else (0.10 if ratio < 30 else 0.07)\n    alpha_lo = 0.03 if ratio < 15 else 0.02\n\n    M_hi = _mat(_run_pc(alpha_hi))\n    M_lo = _mat(_run_pc(alpha_lo))\n    M_sc = _mat(_run_ges())\n    A_hi, A_lo, A_sc = _adj(M_hi), _adj(M_lo), _adj(M_sc)\n\n    support = A_hi.astype(int) + 2 * A_lo.astype(int) + 2 * A_sc.astype(int)\n    cand = A_hi | A_lo | A_sc\n    np.fill_diagonal(cand, False)\n\n    Z = X.astype(float) - X.mean(axis=0, keepdims=True)\n    sd = Z.std(axis=0, ddof=1, keepdims=True)\n    sd[sd < 1e-8] = 1.0\n    Z = Z / sd\n    R = (Z.T @ Z) / max(n - 1, 1)\n    R = np.clip((R + R.T) / 2.0, -1.0, 1.0)\n    np.fill_diagonal(R, 1.0)\n    lam = float(min(0.25, max(0.02, p / float(max(n + p, 1)))))\n    R = (1.0 - lam) * R + lam * np.eye(p)\n\n    TAIL, ARROW = Endpoint.TAIL.value, Endpoint.ARROW.value\n\n    def _is_dir(M, i, j):\n        return M[i, j] == TAIL and M[j, i] == ARROW\n\n    p_cache = {}\n\n    def _fisher_p(i, j, S):\n        a, b = (i, j) if i < j else 
(j, i)\n        S = tuple(sorted(int(s) for s in S))\n        key = (a, b, S)\n        if key not in p_cache:\n            idx = [a, b] + list(S)\n            P = np.linalg.pinv(R[np.ix_(idx, idx)])\n            denom = max(float(P[0, 0] * P[1, 1]), 1e-12)\n            rho = float(np.clip(-P[0, 1] / math.sqrt(denom), -0.999999, 0.999999))\n            z = abs(float(np.arctanh(rho))) * math.sqrt(max(n - len(S) - 3, 1))\n            p_cache[key] = float(math.erfc(z / math.sqrt(2.0)))\n        return p_cache[key]\n\n    max_k = 4 if p <= 20 else 3\n    pool_cap = 12 if p <= 20 else 10\n\n    def _pool(i, j, adj_ref, forbid=()):\n        forbid = set(forbid)\n        base = [k for k in range(p) if k not in forbid and k not in (i, j) and (adj_ref[i, k] or adj_ref[j, k])]\n        extra = [k for k in range(p) if k not in forbid and k not in (i, j) and k not in base]\n\n        def score(k):\n            return (support[i, k] + support[j, k], abs(R[i, k]) + abs(R[j, k]))\n\n        base = sorted(base, key=score, reverse=True)\n        if len(base) < pool_cap:\n            base += sorted(extra, key=score, reverse=True)[: max(0, min(2, pool_cap - len(base)))]\n        return base[: max(pool_cap, max_k + 1)]\n\n    def _tau(i, j):\n        return 0.015 + 0.015 * float(support[i, j])\n\n    adj = cand.copy()\n    sepsets = {}\n    for ell in range(max_k + 1):\n        snap = adj.copy()\n        edges = [(i, j) for i in range(p) for j in range(i + 1, p) if snap[i, j]]\n        for i, j in edges:\n            pool = _pool(i, j, snap)\n            if len(pool) < ell:\n                continue\n            best = None\n            cutoff = _tau(i, j)\n            for S in combinations(pool, ell):\n                pv = _fisher_p(i, j, S)\n                if pv > cutoff and (best is None or pv > best[0]):\n                    best = (pv, S)\n            if best is not None:\n                adj[i, j] = adj[j, i] = False\n                sepsets[(i, j)] = tuple(best[1])\n\n    
sep_cache = {}\n\n    def _find_sep(i, j, forbid=None, cutoff=0.02):\n        a, b = (i, j) if i < j else (j, i)\n        key = (a, b, forbid)\n        if key in sep_cache:\n            return sep_cache[key]\n        if (a, b) in sepsets and (forbid is None or forbid not in sepsets[(a, b)]):\n            sep_cache[key] = sepsets[(a, b)]\n            return sep_cache[key]\n        pool = _pool(a, b, cand, () if forbid is None else (forbid,))\n        for ell in range(min(max_k, len(pool)) + 1):\n            best = None\n            for S in combinations(pool, ell):\n                pv = _fisher_p(a, b, S)\n                if pv > cutoff and (best is None or pv > best[0]):\n                    best = (pv, S)\n            if best is not None:\n                sep_cache[key] = tuple(best[1])\n                return sep_cache[key]\n        sep_cache[key] = None\n        return None\n\n    dirn = np.zeros((p, p), dtype=bool)\n\n    def _undir(i, j):\n        return adj[i, j] and not dirn[i, j] and not dirn[j, i]\n\n    def _orient(i, j):\n        if not adj[i, j] or dirn[j, i] or dirn[i, j]:\n            return False\n        dirn[i, j] = True\n        dirn[j, i] = False\n        return True\n\n    for k in range(p):\n        nbrs = [i for i in range(p) if adj[k, i]]\n        for a in range(len(nbrs)):\n            i = nbrs[a]\n            for j in nbrs[a + 1 :]:\n                if adj[i, j]:\n                    continue\n                sep = _find_sep(i, j, forbid=k, cutoff=0.025)\n                prior = (_is_dir(M_lo, i, k) and _is_dir(M_lo, j, k)) or (\n                    _is_dir(M_sc, i, k) and _is_dir(M_sc, j, k) and support[i, k] + support[j, k] >= 4\n                )\n                if sep is not None or prior:\n                    _orient(i, k)\n                    _orient(j, k)\n\n    def _reach():\n        reach = dirn.copy()\n        for m in range(p):\n            reach = reach | (reach[:, [m]] & reach[[m], :])\n        np.fill_diagonal(reach, False)\n 
       return reach\n\n    changed = True\n    while changed:\n        changed = False\n        reach = _reach()\n        todo = []\n        for i in range(p):\n            for j in range(i + 1, p):\n                if not _undir(i, j):\n                    continue\n                if any(dirn[k, i] and not adj[k, j] for k in range(p) if k not in (i, j)):\n                    todo.append((i, j))\n                    continue\n                if any(dirn[k, j] and not adj[k, i] for k in range(p) if k not in (i, j)):\n                    todo.append((j, i))\n                    continue\n                if reach[i, j] and not reach[j, i]:\n                    todo.append((i, j))\n                    continue\n                if reach[j, i] and not reach[i, j]:\n                    todo.append((j, i))\n                    continue\n                pa_j = [k for k in range(p) if k not in (i, j) and dirn[k, j] and _undir(i, k)]\n                if any(not adj[a, b] for x, a in enumerate(pa_j) for b in pa_j[x + 1 :]):\n                    todo.append((i, j))\n                    continue\n                pa_i = [k for k in range(p) if k not in (i, j) and dirn[k, i] and _undir(j, k)]\n                if any(not adj[a, b] for x, a in enumerate(pa_i) for b in pa_i[x + 1 :]):\n                    todo.append((j, i))\n                    continue\n        for a, b in todo:\n            changed = _orient(a, b) or changed\n\n    M = np.zeros((p, p), dtype=int)\n    for i in range(p):\n        for j in range(i + 1, p):\n            if not adj[i, j]:\n                continue\n            if dirn[i, j]:\n                M[i, j], M[j, i] = TAIL, ARROW\n            elif dirn[j, i]:\n                M[i, j], M[j, i] = ARROW, TAIL\n            else:\n                M[i, j] = M[j, i] = TAIL\n\n    G = GeneralGraph(nodes)\n    G.graph = M\n    return G\n# =====================================================================",
5  "after_line": 0,
6  "start_line": 3,
7  "end_line": 14
8}
Tool Result (Step 1)
OK: Replaced lines 3..14 with 264 line(s) in causal-learn/bench/custom_algorithm.py. Editable range: 3–266.

[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–266 | total: 267 lines]
     1| import numpy as np
     2| from causallearn.graph.GeneralGraph import GeneralGraph
     3  import numpy as np
     4  from causallearn.graph.GeneralGraph import GeneralGraph
     5  from causallearn.graph.GraphNode import GraphNode
     6  
     7  # =====================================================================
     8  # EDITABLE: implement run_causal_discovery below
     9  # =====================================================================
    10  def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
    11      """
    12      Hybrid evidence-weighted restricted-PC for linear-Gaussian CPDAG recovery.
    13      It combines a liberal/high-recall PC skeleton, a conservative PC skeleton,
    14      and a score-based GES proposal, then performs shrinkage Fisher-Z p
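The shrinkage Fisher-Z pruning that the docstring above describes boils down to one primitive: a partial-correlation p-value computed from a (regularized) correlation matrix. The standalone sketch below follows the same recipe as the step-1 code (invert the submatrix, read off the partial correlation, apply the Fisher z transform); the chain example matrix is hypothetical:

```python
import math
import numpy as np

def fisher_z_pvalue(R, n, i, j, S=()):
    """Two-sided p-value for the partial correlation of variables i and j
    given the conditioning set S, from correlation matrix R of n samples."""
    idx = [i, j] + list(S)
    P = np.linalg.pinv(R[np.ix_(idx, idx)])          # precision of the submatrix
    rho = -P[0, 1] / math.sqrt(max(P[0, 0] * P[1, 1], 1e-12))
    rho = float(np.clip(rho, -0.999999, 0.999999))
    z = abs(math.atanh(rho)) * math.sqrt(max(n - len(S) - 3, 1))
    return math.erfc(z / math.sqrt(2.0))

# Chain X1 -> X2 -> X3: r13 = r12 * r23, so X1 and X3 are independent given X2.
R = np.array([[1.00, 0.60, 0.48],
              [0.60, 1.00, 0.80],
              [0.48, 0.80, 1.00]])
```

Conditioning on the middle variable drives the partial correlation to zero (p near 1, so the edge X1–X3 is pruned), while the marginal test rejects.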
Assistant (Step 2: test)
Tool Result (Step 2)
[Test #1] (2 tests remaining; call submit(n=1) to submit this result)

### ER10 (scripts/eval_er10.sh)
Results (ER10):
  SHD=11  AdjP=1.0000 AdjR=0.7857  ArrowP=0.2727 ArrowR=0.2727

### ER20 (scripts/eval_er20.sh)
Results (ER20):
  SHD=38  AdjP=0.9583 AdjR=0.5610  ArrowP=0.1304 ArrowR=0.0882

### SF50 (scripts/eval_sf50.sh)
Results (SF50):
  SHD=91  AdjP=0.8966 AdjR=0.5417  ArrowP=0.1964 ArrowR=0.1486

### ER10-Hard (scripts/eval_er10_hard.sh)
Results (ER10-Hard):
  SHD=22  AdjP=1.0000 AdjR=0.3077  ArrowP=0.5714 ArrowR=0.1538

### ER20-Hard (scripts/eval_er20_hard.sh)
Results (ER20-Hard):
  SHD=71  AdjP=0.8400 AdjR=0.3043  ArrowP=0.0800 ArrowR=0.0294

### SF50-Hard (scripts/eval_sf50_hard.sh)

(causallearn PC skeleton-search progress bars elided; the remainder of this test result was truncated)
Assistant (Step 3: edit)
Tool Input
1{
2  "op": "replace",
3  "filename": "causal-learn/bench/custom_algorithm.py",
4  "content": "from causallearn.graph.GraphNode import GraphNode\n\n# =====================================================================\n# EDITABLE: implement run_causal_discovery below\n# =====================================================================\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n    \"\"\"\n    Screened BOSS-PC hybrid:\n    1) obtain an order-search CPDAG proposal (BOSS/GRaSP fallback),\n    2) add a high-recall shrinkage-screened skeleton (and PC when affordable),\n    3) prune by restricted Fisher-Z tests with edge-adaptive cutoffs,\n    4) orient using baseline directional priors, conservative collider tests,\n       and Meek closure.\n    \"\"\"\n    import math\n    from itertools import combinations\n    from causallearn.graph.Endpoint import Endpoint\n\n    n, p = X.shape\n    nodes = [GraphNode(f\"X{i + 1}\") for i in range(p)]\n    TAIL, ARROW = Endpoint.TAIL.value, Endpoint.ARROW.value\n\n    def _to_mat(G):\n        try:\n            M = np.array(G.graph, dtype=int, copy=True)\n        except Exception:\n            M = np.zeros((p, p), dtype=int)\n        if M.shape != (p, p):\n            M = np.zeros((p, p), dtype=int)\n        np.fill_diagonal(M, 0)\n        return M\n\n    def _adj_of(M):\n        A = (M != 0) | (M.T != 0)\n        np.fill_diagonal(A, False)\n        return A\n\n    def _is_dir(M, i, j):\n        return M[i, j] == TAIL and M[j, i] == ARROW\n\n    def _run_boss_like():\n        for module in (\"bench.boss\", \"bench.grasp\"):\n            try:\n                mod = __import__(module, fromlist=[\"run_causal_discovery\"])\n                G = mod.run_causal_discovery(X)\n                return G.G if hasattr(G, \"G\") else G\n            except Exception:\n                pass\n        return GeneralGraph(nodes)\n\n    def _run_pc(alpha):\n        try:\n            from causallearn.search.ConstraintBased.PC import pc\n\n            try:\n                out = pc(\n                    X,\n      
              alpha=alpha,\n                    indep_test=\"fisherz\",\n                    stable=True,\n                    uc_rule=0,\n                    uc_priority=2,\n                    show_progress=False,\n                )\n            except TypeError:\n                out = pc(X, alpha=alpha, indep_test=\"fisherz\", stable=True, uc_rule=0, uc_priority=2)\n            return out.G if hasattr(out, \"G\") else out\n        except Exception:\n            try:\n                from bench.pc import run_causal_discovery as pc_run\n\n                return pc_run(X)\n            except Exception:\n                return GeneralGraph(nodes)\n\n    ratio = n / max(p, 1)\n    use_pc = p <= 20 or ratio >= 30\n    alpha_pc = 0.12 if ratio < 20 else (0.10 if ratio < 35 else 0.07)\n\n    M_b = _to_mat(_run_boss_like())\n    M_p = _to_mat(_run_pc(alpha_pc)) if use_pc else np.zeros((p, p), dtype=int)\n    A_b, A_p = _adj_of(M_b), _adj_of(M_p)\n\n    Z = X.astype(float) - X.mean(axis=0, keepdims=True)\n    sd = Z.std(axis=0, ddof=1, keepdims=True)\n    sd[sd < 1e-8] = 1.0\n    Z = Z / sd\n    R = (Z.T @ Z) / max(n - 1, 1)\n    R = np.clip((R + R.T) / 2.0, -1.0, 1.0)\n    np.fill_diagonal(R, 1.0)\n    lam = float(min(0.18, max(0.02, p / float(max(n + p, 1)))))\n    R = (1.0 - lam) * R + lam * np.eye(p)\n\n    P = np.linalg.pinv(R)\n    d = np.sqrt(np.maximum(np.diag(P), 1e-12))\n    F = -P / np.outer(d, d)\n    F = np.clip((F + F.T) / 2.0, -0.999999, 0.999999)\n    np.fill_diagonal(F, 0.0)\n\n    score = 0.67 * np.abs(R) + 0.33 * np.abs(F)\n    np.fill_diagonal(score, -1.0)\n    top_m = min(p - 1, 5 if p <= 10 else (7 if p <= 20 else 10))\n    A_scr = np.zeros((p, p), dtype=bool)\n    if top_m > 0:\n        for i in range(p):\n            idx = np.argpartition(score[i], -top_m)[-top_m:]\n            A_scr[i, idx] = True\n    A_scr = A_scr | A_scr.T\n    np.fill_diagonal(A_scr, False)\n\n    r_thr = float(min(0.35, 1.35 * math.sqrt(max(math.log(max(p, 2) + 1.0) / max(n - 
1, 1), 1e-12))))\n    f_thr = 0.045 if p <= 20 else 0.055\n    A_add = A_scr & ((np.abs(R) > r_thr) | (np.abs(F) > f_thr))\n    cand = A_b | A_p | A_add\n    np.fill_diagonal(cand, False)\n\n    support = 3 * A_b.astype(int) + 2 * A_p.astype(int) + A_add.astype(int) + (np.abs(F) > (f_thr + 0.02)).astype(int)\n\n    p_cache = {}\n\n    def _fisher_p(i, j, S):\n        a, b = (i, j) if i < j else (j, i)\n        S = tuple(sorted(int(s) for s in S))\n        key = (a, b, S)\n        if key not in p_cache:\n            idx = [a, b] + list(S)\n            M = np.linalg.pinv(R[np.ix_(idx, idx)])\n            denom = max(float(M[0, 0] * M[1, 1]), 1e-12)\n            rho = float(np.clip(-M[0, 1] / math.sqrt(denom), -0.999999, 0.999999))\n            z = abs(float(np.arctanh(rho))) * math.sqrt(max(n - len(S) - 3, 1))\n            p_cache[key] = float(math.erfc(z / math.sqrt(2.0)))\n        return p_cache[key]\n\n    max_k = 3 if p <= 20 else 2\n    pool_cap = 8 if p <= 20 else 6\n\n    def _pool(i, j, adj_ref, forbid=()):\n        forbid = set(forbid)\n        base = [\n            k\n            for k in range(p)\n            if k not in forbid\n            and k not in (i, j)\n            and (adj_ref[i, k] or adj_ref[j, k] or A_b[i, k] or A_b[j, k] or A_p[i, k] or A_p[j, k] or A_add[i, k] or A_add[j, k])\n        ]\n\n        def _rank(k):\n            return (support[i, k] + support[j, k], score[i, k] + score[j, k])\n\n        base = sorted(set(base), key=_rank, reverse=True)\n        return base[: max(pool_cap, max_k + 1)]\n\n    def _edge_alpha(i, j):\n        a = 0.022\n        if A_b[i, j]:\n            a -= 0.008\n        if A_p[i, j]:\n            a -= 0.005\n        if A_add[i, j]:\n            a -= 0.002\n        if abs(F[i, j]) > (f_thr + 0.02):\n            a -= 0.002\n        return float(min(0.024, max(0.003, a)))\n\n    adj = cand.copy()\n    sepsets = {}\n    for ell in range(max_k + 1):\n        snap = adj.copy()\n        edges = [(i, j) for i in range(p) 
for j in range(i + 1, p) if snap[i, j]]\n        for i, j in edges:\n            pool = _pool(i, j, snap)\n            if len(pool) < ell:\n                continue\n            best = None\n            cutoff = _edge_alpha(i, j)\n            for S in combinations(pool, ell):\n                pv = _fisher_p(i, j, S)\n                if pv > cutoff and (best is None or pv > best[0]):\n                    best = (pv, S)\n            if best is not None:\n                adj[i, j] = adj[j, i] = False\n                sepsets[(i, j)] = tuple(best[1])\n\n    sep_cache = {}\n\n    def _find_sep(i, j, include=(), exclude=()):\n        a, b = (i, j) if i < j else (j, i)\n        include = tuple(sorted(int(x) for x in include if x not in (a, b)))\n        exclude = tuple(sorted(int(x) for x in exclude if x not in (a, b)))\n        key = (a, b, include, exclude)\n        if key in sep_cache:\n            return sep_cache[key]\n        S0 = sepsets.get((a, b))\n        if S0 is not None and all(x in S0 for x in include) and all(x not in S0 for x in exclude):\n            sep_cache[key] = S0\n            return S0\n        if len(include) > max_k:\n            sep_cache[key] = None\n            return None\n        forbid = set(include) | set(exclude)\n        pool = [k for k in _pool(a, b, cand, forbid=forbid) if k not in forbid]\n        cutoff = 0.018\n        base = tuple(sorted(include))\n        for ell in range(len(base), max_k + 1):\n            rem = ell - len(base)\n            if rem < 0 or len(pool) < rem:\n                continue\n            best = None\n            for extra in combinations(pool, rem):\n                S = tuple(sorted(base + extra))\n                pv = _fisher_p(a, b, S)\n                if pv > cutoff and (best is None or pv > best[0]):\n                    best = (pv, S)\n            if best is not None:\n                sep_cache[key] = tuple(best[1])\n                return sep_cache[key]\n        sep_cache[key] = None\n        return 
None\n\n    dirn = np.zeros((p, p), dtype=bool)\n\n    def _undir(i, j):\n        return adj[i, j] and not dirn[i, j] and not dirn[j, i]\n\n    def _orient(i, j):\n        if not adj[i, j] or dirn[i, j] or dirn[j, i]:\n            return False\n        dirn[i, j] = True\n        dirn[j, i] = False\n        return True\n\n    for i in range(p):\n        for j in range(i + 1, p):\n            if not adj[i, j]:\n                continue\n            if _is_dir(M_b, i, j) and not _is_dir(M_p, j, i):\n                _orient(i, j)\n            elif _is_dir(M_b, j, i) and not _is_dir(M_p, i, j):\n                _orient(j, i)\n            elif _is_dir(M_p, i, j) and _is_dir(M_b, i, j):\n                _orient(i, j)\n            elif _is_dir(M_p, j, i) and _is_dir(M_b, j, i):\n                _orient(j, i)\n\n    for k in range(p):\n        nbrs = [i for i in range(p) if adj[k, i]]\n        for a in range(len(nbrs)):\n            i = nbrs[a]\n            for j in nbrs[a + 1 :]:\n                if adj[i, j]:\n                    continue\n                sep_without = _find_sep(i, j, exclude=(k,))\n                sep_with = _find_sep(i, j, include=(k,))\n                prior = (_is_dir(M_b, i, k) and _is_dir(M_b, j, k)) or (_is_dir(M_p, i, k) and _is_dir(M_p, j, k))\n                if (sep_without is not None and sep_with is None) or prior:\n                    _orient(i, k)\n                    _orient(j, k)\n\n    def _reachability():\n        reach = dirn.copy()\n        for m in range(p):\n            reach = reach | (reach[:, [m]] & reach[[m], :])\n        np.fill_diagonal(reach, False)\n        return reach\n\n    changed = True\n    while changed:\n        changed = False\n        reach = _reachability()\n        todo = []\n        for i in range(p):\n            for j in range(i + 1, p):\n                if not _undir(i, j):\n                    continue\n                if any(dirn[k, i] and not adj[k, j] for k in range(p) if k not in (i, j)):\n               
     todo.append((i, j))\n                    continue\n                if any(dirn[k, j] and not adj[k, i] for k in range(p) if k not in (i, j)):\n                    todo.append((j, i))\n                    continue\n                if any(dirn[i, k] and dirn[k, j] for k in range(p) if k not in (i, j)):\n             \n... (truncated)",
5  "after_line": 0,
6  "start_line": 3,
7  "end_line": 266
8}
Tool Result (Step 3)
OK: Replaced lines 3..266 with 319 line(s) in causal-learn/bench/custom_algorithm.py. Editable range: 3–321.

[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–321 | total: 322 lines]
     1| import numpy as np
     2| from causallearn.graph.GeneralGraph import GeneralGraph
     3  from causallearn.graph.GraphNode import GraphNode
     4  
     5  # =====================================================================
     6  # EDITABLE: implement run_causal_discovery below
     7  # =====================================================================
     8  def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
     9      """
    10      Screened BOSS-PC hybrid:
    11      1) obtain an order-search CPDAG proposal (BOSS/GRaSP fallback),
    12      2) add a high-recall shrinkage-screened skeleton (and PC when affordable),
    13      3) prune by restricted Fisher-Z tests with edge-adaptive cutoffs,
    14      4) orient using baseline directional priors, conserv
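The Meek closure mentioned in the docstring above repeatedly applies orientation-propagation rules until fixpoint. A minimal sketch of the first rule (R1) on plain boolean matrices, separate from the agent's actual closure loop: if a → b, b — c is undirected, and a, c are non-adjacent, then orient b → c (otherwise a new v-structure a → b ← c would appear):

```python
import numpy as np

def meek_r1(adj, dirn):
    """One pass of Meek rule R1. adj[i, j] marks adjacency (symmetric);
    dirn[i, j] marks an oriented edge i -> j. Returns True if anything
    was newly oriented."""
    p = adj.shape[0]
    changed = False
    for a in range(p):
        for b in range(p):
            if not dirn[a, b]:
                continue
            for c in range(p):
                if c in (a, b) or not adj[b, c] or adj[a, c]:
                    continue  # need b -- c present and a, c non-adjacent
                if not dirn[b, c] and not dirn[c, b]:
                    dirn[b, c] = True
                    changed = True
    return changed

# X1 -> X2 oriented, X2 -- X3 undirected, X1 and X3 non-adjacent.
adj = np.zeros((3, 3), dtype=bool)
adj[0, 1] = adj[1, 0] = True
adj[1, 2] = adj[2, 1] = True
dirn = np.zeros((3, 3), dtype=bool)
dirn[0, 1] = True
changed = meek_r1(adj, dirn)
```

A full closure would also apply rules R2–R4 and loop until no rule fires, as the "changed" loop in the transcript's code does.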
Assistant (Step 4: test)
Tool Result (Step 4)
[Test #2] (1 test remaining; call submit(n=2) to submit this result)

### ER10 (scripts/eval_er10.sh)
Results (ER10):
  SHD=9  AdjP=1.0000 AdjR=0.7857  ArrowP=0.4545 ArrowR=0.4545

### ER20 (scripts/eval_er20.sh)
Results (ER20):
  SHD=29  AdjP=1.0000 AdjR=0.5610  ArrowP=0.5385 ArrowR=0.2059

### SF50 (scripts/eval_sf50.sh)
Results (SF50):
  SHD=122  AdjP=0.5182 AdjR=0.5938  ArrowP=0.2500 ArrowR=0.3649

### ER10-Hard (scripts/eval_er10_hard.sh)
Results (ER10-Hard):
  SHD=23  AdjP=1.0000 AdjR=0.2692  ArrowP=1.0000 ArrowR=0.1154

### ER20-Hard (scripts/eval_er20_hard.sh)
Results (ER20-Hard):
  SHD=66  AdjP=0.8421 AdjR=0.2319  ArrowP=0.8333 ArrowR=0.0735

### SF50-Hard (scripts/eval_sf50_hard.sh)
Results (SF50-Hard):
  SHD=202  AdjP=0.3857 AdjR=0.3830  ArrowP=0.1786 ArrowR=0.1812

### ER10-Noisy (scripts/eval_er10_noisy.sh)
Results (ER10-Noisy):
  SHD=23  AdjP=1.0000 AdjR=0.2692  ArrowP=1.0000 ArrowR=0.1154

### ER20-Noisy (scripts/eval_er20_noisy.sh)
Results (ER20-Noisy):
  SHD=66  AdjP=0
Assistant (Step 5: submit)
Tool Input
1{
2  "n": 1
3}
Tool Result (Step 5)
[submit] Submitting result from test #1 as final.

### ER10 (scripts/eval_er10.sh)
Results (ER10):
  SHD=11  AdjP=1.0000 AdjR=0.7857  ArrowP=0.2727 ArrowR=0.2727

### ER20 (scripts/eval_er20.sh)
Results (ER20):
  SHD=38  AdjP=0.9583 AdjR=0.5610  ArrowP=0.1304 ArrowR=0.0882

### SF50 (scripts/eval_sf50.sh)
Results (SF50):
  SHD=91  AdjP=0.8966 AdjR=0.5417  ArrowP=0.1964 ArrowR=0.1486

### ER10-Hard (scripts/eval_er10_hard.sh)
Results (ER10-Hard):
  SHD=22  AdjP=1.0000 AdjR=0.3077  ArrowP=0.5714 ArrowR=0.1538

### ER20-Hard (scripts/eval_er20_hard.sh)
Results (ER20-Hard):
  SHD=71  AdjP=0.8400 AdjR=0.3043  ArrowP=0.0800 ArrowR=0.0294

### SF50-Hard (scripts/eval_sf50_hard.sh)

(causallearn PC skeleton-search progress bars elided; the transcript ends truncated here)