Agent Conversation: google/gemini-3.1-pro-preview
Model: google/gemini-3.1-pro-preview
Total steps: 9
Messages: 24
User — Step 0
Task: causal-observational-linear-gaussian
Causal Discovery: Observational Linear Gaussian Data (CPDAG Recovery)
Objective
Implement a causal discovery algorithm that recovers the CPDAG from purely observational data generated by a linear Gaussian SEM. Your code goes in bench/custom_algorithm.py.
Background
Under linear Gaussian assumptions, observational data can only identify the Markov Equivalence Class (MEC), not a unique fully directed DAG. The MEC is represented by a CPDAG.
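Two DAGs belong to the same MEC exactly when they share the same skeleton and the same v-structures (colliders a → c ← b with a and b non-adjacent). A minimal pure-Python sketch of that equivalence check (helper names are illustrative, not part of the benchmark interface):

```python
# Markov-equivalence check: same skeleton + same v-structures.
# DAGs are given as dicts mapping each node to the set of its parents.

def skeleton(dag):
    """Undirected adjacencies as a set of frozensets."""
    return {frozenset((p, c)) for c, ps in dag.items() for p in ps}

def v_structures(dag):
    """Colliders a -> c <- b where a and b are not adjacent."""
    skel = skeleton(dag)
    out = set()
    for c, ps in dag.items():
        ps = sorted(ps)
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                a, b = ps[i], ps[j]
                if frozenset((a, b)) not in skel:
                    out.add((a, c, b))
    return out

def markov_equivalent(d1, d2):
    return skeleton(d1) == skeleton(d2) and v_structures(d1) == v_structures(d2)

# X1 -> X2 -> X3 and X1 <- X2 <- X3 are equivalent (no collider),
# while X1 -> X2 <- X3 is in a different class (collider at X2).
chain = {"X1": set(), "X2": {"X1"}, "X3": {"X2"}}
rev   = {"X1": {"X2"}, "X2": {"X3"}, "X3": set()}
coll  = {"X1": set(), "X2": {"X1", "X3"}, "X3": set()}
```

This is why the chain and its reversal map to the same CPDAG (an undirected X1 — X2 — X3), while the collider stays fully directed.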
Interface
Your run_causal_discovery(X) implementation must return a causallearn.graph.GeneralGraph.GeneralGraph object representing the estimated CPDAG.
Evaluation Scenarios
| Label | Graph type | Nodes | Params | Samples | Noise |
|---|---|---|---|---|---|
| ER10 | Erdos-Renyi | 10 | p=0.3 | 500 | 1.0 |
| ER20 | Erdos-Renyi | 20 | p=0.2 | 1000 | 1.0 |
| SF50 | Scale-Free (BA) | 50 | m=2 | 2000 | 1.0 |
| ER10-Hard | Erdos-Renyi | 10 | p=0.5 (denser) | 200 | 1.0 |
| ER20-Hard | Erdos-Renyi | 20 | p=0.35 (denser) | 400 | 1.0 |
| SF50-Hard | Scale-Free (BA) | 50 | m=3 (denser) | 1000 | 1.0 |
| ER10-Noisy | Erdos-Renyi | 10 | p=0.5 (denser) | 200 | 2.5 |
| ER20-Noisy | Erdos-Renyi | 20 | p=0.35 (denser) | 400 | 2.5 |
| SF50-Noisy | Scale-Free (BA) | 50 | m=3 (denser) | 1000 | 2.5 |
Metrics
Metrics are computed between estimated CPDAG and ground-truth CPDAG:
- SHD (SHD(...).get_shd())
- Adjacency Precision / Recall (AdjacencyConfusion)
- Arrow Precision / Recall (ArrowConfusion)
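For intuition, SHD counts node pairs whose edge status differs between the two graphs: missing, extra, or differently oriented edges. A simplified numpy sketch over adjacency matrices (A[i, j] = 1 encodes i → j; an undirected edge sets both entries); the benchmark itself uses causallearn's SHD class:

```python
import numpy as np

def shd(A_est, A_true):
    """Simplified Structural Hamming Distance between two (C)PDAG
    adjacency matrices. Each node pair whose edge status differs
    (absent / present / orientation) counts once."""
    d = A_true.shape[0]
    dist = 0
    for i in range(d):
        for j in range(i + 1, d):
            if (A_est[i, j], A_est[j, i]) != (A_true[i, j], A_true[j, i]):
                dist += 1
    return dist

# Truth: X0 -> X1 plus an undirected X1 - X2. The estimate reverses
# the first edge and drops the second, so SHD = 2.
A_true = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])
A_est  = np.array([[0, 0, 0],
                   [1, 0, 0],
                   [0, 0, 0]])
print(shd(A_est, A_true))  # 2
```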
Baselines
- pc: Peter-Clark (constraint-based)
- ges: Greedy Equivalence Search (score-based)
- grasp: Greedy Relaxations of the Sparsest Permutation
- boss: Best Order Score Search
causal-learn/bench/custom_algorithm.py [EDITABLE — lines 3–14 only]
1: import numpy as np
2: from causallearn.graph.GeneralGraph import GeneralGraph
3: from causallearn.graph.GraphNode import GraphNode
4:
5: # =====================================================================
6: # EDITABLE: implement run_causal_discovery below
7: # =====================================================================
8: def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
9: """
10: Input: X of shape (n_samples, n_variables)
11: Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph
12: """
13: nodes = [GraphNode(f"X{i + 1}") for i in range(X.shape[1])]
14: return GeneralGraph(nodes)
15: # =====================================================================
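The score-based attempts later in this transcript all build on a Gaussian local BIC: for node i with parent set P, the fit term is n · log of the residual variance of regressing X_i on X_P, plus a log(n) penalty per parent. A minimal numpy sketch of that local score computed from the empirical covariance (illustrative, not the benchmark's implementation):

```python
import numpy as np

def local_bic(cov, n, i, parents):
    """Gaussian local BIC: n * log(residual variance) + |P| * log(n).

    cov is the empirical covariance matrix, n the sample size,
    i the target column index, parents a list of parent column indices.
    Lower is better.
    """
    if not parents:
        resid_var = cov[i, i]
    else:
        P = list(parents)
        # Regression coefficients of X_i on X_P from covariance blocks.
        beta = np.linalg.solve(cov[np.ix_(P, P)], cov[P, i])
        resid_var = cov[i, i] - cov[i, P] @ beta
    resid_var = max(resid_var, 1e-12)  # guard against numerical underflow
    return n * np.log(resid_var) + len(parents) * np.log(n)

# Toy check: X1 = X0 + noise, so conditioning on X0 shrinks X1's
# residual variance and (at this sample size) lowers the score.
rng = np.random.default_rng(0)
n = 1000
x0 = rng.normal(size=n)
x1 = x0 + 0.5 * rng.normal(size=n)
cov = np.cov(np.column_stack([x0, x1]), rowvar=False)
print(local_bic(cov, n, 1, [0]) < local_bic(cov, n, 1, []))  # True
```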
pc baseline — editable region [READ-ONLY — reference implementation]
Lines 3–14:
1: import numpy as np
2: from causallearn.graph.GeneralGraph import GeneralGraph
3: from causallearn.graph.GraphNode import GraphNode
4:
5: # =====================================================================
6: # EDITABLE: imp
Assistant — Step 1 (edit)
Tool Input
1{
2 "filename": "bench/custom_algorithm.py",
3 "start_line": 5,
4 "content": "# =====================================================================\n# EDITABLE: implement run_causal_discovery below\n# =====================================================================\nimport itertools\nimport math\nfrom causallearn.utils.DAG2CPDAG import dag2cpdag\nfrom causallearn.graph.Edge import Edge\nfrom causallearn.graph.Endpoint import Endpoint\nfrom causallearn.utils.PCUtils import SkeletonDiscovery\nfrom causallearn.utils.cit import CIT\n\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n n, d = X.shape\n nodes = [GraphNode(f\"X{i + 1}\") for i in range(d)]\n C = np.cov(X.T, bias=True)\n \n indep_test = CIT(X, \"fisherz\")\n try:\n cg = SkeletonDiscovery.skeleton_discovery(X, 0.1, indep_test)\n skel = cg.G.graph\n except:\n skel = np.ones((d, d))\n \n C_corr = np.corrcoef(X.T)\n cand = []\n for i in range(d):\n c_i = set()\n for j in range(d):\n if skel[i, j] != 0: c_i.add(j)\n top_j = np.argsort(-np.abs(C_corr[i]))\n added = 0\n for j in top_j:\n if j == i: continue\n if added >= 5: break\n c_i.add(j)\n added += 1\n cand.append(tuple(sorted(list(c_i))))\n \n parent_cache = {}\n \n def calc_ebic(target, S):\n sz = len(S)\n if sz == 0:\n var_res = C[target, target]\n if var_res <= 0: var_res = 1e-8\n return n * np.log(var_res)\n \n Css = C[np.ix_(S, S)] + np.eye(sz) * 1e-6\n Csy = C[S, target]\n try:\n beta = np.linalg.solve(Css, Csy)\n except:\n beta = np.linalg.lstsq(Css, Csy, rcond=None)[0]\n var_res = C[target, target] - np.dot(C[target, S], beta)\n if var_res <= 0: var_res = 1e-8\n \n gamma = 0.5\n bic_pen = sz * np.log(n)\n if gamma > 0:\n bic_pen += 2 * gamma * np.log(math.comb(d, sz))\n return n * np.log(var_res) + bic_pen\n\n def get_best_parents(target, avail_candidates):\n if len(avail_candidates) == 0:\n sc = calc_ebic(target, [])\n parent_cache[(target, avail_candidates)] = ((), sc)\n return (), sc\n if (target, avail_candidates) in parent_cache:\n return parent_cache[(target, avail_candidates)]\n \n if 
len(avail_candidates) <= 8:\n best_S, best_score = (), float('inf')\n for r in range(len(avail_candidates) + 1):\n for combo in itertools.combinations(avail_candidates, r):\n sc = calc_ebic(target, list(combo))\n if sc < best_score:\n best_score, best_S = sc, combo\n res = (tuple(sorted(best_S)), best_score)\n parent_cache[(target, avail_candidates)] = res\n return res\n else:\n current_S = []\n current_score = calc_ebic(target, current_S)\n changed = True\n while changed:\n changed = False\n best_fwd_item, best_fwd_score = None, current_score\n for c in avail_candidates:\n if c not in current_S:\n sc = calc_ebic(target, current_S + [c])\n if sc < best_fwd_score:\n best_fwd_score, best_fwd_item = sc, c\n if best_fwd_item is not None:\n current_S.append(best_fwd_item)\n current_score = best_fwd_score\n changed = True\n if len(current_S) > 1:\n best_bwd_item, best_bwd_score = None, current_score\n for c in current_S:\n test_S = [x for x in current_S if x != c]\n sc = calc_ebic(target, test_S)\n if sc < best_bwd_score:\n best_bwd_score, best_bwd_item = sc, c\n if best_bwd_item is not None:\n current_S.remove(best_bwd_item)\n current_score = best_bwd_score\n changed = True\n res = (tuple(sorted(current_S)), current_score)\n parent_cache[(target, avail_candidates)] = res\n return res\n\n def score_order(order):\n total_score = 0\n dag = {}\n for pos, i in enumerate(order):\n pred = order[:pos]\n avail = tuple(sorted([x for x in pred if x in cand[i]]))\n best_S, sc = get_best_parents(i, avail)\n total_score += sc\n dag[i] = best_S\n return total_score, dag\n\n def hc(order):\n best_order = list(order)\n best_score, best_dag = score_order(best_order)\n while True:\n improved = False\n for i in range(d):\n node = best_order[i]\n for j in range(d):\n if i == j: continue\n new_order = best_order[:i] + best_order[i+1:]\n new_order.insert(j, node)\n sc, dag = score_order(new_order)\n if sc < best_score:\n best_score, best_order, best_dag = sc, new_order, dag\n improved = 
True\n if not improved:\n break\n return best_order, best_score, best_dag\n\n variances = np.diag(C)\n restarts = [list(np.argsort(variances)), list(np.argsort(-variances))]\n for _ in range(8):\n rand_o = list(range(d))\n np.random.shuffle(rand_o)\n restarts.append(rand_o)\n \n global_best_score = float('inf')\n global_best_dag = None\n for init_order in restarts:\n _, sc, dag = hc(init_order)\n if sc < global_best_score:\n global_best_score, global_best_dag = sc, dag\n\n dag_graph = GeneralGraph(nodes)\n for i in range(d):\n for p in global_best_dag[i]:\n edge = Edge(nodes[p], nodes[i], Endpoint.TAIL, Endpoint.ARROW)\n dag_graph.add_edge(edge)\n\n try:\n cpdag = dag2cpdag(dag_graph)\n except:\n cpdag = dag_graph\n return cpdag",
5 "end_line": 14,
6 "op": "replace"
7}
Tool Result — Step 1
ERROR: Package 'bench' is not in allowed packages
Assistant — Step 2 (edit)
Tool Input
1{
2 "op": "replace",
3 "content": "# =====================================================================\n# EDITABLE: implement run_causal_discovery below\n# =====================================================================\nimport itertools\nimport math\nfrom causallearn.utils.DAG2CPDAG import dag2cpdag\nfrom causallearn.graph.Edge import Edge\nfrom causallearn.graph.Endpoint import Endpoint\nfrom causallearn.utils.PCUtils import SkeletonDiscovery\nfrom causallearn.utils.cit import CIT\n\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n n, d = X.shape\n nodes = [GraphNode(f\"X{i + 1}\") for i in range(d)]\n C = np.cov(X.T, bias=True)\n \n indep_test = CIT(X, \"fisherz\")\n try:\n cg = SkeletonDiscovery.skeleton_discovery(X, 0.1, indep_test)\n skel = cg.G.graph\n except:\n skel = np.ones((d, d))\n \n C_corr = np.corrcoef(X.T)\n cand = []\n for i in range(d):\n c_i = set()\n for j in range(d):\n if skel[i, j] != 0: c_i.add(j)\n top_j = np.argsort(-np.abs(C_corr[i]))\n added = 0\n for j in top_j:\n if j == i: continue\n if added >= 5: break\n c_i.add(j)\n added += 1\n cand.append(tuple(sorted(list(c_i))))\n \n parent_cache = {}\n \n def calc_ebic(target, S):\n sz = len(S)\n if sz == 0:\n var_res = C[target, target]\n if var_res <= 0: var_res = 1e-8\n return n * np.log(var_res)\n \n Css = C[np.ix_(S, S)] + np.eye(sz) * 1e-6\n Csy = C[S, target]\n try:\n beta = np.linalg.solve(Css, Csy)\n except:\n beta = np.linalg.lstsq(Css, Csy, rcond=None)[0]\n var_res = C[target, target] - np.dot(C[target, S], beta)\n if var_res <= 0: var_res = 1e-8\n \n gamma = 0.5\n bic_pen = sz * np.log(n)\n if gamma > 0:\n bic_pen += 2 * gamma * np.log(math.comb(d, sz))\n return n * np.log(var_res) + bic_pen\n\n def get_best_parents(target, avail_candidates):\n if len(avail_candidates) == 0:\n sc = calc_ebic(target, [])\n parent_cache[(target, avail_candidates)] = ((), sc)\n return (), sc\n if (target, avail_candidates) in parent_cache:\n return parent_cache[(target, avail_candidates)]\n \n if 
len(avail_candidates) <= 8:\n best_S, best_score = (), float('inf')\n for r in range(len(avail_candidates) + 1):\n for combo in itertools.combinations(avail_candidates, r):\n sc = calc_ebic(target, list(combo))\n if sc < best_score:\n best_score, best_S = sc, combo\n res = (tuple(sorted(best_S)), best_score)\n parent_cache[(target, avail_candidates)] = res\n return res\n else:\n current_S = []\n current_score = calc_ebic(target, current_S)\n changed = True\n while changed:\n changed = False\n best_fwd_item, best_fwd_score = None, current_score\n for c in avail_candidates:\n if c not in current_S:\n sc = calc_ebic(target, current_S + [c])\n if sc < best_fwd_score:\n best_fwd_score, best_fwd_item = sc, c\n if best_fwd_item is not None:\n current_S.append(best_fwd_item)\n current_score = best_fwd_score\n changed = True\n if len(current_S) > 1:\n best_bwd_item, best_bwd_score = None, current_score\n for c in current_S:\n test_S = [x for x in current_S if x != c]\n sc = calc_ebic(target, test_S)\n if sc < best_bwd_score:\n best_bwd_score, best_bwd_item = sc, c\n if best_bwd_item is not None:\n current_S.remove(best_bwd_item)\n current_score = best_bwd_score\n changed = True\n res = (tuple(sorted(current_S)), current_score)\n parent_cache[(target, avail_candidates)] = res\n return res\n\n def score_order(order):\n total_score = 0\n dag = {}\n for pos, i in enumerate(order):\n pred = order[:pos]\n avail = tuple(sorted([x for x in pred if x in cand[i]]))\n best_S, sc = get_best_parents(i, avail)\n total_score += sc\n dag[i] = best_S\n return total_score, dag\n\n def hc(order):\n best_order = list(order)\n best_score, best_dag = score_order(best_order)\n while True:\n improved = False\n for i in range(d):\n node = best_order[i]\n for j in range(d):\n if i == j: continue\n new_order = best_order[:i] + best_order[i+1:]\n new_order.insert(j, node)\n sc, dag = score_order(new_order)\n if sc < best_score:\n best_score, best_order, best_dag = sc, new_order, dag\n improved = 
True\n if not improved:\n break\n return best_order, best_score, best_dag\n\n variances = np.diag(C)\n restarts = [list(np.argsort(variances)), list(np.argsort(-variances))]\n for _ in range(8):\n rand_o = list(range(d))\n np.random.shuffle(rand_o)\n restarts.append(rand_o)\n \n global_best_score = float('inf')\n global_best_dag = None\n for init_order in restarts:\n _, sc, dag = hc(init_order)\n if sc < global_best_score:\n global_best_score, global_best_dag = sc, dag\n\n dag_graph = GeneralGraph(nodes)\n for i in range(d):\n for p in global_best_dag[i]:\n edge = Edge(nodes[p], nodes[i], Endpoint.TAIL, Endpoint.ARROW)\n dag_graph.add_edge(edge)\n\n try:\n cpdag = dag2cpdag(dag_graph)\n except:\n cpdag = dag_graph\n return cpdag",
4 "filename": "causal-learn/bench/custom_algorithm.py",
5 "end_line": 14,
6 "start_line": 5
7}
Tool Result — Step 2
OK: Replaced lines 5..14 with 166 line(s) in causal-learn/bench/custom_algorithm.py. Editable range: 3–170.
[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–170 | total: 171 lines]
1| import numpy as np
2| from causallearn.graph.GeneralGraph import GeneralGraph
3 from causallearn.graph.GraphNode import GraphNode
4
5 # =====================================================================
6 # EDITABLE: implement run_causal_discovery below
7 # =====================================================================
8 import itertools
9 import math
10 from causallearn.utils.DAG2CPDAG import dag2cpdag
11 from causallearn.graph.Edge import Edge
12 from causallearn.graph.Endpoint import Endpoint
13 from causallearn.utils.PCUtils import SkeletonDiscovery
14 from causallearn.utils.cit import CIT
15
16 def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
17 n, d = X.shape
18
Assistant — Step 3 (test)
Assistant — Step 1 (edit)
Tool Input
1{
2 "end_line": 15,
3 "content": "import numpy as np\nimport random\nfrom causallearn.graph.GeneralGraph import GeneralGraph\nfrom causallearn.graph.GraphNode import GraphNode\nfrom causallearn.utils.DAG2CPDAG import dag2cpdag\n\n# =====================================================================\n# EDITABLE: implement run_causal_discovery below\n# =====================================================================\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n \"\"\"\n Input: X of shape (n_samples, n_variables)\n Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph\n \"\"\"\n n, d = X.shape\n nodes = [GraphNode(f\"X{i + 1}\") for i in range(d)]\n \n # 1. Compute empirical covariance and precision-based penalty matrix\n X_centered = X - np.mean(X, axis=0)\n cov = X_centered.T @ X_centered / n\n inv_cov = np.linalg.inv(cov + np.eye(d) * 1e-4) # slight ridge\n \n R = np.zeros((d, d))\n for i in range(d):\n for j in range(d):\n R[i, j] = np.abs(inv_cov[i, j]) / max(np.sqrt(inv_cov[i, i] * inv_cov[j, j]), 1e-10)\n \n # Soft constraint penalty matrix\n # Relaxes BIC for strong partial correlations, penalizes hard for weak ones\n penalty_mat = np.log(n) * 2.0 * np.exp(-10.0 * R)\n \n # 2. Local score function\n def calc_score(i, P):\n if len(P) == 0:\n v = cov[i, i]\n if v < 1e-10: v = 1e-10\n return n * np.log(v)\n \n cov_P = cov[np.ix_(P, P)]\n cov_iP = cov[i, P]\n inv_cov_P = np.linalg.pinv(cov_P)\n v = cov[i, i] - cov_iP @ inv_cov_P @ cov_iP\n if v < 1e-10: v = 1e-10\n \n # Modified BIC: 0.5 * log(n) base penalty + precision penalty\n base_pen = len(P) * np.log(n) * 0.5\n prec_pen = np.sum(penalty_mat[i, P])\n \n return n * np.log(v) + base_pen + prec_pen\n \n # 3. 
Cache and Greedy Parent Search\n cache = {}\n def get_best_parents(i, predecessors):\n fs = frozenset(predecessors)\n if (i, fs) in cache:\n return cache[(i, fs)]\n \n current_P = []\n best_score = calc_score(i, current_P)\n \n while len(current_P) < min(15, len(predecessors)):\n best_next_p = None\n best_next_score = best_score\n \n for p in predecessors:\n if p not in current_P:\n cand_P = current_P + [p]\n score = calc_score(i, cand_P)\n if score < best_next_score - 1e-5:\n best_next_score = score\n best_next_p = p\n \n if best_next_p is not None:\n current_P.append(best_next_p)\n best_score = best_next_score\n else:\n break\n \n # Backward elimination\n while len(current_P) > 0:\n best_rem_p = None\n best_rem_score = best_score\n for p in current_P:\n cand_P = [x for x in current_P if x != p]\n score = calc_score(i, cand_P)\n if score < best_rem_score - 1e-5:\n best_rem_score = score\n best_rem_p = p\n if best_rem_p is not None:\n current_P.remove(best_rem_p)\n best_score = best_rem_score\n else:\n break\n \n res = (current_P, best_score)\n cache[(i, fs)] = res\n return res\n\n def score_order(order):\n total_score = 0\n dag_parents = []\n for idx, i in enumerate(order):\n predecessors = order[:idx]\n P, p_score = get_best_parents(i, predecessors)\n total_score += p_score\n dag_parents.append(P)\n return total_score, dag_parents\n\n best_global_score = float('inf')\n best_global_order = None\n best_global_parents = None\n\n num_restarts = 4\n num_steps = 400\n \n for restart in range(num_restarts):\n if restart == 0:\n current_order = np.argsort(np.diag(cov)).tolist()\n else:\n current_order = list(range(d))\n random.shuffle(current_order)\n \n current_score, current_parents = score_order(current_order)\n \n for step in range(num_steps):\n new_order = current_order.copy()\n if random.random() < 0.5:\n # swap\n idx1, idx2 = random.sample(range(d), 2)\n new_order[idx1], new_order[idx2] = new_order[idx2], new_order[idx1]\n else:\n # shift\n idx1, idx2 = 
random.sample(range(d), 2)\n val = new_order.pop(idx1)\n new_order.insert(idx2, val)\n \n new_score, new_parents = score_order(new_order)\n \n # Simulated Annealing acceptance\n temp = 10.0 * (1.0 - step / num_steps)\n if temp < 1e-3: temp = 1e-3\n \n diff = current_score - new_score\n if diff > 0 or random.random() < np.exp(diff / temp):\n current_score = new_score\n current_order = new_order\n current_parents = new_parents\n \n if current_score < best_global_score:\n best_global_score = current_score\n best_global_order = current_order\n best_global_parents = current_parents\n\n # 5. Build Final DAG\n graph = GeneralGraph(nodes)\n final_adjacency = np.zeros((d, d))\n for idx, i in enumerate(best_global_order):\n for p in best_global_parents[idx]:\n final_adjacency[p, i] = 1\n \n for i in range(d):\n for j in range(d):\n if final_adjacency[i, j] == 1:\n graph.add_directed_edge(nodes[i], nodes[j])\n \n # 6. Convert to CPDAG\n cpdag = dag2cpdag(graph)\n return cpdag\n# =====================================================================",
4 "filename": "bench/custom_algorithm.py",
5 "op": "replace",
6 "start_line": 1
7}
Tool Result — Step 1
ERROR: Package 'bench' is not in allowed packages
Assistant — Step 2 (edit)
Tool Input
1{
2 "end_line": 15,
3 "start_line": 1,
4 "op": "replace",
5 "content": "import numpy as np\nimport random\nfrom causallearn.graph.GeneralGraph import GeneralGraph\nfrom causallearn.graph.GraphNode import GraphNode\nfrom causallearn.utils.DAG2CPDAG import dag2cpdag\n\n# =====================================================================\n# EDITABLE: implement run_causal_discovery below\n# =====================================================================\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n \"\"\"\n Input: X of shape (n_samples, n_variables)\n Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph\n \"\"\"\n n, d = X.shape\n nodes = [GraphNode(f\"X{i + 1}\") for i in range(d)]\n \n # 1. Compute empirical covariance and precision-based penalty matrix\n X_centered = X - np.mean(X, axis=0)\n cov = X_centered.T @ X_centered / n\n inv_cov = np.linalg.inv(cov + np.eye(d) * 1e-4) # slight ridge\n \n R = np.zeros((d, d))\n for i in range(d):\n for j in range(d):\n R[i, j] = np.abs(inv_cov[i, j]) / max(np.sqrt(inv_cov[i, i] * inv_cov[j, j]), 1e-10)\n \n # Soft constraint penalty matrix\n # Relaxes BIC for strong partial correlations, penalizes hard for weak ones\n penalty_mat = np.log(n) * 2.0 * np.exp(-10.0 * R)\n \n # 2. Local score function\n def calc_score(i, P):\n if len(P) == 0:\n v = cov[i, i]\n if v < 1e-10: v = 1e-10\n return n * np.log(v)\n \n cov_P = cov[np.ix_(P, P)]\n cov_iP = cov[i, P]\n inv_cov_P = np.linalg.pinv(cov_P)\n v = cov[i, i] - cov_iP @ inv_cov_P @ cov_iP\n if v < 1e-10: v = 1e-10\n \n # Modified BIC: 0.5 * log(n) base penalty + precision penalty\n base_pen = len(P) * np.log(n) * 0.5\n prec_pen = np.sum(penalty_mat[i, P])\n \n return n * np.log(v) + base_pen + prec_pen\n \n # 3. 
Cache and Greedy Parent Search\n cache = {}\n def get_best_parents(i, predecessors):\n fs = frozenset(predecessors)\n if (i, fs) in cache:\n return cache[(i, fs)]\n \n current_P = []\n best_score = calc_score(i, current_P)\n \n while len(current_P) < min(15, len(predecessors)):\n best_next_p = None\n best_next_score = best_score\n \n for p in predecessors:\n if p not in current_P:\n cand_P = current_P + [p]\n score = calc_score(i, cand_P)\n if score < best_next_score - 1e-5:\n best_next_score = score\n best_next_p = p\n \n if best_next_p is not None:\n current_P.append(best_next_p)\n best_score = best_next_score\n else:\n break\n \n # Backward elimination\n while len(current_P) > 0:\n best_rem_p = None\n best_rem_score = best_score\n for p in current_P:\n cand_P = [x for x in current_P if x != p]\n score = calc_score(i, cand_P)\n if score < best_rem_score - 1e-5:\n best_rem_score = score\n best_rem_p = p\n if best_rem_p is not None:\n current_P.remove(best_rem_p)\n best_score = best_rem_score\n else:\n break\n \n res = (current_P, best_score)\n cache[(i, fs)] = res\n return res\n\n def score_order(order):\n total_score = 0\n dag_parents = []\n for idx, i in enumerate(order):\n predecessors = order[:idx]\n P, p_score = get_best_parents(i, predecessors)\n total_score += p_score\n dag_parents.append(P)\n return total_score, dag_parents\n\n best_global_score = float('inf')\n best_global_order = None\n best_global_parents = None\n\n num_restarts = 4\n num_steps = 400\n \n for restart in range(num_restarts):\n if restart == 0:\n current_order = np.argsort(np.diag(cov)).tolist()\n else:\n current_order = list(range(d))\n random.shuffle(current_order)\n \n current_score, current_parents = score_order(current_order)\n \n for step in range(num_steps):\n new_order = current_order.copy()\n if random.random() < 0.5:\n # swap\n idx1, idx2 = random.sample(range(d), 2)\n new_order[idx1], new_order[idx2] = new_order[idx2], new_order[idx1]\n else:\n # shift\n idx1, idx2 = 
random.sample(range(d), 2)\n val = new_order.pop(idx1)\n new_order.insert(idx2, val)\n \n new_score, new_parents = score_order(new_order)\n \n # Simulated Annealing acceptance\n temp = 10.0 * (1.0 - step / num_steps)\n if temp < 1e-3: temp = 1e-3\n \n diff = current_score - new_score\n if diff > 0 or random.random() < np.exp(diff / temp):\n current_score = new_score\n current_order = new_order\n current_parents = new_parents\n \n if current_score < best_global_score:\n best_global_score = current_score\n best_global_order = current_order\n best_global_parents = current_parents\n\n # 5. Build Final DAG\n graph = GeneralGraph(nodes)\n final_adjacency = np.zeros((d, d))\n for idx, i in enumerate(best_global_order):\n for p in best_global_parents[idx]:\n final_adjacency[p, i] = 1\n \n for i in range(d):\n for j in range(d):\n if final_adjacency[i, j] == 1:\n graph.add_directed_edge(nodes[i], nodes[j])\n \n # 6. Convert to CPDAG\n cpdag = dag2cpdag(graph)\n return cpdag\n# =====================================================================",
6 "filename": "causal-learn/bench/custom_algorithm.py"
7}
Tool Result — Step 2
ERROR: Lines 1..15 exceed the editable range. You may only edit lines 3–14.
[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–14 | total: 15 lines]
1| import numpy as np
2| from causallearn.graph.GeneralGraph import GeneralGraph
3 from causallearn.graph.GraphNode import GraphNode
4
5 # =====================================================================
6 # EDITABLE: implement run_causal_discovery below
7 # =====================================================================
8 def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
9 """
10 Input: X of shape (n_samples, n_variables)
11 Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph
12 """
13 nodes = [GraphNode(f"X{i + 1}") for i in range(X.shape[1])]
14 return GeneralGraph(nodes)
15| # =====================================================================
Assistant — Step 3 (edit)
Tool Input
1{
2 "start_line": 3,
3 "op": "replace",
4 "content": "from causallearn.graph.GraphNode import GraphNode\nfrom causallearn.utils.DAG2CPDAG import dag2cpdag\nfrom causallearn.score.LocalScoreFunction import local_score_BIC_from_cov\nfrom causallearn.utils.GESUtils import Combinatorial, score_g\nimport random\n\n# =====================================================================\n# EDITABLE: implement run_causal_discovery below\n# =====================================================================\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n \"\"\"\n Input: X of shape (n_samples, n_variables)\n Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph\n \"\"\"\n n, d = X.shape\n nodes = [GraphNode(f\"X{i + 1}\") for i in range(d)]\n \n # 1. Compute empirical covariance and precision-based penalty matrix\n X_centered = X - np.mean(X, axis=0)\n cov = X_centered.T @ X_centered / n\n inv_cov = np.linalg.pinv(cov)\n \n R = np.zeros((d, d))\n for i in range(d):\n for j in range(d):\n R[i, j] = np.abs(inv_cov[i, j]) / (np.sqrt(inv_cov[i, i] * inv_cov[j, j]) + 1e-10)\n \n # Soft constraint penalty matrix\n # Relaxes BIC for strong partial correlations, penalizes hard for weak ones\n penalty_mat = np.log(n) * 2.0 * np.exp(-10.0 * R)\n \n # 2. Local score function\n def calc_score(i, P):\n if len(P) == 0:\n v = cov[i, i]\n if v < 1e-10: v = 1e-10\n return n * np.log(v)\n \n cov_P = cov[np.ix_(P, P)]\n cov_iP = cov[i, P]\n inv_cov_P = np.linalg.pinv(cov_P)\n v = cov[i, i] - cov_iP @ inv_cov_P @ cov_iP\n if v < 1e-10: v = 1e-10\n \n # Modified BIC: 0.5 * log(n) base penalty + precision penalty\n base_pen = len(P) * np.log(n) * 0.5\n prec_pen = np.sum(penalty_mat[i, P])\n \n return n * np.log(v) + base_pen + prec_pen\n \n # 3. 
Cache and Greedy Parent Search\n cache = {}\n def get_best_parents(i, predecessors):\n fs = frozenset(predecessors)\n if (i, fs) in cache:\n return cache[(i, fs)]\n \n current_P = []\n best_score = calc_score(i, current_P)\n \n while len(current_P) < min(15, len(predecessors)):\n best_next_p = None\n best_next_score = best_score\n \n for p in predecessors:\n if p not in current_P:\n cand_P = current_P + [p]\n score = calc_score(i, cand_P)\n if score < best_next_score - 1e-5:\n best_next_score = score\n best_next_p = p\n \n if best_next_p is not None:\n current_P.append(best_next_p)\n best_score = best_next_score\n else:\n break\n \n # Backward elimination\n while len(current_P) > 0:\n best_rem_p = None\n best_rem_score = best_score\n for p in current_P:\n cand_P = [x for x in current_P if x != p]\n score = calc_score(i, cand_P)\n if score < best_rem_score - 1e-5:\n best_rem_score = score\n best_rem_p = p\n if best_rem_p is not None:\n current_P.remove(best_rem_p)\n best_score = best_rem_score\n else:\n break\n \n res = (current_P, best_score)\n cache[(i, fs)] = res\n return res\n\n def score_order(order):\n total_score = 0\n dag_parents = []\n for idx, i in enumerate(order):\n predecessors = order[:idx]\n P, p_score = get_best_parents(i, predecessors)\n total_score += p_score\n dag_parents.append(P)\n return total_score, dag_parents\n\n best_global_score = float('inf')\n best_global_order = None\n best_global_parents = None\n\n num_restarts = 4\n num_steps = 400\n \n for restart in range(num_restarts):\n if restart == 0:\n current_order = np.argsort(np.diag(cov)).tolist()\n else:\n current_order = list(range(d))\n random.shuffle(current_order)\n \n current_score, current_parents = score_order(current_order)\n \n for step in range(num_steps):\n new_order = current_order.copy()\n if random.random() < 0.5:\n # swap\n idx1, idx2 = random.sample(range(d), 2)\n new_order[idx1], new_order[idx2] = new_order[idx2], new_order[idx1]\n else:\n # shift\n idx1, idx2 = 
random.sample(range(d), 2)\n val = new_order.pop(idx1)\n new_order.insert(idx2, val)\n \n new_score, new_parents = score_order(new_order)\n \n # Simulated Annealing acceptance\n temp = 10.0 * (1.0 - step / num_steps)\n if temp < 1e-3: temp = 1e-3\n \n diff = current_score - new_score\n if diff > 0 or random.random() < np.exp(diff / temp):\n current_score = new_score\n current_order = new_order\n current_parents = new_parents\n \n if current_score < best_global_score:\n best_global_score = current_score\n best_global_order = current_order\n best_global_parents = current_parents\n\n # 5. Build Final DAG\n graph = GeneralGraph(nodes)\n final_adjacency = np.zeros((d, d))\n for idx, i in enumerate(best_global_order):\n for p in best_global_parents[idx]:\n final_adjacency[p, i] = 1\n \n for i in range(d):\n for j in range(d):\n if final_adjacency[i, j] == 1:\n graph.add_directed_edge(nodes[i], nodes[j])\n \n # 6. Convert to CPDAG\n cpdag = dag2cpdag(graph)\n return cpdag",
5 "end_line": 14,
6 "filename": "causal-learn/bench/custom_algorithm.py"
7}
Tool Result — Step 3
OK: Replaced lines 3..14 with 168 line(s) in causal-learn/bench/custom_algorithm.py. Editable range: 3–170.
[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–170 | total: 171 lines]
1| import numpy as np
2| from causallearn.graph.GeneralGraph import GeneralGraph
3 from causallearn.graph.GraphNode import GraphNode
4 from causallearn.utils.DAG2CPDAG import dag2cpdag
5 from causallearn.score.LocalScoreFunction import local_score_BIC_from_cov
6 from causallearn.utils.GESUtils import Combinatorial, score_g
7 import random
8
9 # =====================================================================
10 # EDITABLE: implement run_causal_discovery below
11 # =====================================================================
12 def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
13 """
14 Input: X of shape (n_samples, n_variables)
15 Output: estimated CPDAG as causallearn.grap
Assistant — Step 4 (test)
Tool Result — Step 4
[Test #1] (2 tests remaining; call submit(n=1) to submit this result) ### ER10 (scripts/eval_er10.sh) Results (ER10): SHD=0 AdjP=1.0000 AdjR=1.0000 ArrowP=1.0000 ArrowR=1.0000 ### ER20 (scripts/eval_er20.sh) Results (ER20): SHD=5 AdjP=0.9302 AdjR=0.9756 ArrowP=0.8919 ArrowR=0.9706 ### SF50 (scripts/eval_sf50.sh) Results (SF50): SHD=17 AdjP=0.9500 AdjR=0.9896 ArrowP=0.9688 ArrowR=0.8378 ### ER10-Hard (scripts/eval_er10_hard.sh) Results (ER10-Hard): SHD=18 AdjP=0.9200 AdjR=0.8846 ArrowP=0.4167 ArrowR=0.3846 ### ER20-Hard (scripts/eval_er20_hard.sh) Results (ER20-Hard): SHD=6 AdjP=0.9571 AdjR=0.9710 ArrowP=0.9420 ArrowR=0.9559 ### SF50-Hard (scripts/eval_sf50_hard.sh) Results (SF50-Hard): SHD=48 AdjP=0.8509 AdjR=0.9716 ArrowP=0.7931 ArrowR=0.8333 ### ER10-Noisy (scripts/eval_er10_noisy.sh) Results (ER10-Noisy): SHD=17 AdjP=0.9167 AdjR=0.8462 ArrowP=0.4783 ArrowR=0.4231 ### ER20-Noisy (scripts/eval_er20_noisy.sh) Results (ER20-Noisy): SHD=9 AdjP=0.929
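The order-based search used in these attempts decomposes into scoring candidate topological orders: each node greedily picks its best-scoring parent set from among its predecessors in the order, and the order's score is the sum of the node scores. A stripped-down sketch of that decomposition with a pluggable local score (pure Python; names are illustrative):

```python
# Order-based scoring: given local_score(i, parents) with lower = better,
# the score of a topological order is the sum over nodes of the best
# parent set drawn from each node's predecessors. Greedy forward
# selection keeps the per-node search tractable.

def best_parents(i, predecessors, local_score):
    parents, score = [], local_score(i, [])
    improved = True
    while improved:
        improved = False
        for p in predecessors:
            if p in parents:
                continue
            s = local_score(i, parents + [p])
            if s < score - 1e-9:  # require a strict improvement
                parents, score, improved = parents + [p], s, True
    return parents, score

def score_order(order, local_score):
    total, dag = 0.0, {}
    for pos, i in enumerate(order):
        parents, s = best_parents(i, order[:pos], local_score)
        total += s
        dag[i] = parents
    return total, dag

# Toy score: node 2 is "explained" by parent 1 (score drops), else flat.
def toy_score(i, parents):
    return -5.0 if (i == 2 and 1 in parents) else 0.0

total, dag = score_order([0, 1, 2], toy_score)
print(total, dag[2])  # -5.0 [1]
```

The local moves explored above (adjacent swaps, shifts, simulated-annealing acceptance) all just perturb the order and re-evaluate this sum, caching per-node results so only the nodes whose predecessor sets changed need rescoring.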
Assistant — Step 5 (edit)
Tool Input
1{
2 "op": "replace",
3 "end_line": 170,
4 "content": "import numpy as np\nimport random\nfrom causallearn.graph.GeneralGraph import GeneralGraph\nfrom causallearn.graph.GraphNode import GraphNode\nfrom causallearn.utils.DAG2CPDAG import dag2cpdag\n\n# =====================================================================\n# EDITABLE: implement run_causal_discovery below\n# =====================================================================\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n \"\"\"\n Input: X of shape (n_samples, n_variables)\n Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph\n \"\"\"\n n, d = X.shape\n nodes = [GraphNode(f\"X{i + 1}\") for i in range(d)]\n \n # 1. Compute empirical covariance and precision-based penalty matrix\n X_centered = X - np.mean(X, axis=0)\n cov = X_centered.T @ X_centered / n\n inv_cov = np.linalg.pinv(cov)\n \n R = np.zeros((d, d))\n for i in range(d):\n for j in range(d):\n R[i, j] = np.abs(inv_cov[i, j]) / (np.sqrt(inv_cov[i, i] * inv_cov[j, j]) + 1e-10)\n \n # Soft constraint penalty matrix\n # Relaxes BIC for strong partial correlations, penalizes hard for weak ones\n penalty_mat = np.log(n) * 2.0 * np.exp(-10.0 * R)\n ev_scale = max(1e-5, np.median(1.0 / np.clip(np.diag(inv_cov), 1e-10, None)))\n \n # Precompute subset score to avoid redundant matrix inversions\n _score_cache = {}\n def calc_score(i, P_tuple):\n if (i, P_tuple) in _score_cache:\n return _score_cache[(i, P_tuple)]\n \n if len(P_tuple) == 0:\n v = cov[i, i]\n if v < 1e-10: v = 1e-10\n base_pen = 0.0\n s = n * np.log(v) + n * 0.4 * (v / ev_scale) + base_pen\n _score_cache[(i, P_tuple)] = s\n return s\n \n P_list = list(P_tuple)\n cov_P = cov[np.ix_(P_list, P_list)]\n cov_iP = cov[i, P_list]\n try:\n inv_cov_P = np.linalg.pinv(cov_P)\n v = cov[i, i] - cov_iP @ inv_cov_P @ cov_iP\n except:\n v = cov[i, i]\n \n if v < 1e-10: v = 1e-10\n \n # Modified BIC: 0.5 * log(n) base penalty + precision penalty + EV assumption penalty\n # EV penalty uses n * 0.4 * (v / 
ev_scale) to break score equivalence and enforce exact fully directed DAG\n base_pen = len(P_tuple) * np.log(n) * 0.5\n prec_pen = sum(penalty_mat[i, p] for p in P_tuple)\n \n s = n * np.log(v) + n * 0.4 * (v / ev_scale) + base_pen + prec_pen\n _score_cache[(i, P_tuple)] = s\n return s\n \n # 3. Cache and Greedy Parent Search\n cache = {}\n def get_best_parents(i, predecessors):\n fs = frozenset(predecessors)\n if (i, fs) in cache:\n return cache[(i, fs)]\n \n current_P = []\n best_score = calc_score(i, tuple(sorted(current_P)))\n \n # Forward greedy\n while len(current_P) < min(15, len(predecessors)):\n best_next_p = None\n best_next_score = best_score\n \n for p in predecessors:\n if p not in current_P:\n cand_P = current_P + [p]\n score = calc_score(i, tuple(sorted(cand_P)))\n if score < best_next_score - 1e-5:\n best_next_score = score\n best_next_p = p\n \n if best_next_p is not None:\n current_P.append(best_next_p)\n best_score = best_next_score\n else:\n break\n \n # Backward elimination\n while len(current_P) > 0:\n best_rem_p = None\n best_rem_score = best_score\n for p in current_P:\n cand_P = [x for x in current_P if x != p]\n score = calc_score(i, tuple(sorted(cand_P)))\n if score < best_rem_score - 1e-5:\n best_rem_score = score\n best_rem_p = p\n if best_rem_p is not None:\n current_P.remove(best_rem_p)\n best_score = best_rem_score\n else:\n break\n \n res = (current_P, best_score)\n cache[(i, fs)] = res\n return res\n\n def get_node_score(i, preds_set):\n return get_best_parents(i, preds_set)[1]\n\n # Initialize order using marginal variances (exact for homoscedastic DAG)\n marginal_vars = np.diag(cov)\n current_order = np.argsort(marginal_vars).tolist()\n \n # Store dynamic predecessors\n preds_of = [set() for _ in range(d)]\n for idx, i in enumerate(current_order):\n preds_of[i] = set(current_order[:idx])\n \n current_scores = [get_node_score(i, preds_of[i]) for i in range(d)]\n current_total_score = sum(current_scores)\n \n # Hill Climbing over 
Orderings\n best_global_score = current_total_score\n best_global_order = current_order.copy()\n \n no_improvement = 0\n max_steps = 1500\n \n for step in range(max_steps):\n improved = False\n # Try adjacent swaps\n best_swap_diff = 1e-5\n best_swap_idx = None\n \n for k in range(d - 1):\n i = current_order[k]\n j = current_order[k+1]\n \n # If we swap i and j, i will have j in its predecessors (was not).\n # j will NOT have i in its predecessors (was).\n new_preds_i = preds_of[i] | {j}\n new_preds_j = preds_of[j] - {i}\n \n score_i_new = get_node_score(i, new_preds_i)\n score_j_new = get_node_score(j, new_preds_j)\n \n diff = (current_scores[i] + current_scores[j]) - (score_i_new + score_j_new)\n if diff > best_swap_diff:\n best_swap_diff = diff\n best_swap_idx = k\n \n if best_swap_idx is not None:\n k = best_swap_idx\n i = current_order[k]\n j = current_order[k+1]\n \n current_order[k], current_order[k+1] = j, i\n preds_of[i].add(j)\n preds_of[j].remove(i)\n \n current_scores[i] = get_node_score(i, preds_of[i])\n current_scores[j] = get_node_score(j, preds_of[j])\n current_total_score -= best_swap_diff\n improved = True\n else:\n # Shift moves\n best_shift_diff = 1e-5\n best_shift_move = None # (from_idx, to_idx)\n \n # Sub-sample shift combinations to save time\n shift_cands = random.sample(range(d), min(15, d))\n for f in shift_cands:\n for t in range(max(0, f-4), min(d, f+5)):\n if f == t: continue\n \n # Evaluate shift\n temp_order = current_order.copy()\n val = temp_order.pop(f)\n temp_order.insert(t, val)\n \n diff = 0\n min_idx = min(f, t)\n max_idx = max(f, t)\n \n temp_preds_set = set(temp_order[:min_idx])\n valid = True\n for k in range(min_idx, max_idx + 1):\n u = temp_order[k]\n sc = get_node_score(u, temp_preds_set)\n diff += current_scores[u] - sc\n temp_preds_set.add(u)\n \n if diff > best_shift_diff:\n best_shift_diff = diff\n best_shift_move = (f, t)\n \n if best_shift_move is not None:\n f, t = best_shift_move\n val = current_order.pop(f)\n 
current_order.insert(t, val)\n \n # Update preds and scores\n running_preds = set()\n for k in range(d):\n u = current_order[k]\n preds_of[u] = running_preds.copy()\n current_scores[u] = get_node_score(u, preds_of[u])\n running_preds.add(u)\n \n current_total_score = sum(current_scores)\n improved = True\n\n if current_total_score < best_global_score:\n best_global_score = current_total_score\n best_global_order = current_order.copy()\n no_improvement = 0\n else:\n if not improved:\n # Add some randomness (restart from a swapped order)\n idx1, idx2 = random.sample(range(d), 2)\n val = current_order.pop(idx1)\n current_order.insert(idx2, val)\n \n running_preds = set()\n for k in range(d):\n u = current_order[k]\n preds_of[u] = running_preds.copy()\n current_scores[u] = get_node_score(u, preds_of[u])\n running_preds.add(u)\n current_total_score = sum(current_scores)\n no_improvement += 1\n else:\n no_improvement += 1\n \n if no_improvement > 15:\n break\n\n # 5. Build Final DAG from best order\n graph = GeneralGraph(nodes)\n final_adjacency = np.zeros((d, d))\n \n current_preds = []\n for i in best_global_order:\n P, _ = get_best_parents(i, current_preds)\n for p in P:\n final_adjacency[p, i] = 1\n current_preds.append(i)\n \n for i in range(d):\n for j in range(d):\n if final_adjacency[i, j] == 1:\n graph.add_directed_edge(nodes[i], nodes[j])\n \n # 6. Convert to CPDAG\n cpdag = dag2cpdag(graph)\n return cpdag\n# =====================================================================",
5 "start_line": 12,
6 "filename": "causal-learn/bench/custom_algorithm.py"
7}
Tool Result (Step 5)
OK: Replaced lines 12..170 with 266 line(s) in causal-learn/bench/custom_algorithm.py. Editable range: 3–277.
[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–277 | total: 278 lines]
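The step-5 edit above builds its soft-constraint penalty from normalized precision-matrix entries, which are (up to sign) the partial correlations between each pair of variables given all the others. A minimal standalone sketch of that computation, with `alpha` and `decay` as stand-ins for the transcript's hard-coded `2.0` and `10.0` constants:

```python
import numpy as np

def partial_correlation_penalty(X, alpha=2.0, decay=10.0):
    """Penalty matrix that is near zero for strongly partially
    correlated pairs and near alpha*log(n) for pairs that look
    conditionally independent given all other variables."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n
    theta = np.linalg.pinv(cov)  # precision matrix
    # |theta_ij| / sqrt(theta_ii * theta_jj) = |partial corr(i, j | rest)|
    denom = np.sqrt(np.outer(np.diag(theta), np.diag(theta))) + 1e-10
    R = np.abs(theta) / denom
    return np.log(n) * alpha * np.exp(-decay * R)
```

The exponential decay makes the penalty nearly vanish for edges backed by strong partial correlation, while weak partial correlations pay close to the full surcharge on top of the BIC term.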
1| import numpy as np
2| from causallearn.graph.GeneralGraph import GeneralGraph
3 from causallearn.graph.GraphNode import GraphNode
4 from causallearn.utils.DAG2CPDAG import dag2cpdag
5 from causallearn.score.LocalScoreFunction import local_score_BIC_from_cov
6 from causallearn.utils.GESUtils import Combinatorial, score_g
7 import random
8
9 # =====================================================================
10 # EDITABLE: implement run_causal_discovery below
11 # =====================================================================
12 import numpy as np
13 import random
14 from causallearn.graph.GeneralGraph import GeneralGraph
15 from causallearn.graph.GraphNode import GraphNode
16 from causal
Assistant (Step 6)
test
Tool Result (Step 6)
[Test #2] (1 test remaining; call submit(n=2) to submit this result)

### ER10 (scripts/eval_er10.sh)
Results (ER10): SHD=0 AdjP=1.0000 AdjR=1.0000 ArrowP=1.0000 ArrowR=1.0000

### ER20 (scripts/eval_er20.sh)
Results (ER20): SHD=8 AdjP=0.8542 AdjR=1.0000 ArrowP=0.8293 ArrowR=1.0000

### SF50 (scripts/eval_sf50.sh)
Results (SF50): SHD=86 AdjP=0.5963 AdjR=1.0000 ArrowP=0.4275 ArrowR=0.7973

### ER10-Hard (scripts/eval_er10_hard.sh)
Results (ER10-Hard): SHD=4 AdjP=0.8667 AdjR=1.0000 ArrowP=0.8667 ArrowR=1.0000

### ER20-Hard (scripts/eval_er20_hard.sh)
Results (ER20-Hard): SHD=22 AdjP=0.7882 AdjR=0.9710 ArrowP=0.7647 ArrowR=0.9559

### SF50-Hard (scripts/eval_sf50_hard.sh)
Results (SF50-Hard): SHD=171 AdjP=0.4927 AdjR=0.9574 ArrowP=0.3911 ArrowR=0.7681

### ER10-Noisy (scripts/eval_er10_noisy.sh)
Results (ER10-Noisy): SHD=4 AdjP=0.8667 AdjR=1.0000 ArrowP=0.8667 ArrowR=1.0000

### ER20-Noisy (scripts/eval_er20_noisy.sh)
Results (ER20-Noisy): SHD=18 AdjP=0.827
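Both versions of the score function in the transcript rate a candidate parent set by the residual variance of node i after regressing on its parents, read directly off the covariance matrix as a Schur complement, plus a BIC-style complexity term. A stripped-down sketch of just that core, omitting the transcript's extra precision and equal-variance penalties:

```python
import numpy as np

def local_bic(cov, n, i, parents, penalty_mult=0.5):
    """BIC-style local score for X_i given a parent set; lower is better.
    The residual variance is the Schur complement of the parent block."""
    if parents:
        P = list(parents)
        cov_PP = cov[np.ix_(P, P)]
        cov_iP = cov[i, P]
        # Var(X_i | X_P) = cov_ii - cov_iP @ cov_PP^{-1} @ cov_Pi
        v = cov[i, i] - cov_iP @ np.linalg.pinv(cov_PP) @ cov_iP
    else:
        v = cov[i, i]
    v = max(v, 1e-10)  # guard against numerically negative variances
    return n * np.log(v) + len(parents) * np.log(n) * penalty_mult
```

Because this depends on the data only through `cov` and `n`, it can be cached per `(i, parent_tuple)` exactly as the transcript's `_score_cache` does.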
Assistant (Step 7)
edit
Tool Input
1{
2 "op": "replace",
3 "filename": "causal-learn/bench/custom_algorithm.py",
4 "content": "import numpy as np\nimport random\nfrom causallearn.graph.GeneralGraph import GeneralGraph\nfrom causallearn.graph.GraphNode import GraphNode\nfrom causallearn.utils.DAG2CPDAG import dag2cpdag\n\n# =====================================================================\n# EDITABLE: implement run_causal_discovery below\n# =====================================================================\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n \"\"\"\n Input: X of shape (n_samples, n_variables)\n Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph\n \"\"\"\n n, d = X.shape\n nodes = [GraphNode(f\"X{i + 1}\") for i in range(d)]\n \n # 1. Compute empirical covariance and precision-based penalty matrix\n X_centered = X - np.mean(X, axis=0)\n cov = X_centered.T @ X_centered / n\n inv_cov = np.linalg.pinv(cov)\n \n R = np.zeros((d, d))\n for i in range(d):\n for j in range(d):\n R[i, j] = np.abs(inv_cov[i, j]) / (np.sqrt(inv_cov[i, i] * inv_cov[j, j]) + 1e-10)\n \n # Soft constraint penalty matrix\n penalty_mat = np.log(n) * 1.5 * np.exp(-12.0 * R)\n ev_scale = max(1e-5, np.median(1.0 / np.clip(np.diag(inv_cov), 1e-10, None)))\n \n _score_cache = {}\n def calc_score_for_parents(i, P_tuple):\n if (i, P_tuple) in _score_cache:\n return _score_cache[(i, P_tuple)]\n \n if len(P_tuple) == 0:\n v = cov[i, i]\n if v < 1e-10: v = 1e-10\n c_score = n * np.log(v)\n _score_cache[(i, P_tuple)] = (c_score, v)\n return c_score, v\n \n P_list = list(P_tuple)\n cov_P = cov[np.ix_(P_list, P_list)]\n cov_iP = cov[i, P_list]\n try:\n inv_cov_P = np.linalg.pinv(cov_P)\n v = cov[i, i] - cov_iP @ inv_cov_P @ cov_iP\n except:\n v = cov[i, i]\n \n if v < 1e-10: v = 1e-10\n \n # Standard BIC + precision penalty for parent selection\n base_pen = len(P_tuple) * np.log(n) * 0.55\n prec_pen = sum(penalty_mat[i, p] for p in P_tuple)\n \n c_score = n * np.log(v) + base_pen + prec_pen\n _score_cache[(i, P_tuple)] = (c_score, v)\n return c_score, v\n \n cache = {}\n 
def get_best_parents(i, predecessors):\n fs = frozenset(predecessors)\n if (i, fs) in cache:\n return cache[(i, fs)]\n \n current_P = []\n best_score, best_v = calc_score_for_parents(i, tuple(sorted(current_P)))\n \n # Forward greedy\n while len(current_P) < min(15, len(predecessors)):\n best_next_p = None\n best_next_score = best_score\n \n for p in predecessors:\n if p not in current_P:\n cand_P = current_P + [p]\n score, _ = calc_score_for_parents(i, tuple(sorted(cand_P)))\n if score < best_next_score - 1e-5:\n best_next_score = score\n best_next_p = p\n \n if best_next_p is not None:\n current_P.append(best_next_p)\n best_score = best_next_score\n else:\n break\n \n # Backward elimination\n while len(current_P) > 0:\n best_rem_p = None\n best_rem_score = best_score\n for p in current_P:\n cand_P = [x for x in current_P if x != p]\n score, _ = calc_score_for_parents(i, tuple(sorted(cand_P)))\n if score < best_rem_score - 1e-5:\n best_rem_score = score\n best_rem_p = p\n if best_rem_p is not None:\n current_P.remove(best_rem_p)\n best_score = best_rem_score\n else:\n break\n \n final_P_tuple = tuple(sorted(current_P))\n _, final_v = calc_score_for_parents(i, final_P_tuple)\n \n # Break score equivalence using the homoscedastic EV sum of variances\n order_score = best_score + n * 0.35 * (final_v / ev_scale)\n \n res = (current_P, order_score)\n cache[(i, fs)] = res\n return res\n\n def get_node_score(i, preds_set):\n return get_best_parents(i, preds_set)[1]\n\n marginal_vars = np.diag(cov)\n current_order = np.argsort(marginal_vars).tolist()\n \n preds_of = [set() for _ in range(d)]\n for idx, i in enumerate(current_order):\n preds_of[i] = set(current_order[:idx])\n \n current_scores = [get_node_score(i, preds_of[i]) for i in range(d)]\n current_total_score = sum(current_scores)\n \n best_global_score = current_total_score\n best_global_order = current_order.copy()\n \n no_improvement = 0\n max_steps = 1500\n \n for step in range(max_steps):\n improved = False\n 
best_swap_diff = 1e-5\n best_swap_idx = None\n \n for k in range(d - 1):\n i = current_order[k]\n j = current_order[k+1]\n \n new_preds_i = preds_of[i] | {j}\n new_preds_j = preds_of[i].copy()\n \n score_i_new = get_node_score(i, new_preds_i)\n score_j_new = get_node_score(j, new_preds_j)\n \n diff = (current_scores[i] + current_scores[j]) - (score_i_new + score_j_new)\n if diff > best_swap_diff:\n best_swap_diff = diff\n best_swap_idx = k\n \n if best_swap_idx is not None:\n k = best_swap_idx\n i = current_order[k]\n j = current_order[k+1]\n \n current_order[k], current_order[k+1] = j, i\n preds_of[i].add(j)\n preds_of[j].remove(i)\n \n current_scores[i] = get_node_score(i, preds_of[i])\n current_scores[j] = get_node_score(j, preds_of[j])\n current_total_score -= best_swap_diff\n improved = True\n else:\n best_shift_diff = 1e-5\n best_shift_move = None\n \n shift_cands = random.sample(range(d), min(20, d))\n for f in shift_cands:\n for t in range(max(0, f-5), min(d, f+6)):\n if f == t: continue\n \n temp_order = current_order.copy()\n val = temp_order.pop(f)\n temp_order.insert(t, val)\n \n diff = 0\n min_idx = min(f, t)\n max_idx = max(f, t)\n \n temp_preds_set = set(temp_order[:min_idx])\n valid = True\n for k in range(min_idx, max_idx + 1):\n u = temp_order[k]\n sc = get_node_score(u, temp_preds_set)\n diff += current_scores[u] - sc\n temp_preds_set.add(u)\n \n if diff > best_shift_diff:\n best_shift_diff = diff\n best_shift_move = (f, t)\n \n if best_shift_move is not None:\n f, t = best_shift_move\n val = current_order.pop(f)\n current_order.insert(t, val)\n \n running_preds = set()\n for k in range(d):\n u = current_order[k]\n preds_of[u] = running_preds.copy()\n current_scores[u] = get_node_score(u, preds_of[u])\n running_preds.add(u)\n \n current_total_score = sum(current_scores)\n improved = True\n\n if current_total_score < best_global_score - 1e-5:\n best_global_score = current_total_score\n best_global_order = current_order.copy()\n no_improvement = 
0\n else:\n if not improved:\n idx1, idx2 = random.sample(range(d), 2)\n val = current_order.pop(idx1)\n current_order.insert(idx2, val)\n \n running_preds = set()\n for k in range(d):\n u = current_order[k]\n preds_of[u] = running_preds.copy()\n current_scores[u] = get_node_score(u, preds_of[u])\n running_preds.add(u)\n current_total_score = sum(current_scores)\n no_improvement += 1\n else:\n no_improvement += 1\n \n if no_improvement > 15:\n break\n\n graph = GeneralGraph(nodes)\n final_adjacency = np.zeros((d, d))\n \n current_preds = []\n for i in best_global_order:\n P, _ = get_best_parents(i, current_preds)\n for p in P:\n final_adjacency[p, i] = 1\n current_preds.append(i)\n \n for i in range(d):\n for j in range(d):\n if final_adjacency[i, j] == 1:\n graph.add_directed_edge(nodes[i], nodes[j])\n \n cpdag = dag2cpdag(graph)\n return cpdag\n# =====================================================================",
5 "end_line": 278,
6 "start_line": 12
7}
Tool Result (Step 7)
ERROR: Lines 12..278 exceed the editable range. You may only edit lines 3–277.
[Current file: causal-learn/bench/custom_algorithm.py | editable: 3–277 | total: 278 lines]
1| import numpy as np
2| from causallearn.graph.GeneralGraph import GeneralGraph
3 from causallearn.graph.GraphNode import GraphNode
4 from causallearn.utils.DAG2CPDAG import dag2cpdag
5 from causallearn.score.LocalScoreFunction import local_score_BIC_from_cov
6 from causallearn.utils.GESUtils import Combinatorial, score_g
7 import random
8
9 # =====================================================================
10 # EDITABLE: implement run_causal_discovery below
11 # =====================================================================
12 import numpy as np
13 import random
14 from causallearn.graph.GeneralGraph import GeneralGraph
15 from causallearn.graph.GraphNode import GraphNode
16 from causallearn.utils.DAG2CPDAG import da
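The hill climb over orderings in both edits exploits a locality property: swapping two adjacent nodes in the order changes only those two nodes' predecessor sets, so only two local scores need recomputing to evaluate the move. A toy sketch of that delta evaluation, with `node_score` a hypothetical stand-in for the transcript's penalized BIC:

```python
def adjacent_swap_delta(order, k, preds_of, scores, node_score):
    """Score change from swapping order[k] and order[k+1].
    Node i gains j as a candidate parent, j loses i; every other
    node's predecessor set is unchanged. A positive delta means the
    swap lowers (improves) the total score."""
    i, j = order[k], order[k + 1]
    new_i = node_score(i, preds_of[i] | {j})
    new_j = node_score(j, preds_of[j] - {i})
    return (scores[i] + scores[j]) - (new_i + new_j)
```

Shift moves, by contrast, perturb every position between the source and target indices, which is why the transcript rescans and rescores that whole sub-range after applying one.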