Agent Conversation: google/gemini-3.1-pro-preview
Task: causal-discovery-discrete
Causal Discovery on Real-World Bayesian Network Datasets (bnlearn)
Research Question
Design a causal discovery algorithm that recovers the CPDAG (Completed Partially Directed Acyclic Graph) from purely observational discrete data sampled from real-world Bayesian networks in the bnlearn repository.
Background
The bnlearn repository (https://www.bnlearn.com/bnrepository/) hosts a collection of well-known Bayesian network benchmarks from diverse domains (medicine, biology, meteorology, insurance, agriculture, IT). Each network has a known ground-truth DAG with discrete variables and conditional probability tables. Given observational samples from these networks, the task is to recover the causal structure.
Under the faithfulness assumption, observational data can identify the Markov Equivalence Class (MEC) of the true DAG, represented by a CPDAG. The challenge lies in handling discrete data with varying cardinalities, network sizes (5–76 nodes), and edge densities.
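As a concrete illustration of what "identifiable up to the MEC" means: from observational data alone, the only edge orientations every DAG in the equivalence class shares are the arrows into unshielded colliders (v-structures). A minimal pure-Python sketch extracting those from a known DAG (simplified: a full `dag2cpdag` would additionally propagate orientations with Meek's rules):

```python
def v_structure_arrows(parents):
    """Return the directed edges (p, c) shared by every DAG in the
    Markov equivalence class: arrows into unshielded colliders
    a -> c <- b where a and b are not adjacent."""
    adjacent = set()
    for c, ps in parents.items():
        for p in ps:
            adjacent.add(frozenset((p, c)))
    compelled = set()
    for c, ps in parents.items():
        for a in ps:
            for b in ps:
                if a < b and frozenset((a, b)) not in adjacent:
                    compelled.update({(a, c), (b, c)})
    return compelled

# Collider 1 -> 3 <- 2: both arrows are identifiable.
print(v_structure_arrows({1: set(), 2: set(), 3: {1, 2}}))
# Chain 1 -> 2 -> 3: no v-structure, so the CPDAG leaves both edges undirected.
print(v_structure_arrows({1: set(), 2: {1}, 3: {2}}))
```

This is why, for a chain-shaped network, a perfect algorithm still reports undirected edges: the directions are simply not identifiable from observational data.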
Task
Implement a causal discovery algorithm in bench/custom_algorithm.py. Your run_causal_discovery(X) function receives integer-encoded discrete observational data and must return the estimated CPDAG as a causallearn.graph.GeneralGraph.GeneralGraph object.
Interface
```python
def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
    """
    Input: X of shape (n_samples, n_variables), integer-encoded discrete data
    Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph
    """
```
Evaluation Scenarios
Small Networks (<20 nodes)
| Label | Network | Nodes | Edges | Samples | Domain |
|---|---|---|---|---|---|
| Cancer | Cancer | 5 | 4 | 500 | Medical |
| Earthquake | Earthquake | 5 | 4 | 500 | Seismology |
| Survey | Survey | 6 | 6 | 500 | Social science |
| Asia | Asia | 8 | 8 | 1000 | Medical (lung diseases) |
| Sachs | Sachs | 11 | 17 | 1000 | Biology (protein signaling) |
Medium Networks (20–50 nodes)
| Label | Network | Nodes | Edges | Samples | Domain |
|---|---|---|---|---|---|
| Child | Child | 20 | 25 | 2000 | Medical |
| Insurance | Insurance | 27 | 52 | 5000 | Automotive insurance |
| Water | Water | 32 | 66 | 5000 | Water treatment |
| Mildew | Mildew | 35 | 46 | 5000 | Agriculture (crop disease) |
| Alarm | Alarm | 37 | 46 | 5000 | Medical monitoring |
| Barley | Barley | 48 | 84 | 10000 | Agriculture |
Large Networks (50–100 nodes)
| Label | Network | Nodes | Edges | Samples | Domain |
|---|---|---|---|---|---|
| Hailfinder | Hailfinder | 56 | 66 | 10000 | Meteorology |
| Hepar2 | Hepar2 | 70 | 123 | 10000 | Medical (liver disorders) |
| Win95pts | Win95pts | 76 | 112 | 10000 | IT (Windows troubleshooting) |
Metrics
Metrics are computed between estimated CPDAG and ground-truth CPDAG (converted from the true DAG via dag2cpdag):
- SHD (Structural Hamming Distance): total number of edge errors (missing, extra, and misoriented edges); lower is better
- Adjacency precision/recall (AdjP, AdjR) and arrowhead precision/recall (ArrowP, ArrowR), as reported in the evaluation output
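A minimal sketch of SHD under one common convention (an assumption; the benchmark's exact definition may differ in how it weighs orientation errors): encode a CPDAG as a 0/1 matrix where `A[i][j] = A[j][i] = 1` is an undirected edge and `A[i][j] = 1` alone is a directed edge `i -> j`, and count each node pair whose edge presence or orientation disagrees.

```python
def shd(A, B):
    """Structural Hamming Distance between two CPDAG adjacency matrices.
    Each node pair whose edge presence or orientation differs counts once."""
    n = len(A)
    return sum(
        (A[i][j], A[j][i]) != (B[i][j], B[j][i])
        for i in range(n) for j in range(i + 1, n)
    )

# True CPDAG of the chain X1 - X2 - X3 (no v-structure, so fully undirected)
truth = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
# Estimate with a directed X1 -> X2 and the X2 - X3 edge missing
est = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
print(shd(truth, est))  # 2: one misoriented pair plus one missing edge
```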
Tool Input
1{
2 "filename": "bench/custom_algorithm.py",
3 "op": "replace",
4 "end_line": 14,
5 "content": "def run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n \"\"\"\n Input: X of shape (n_samples, n_variables), integer-encoded discrete data\n Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph\n \"\"\"\n from causallearn.search.ScoreBased.GES import ges\n # Print out ges arguments\n import inspect\n print(\"\\nGES signature:\", inspect.signature(ges))\n \n import causallearn.score.LocalScoreFunction as LSF\n print(\"\\nLocalScoreFunction dir:\", dir(LSF))\n\n from causallearn.utils.PCUtils import SkeletonDiscovery\n print(\"\\nSkeletonDiscovery signature:\", inspect.signature(SkeletonDiscovery))\n\n record = ges(X, score_func=\"local_score_BDeu\")\n return record['G']",
6 "start_line": 8
7}
ERROR: Package 'bench' is not in allowed packages
Tool Input
1{
2 "content": "def run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n \"\"\"\n Input: X of shape (n_samples, n_variables), integer-encoded discrete data\n Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph\n \"\"\"\n from causallearn.search.ScoreBased.GES import ges\n # Print out ges arguments\n import inspect\n print(\"\\nGES signature:\", inspect.signature(ges))\n \n import causallearn.score.LocalScoreFunction as LSF\n print(\"\\nLocalScoreFunction dir:\", dir(LSF))\n\n from causallearn.utils.PCUtils import SkeletonDiscovery\n print(\"\\nSkeletonDiscovery signature:\", inspect.signature(SkeletonDiscovery))\n\n record = ges(X, score_func=\"local_score_BDeu\")\n return record['G']",
3 "filename": "causal-bnlearn/bench/custom_algorithm.py",
4 "op": "replace",
5 "end_line": 14,
6 "start_line": 8
7}
OK: Replaced lines 8..14 with 18 line(s) in causal-bnlearn/bench/custom_algorithm.py. Editable range: 3–25.
[Current file: causal-bnlearn/bench/custom_algorithm.py | editable: 3–25 | total: 26 lines]
1| import numpy as np
2| from causallearn.graph.GeneralGraph import GeneralGraph
3| from causallearn.graph.GraphNode import GraphNode
4|
5| # =====================================================================
6| # EDITABLE: implement run_causal_discovery below
7| # =====================================================================
8| def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
9|     """
10|     Input: X of shape (n_samples, n_variables), integer-encoded discrete data
11|     Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph
12|     """
13|     from causallearn.search.ScoreBased.GES import ges
14|     # Print out ges arguments
15|     import inspect
16|     print("\nGES signa
Tool Input
1{
2 "start_line": 3,
3 "content": "from causallearn.graph.GraphNode import GraphNode\n\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n import numpy as np\n from itertools import combinations\n import random\n from causallearn.graph.Edge import Edge\n from causallearn.graph.Endpoint import Endpoint\n from causallearn.search.ConstraintBased.PC import pc\n from causallearn.utils.cit import CIT\n from causallearn.score.LocalScoreFunctionClass import LocalScoreClass\n from causallearn.score.LocalScoreFunction import local_score_BDeu\n from causallearn.utils.DAG2CPDAG import dag2cpdag\n\n N = X.shape[1]\n indep_test = CIT(X, \"chisq\")\n cg = pc(X, 0.05, indep_test, node_names=[f\"X{i+1}\" for i in range(N)])\n \n adj = cg.G.graph\n neighbors = [set() for _ in range(N)]\n for i in range(N):\n for j in range(N):\n if i != j and adj[i, j] != 0:\n neighbors[i].add(j)\n neighbors[j].add(i)\n \n score_func = LocalScoreClass(X, local_score_BDeu)\n subset_best_cache = [dict() for _ in range(N)]\n raw_score_cache = [dict() for _ in range(N)]\n \n def get_score(node, pa_list):\n pa_tuple = tuple(sorted(pa_list))\n if pa_tuple not in raw_score_cache[node]:\n raw_score_cache[node][pa_tuple] = score_func.score(node, list(pa_tuple))\n return raw_score_cache[node][pa_tuple]\n\n def get_best_score_and_parents(node, candidates_set):\n candidates_tuple = frozenset(candidates_set)\n if candidates_tuple in subset_best_cache[node]:\n return subset_best_cache[node][candidates_tuple]\n \n best_s = -float('inf')\n best_p = tuple()\n max_pa = min(len(candidates_tuple), 8) # limit max in-degree to avoid exponential blowup\n \n for k in range(max_pa + 1):\n for pa in combinations(candidates_tuple, k):\n s = get_score(node, list(pa))\n if s > best_s:\n best_s = s\n best_p = pa\n \n subset_best_cache[node][candidates_tuple] = (best_s, best_p)\n return best_s, best_p\n\n def score_order(order):\n total_score = 0.0\n predecessors = set()\n for node in order:\n candidates = neighbors[node] & predecessors\n s, 
_ = get_best_score_and_parents(node, candidates)\n total_score += s\n predecessors.add(node)\n return total_score\n\n best_overall_score = -float('inf')\n best_overall_order = None\n NUM_RESTARTS = 3\n random.seed(42)\n \n for restart_idx in range(NUM_RESTARTS):\n current_order = list(range(N))\n if restart_idx > 0:\n random.shuffle(current_order)\n \n current_score = score_order(current_order)\n improved = True\n \n while improved:\n improved = False\n best_s = current_score\n best_o = None\n \n for i in range(N):\n for j in range(N):\n if i == j: continue\n new_order = current_order.copy()\n node = new_order.pop(i)\n new_order.insert(j, node)\n \n ns = score_order(new_order)\n if ns > best_s:\n best_s = ns\n best_o = new_order\n \n if best_o is not None:\n current_score = best_s\n current_order = best_o\n improved = True\n \n if current_score > best_overall_score:\n best_overall_score = current_score\n best_overall_order = current_order\n \n dag_matrix = np.zeros((N, N), dtype=int)\n predecessors = set()\n for node in best_overall_order:\n candidates = neighbors[node] & predecessors\n _, best_p = get_best_score_and_parents(node, candidates)\n for p in best_p:\n dag_matrix[p, node] = 1\n predecessors.add(node)\n \n nodes = [GraphNode(f\"X{i + 1}\") for i in range(N)]\n G = GeneralGraph(nodes)\n for i in range(N):\n for j in range(N):\n if dag_matrix[i, j] == 1:\n G.add_edge(Edge(nodes[i], nodes[j], Endpoint.TAIL, Endpoint.ARROW))\n \n return dag2cpdag(G)",
4 "end_line": 14,
5 "filename": "bench/custom_algorithm.py",
6 "op": "replace"
7}
ERROR: Package 'bench' is not in allowed packages
Tool Input
1{
2 "start_line": 3,
3 "filename": "causal-bnlearn/bench/custom_algorithm.py",
4 "content": "from causallearn.graph.GraphNode import GraphNode\n\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n import numpy as np\n from itertools import combinations\n import random\n from causallearn.graph.Edge import Edge\n from causallearn.graph.Endpoint import Endpoint\n from causallearn.search.ConstraintBased.PC import pc\n from causallearn.utils.cit import CIT\n from causallearn.score.LocalScoreFunctionClass import LocalScoreClass\n from causallearn.score.LocalScoreFunction import local_score_BDeu\n from causallearn.utils.DAG2CPDAG import dag2cpdag\n\n N = X.shape[1]\n indep_test = CIT(X, \"chisq\")\n cg = pc(X, 0.05, indep_test, node_names=[f\"X{i+1}\" for i in range(N)])\n \n adj = cg.G.graph\n neighbors = [set() for _ in range(N)]\n for i in range(N):\n for j in range(N):\n if i != j and adj[i, j] != 0:\n neighbors[i].add(j)\n neighbors[j].add(i)\n \n score_func = LocalScoreClass(X, local_score_BDeu)\n subset_best_cache = [dict() for _ in range(N)]\n raw_score_cache = [dict() for _ in range(N)]\n \n def get_score(node, pa_list):\n pa_tuple = tuple(sorted(pa_list))\n if pa_tuple not in raw_score_cache[node]:\n raw_score_cache[node][pa_tuple] = score_func.score(node, list(pa_tuple))\n return raw_score_cache[node][pa_tuple]\n\n def get_best_score_and_parents(node, candidates_set):\n candidates_tuple = frozenset(candidates_set)\n if candidates_tuple in subset_best_cache[node]:\n return subset_best_cache[node][candidates_tuple]\n \n best_s = -float('inf')\n best_p = tuple()\n max_pa = min(len(candidates_tuple), 8)\n \n for k in range(max_pa + 1):\n for pa in combinations(candidates_tuple, k):\n s = get_score(node, list(pa))\n if s > best_s:\n best_s = s\n best_p = pa\n \n subset_best_cache[node][candidates_tuple] = (best_s, best_p)\n return best_s, best_p\n\n def score_order(order):\n total_score = 0.0\n predecessors = set()\n for node in order:\n candidates = neighbors[node] & predecessors\n s, _ = get_best_score_and_parents(node, candidates)\n 
total_score += s\n predecessors.add(node)\n return total_score\n\n best_overall_score = -float('inf')\n best_overall_order = None\n NUM_RESTARTS = 3\n random.seed(42)\n \n for restart_idx in range(NUM_RESTARTS):\n current_order = list(range(N))\n if restart_idx > 0:\n random.shuffle(current_order)\n \n current_score = score_order(current_order)\n improved = True\n \n while improved:\n improved = False\n best_s = current_score\n best_o = None\n \n for i in range(N):\n for j in range(N):\n if i == j: continue\n new_order = current_order.copy()\n node = new_order.pop(i)\n new_order.insert(j, node)\n \n ns = score_order(new_order)\n if ns > best_s:\n best_s = ns\n best_o = new_order\n \n if best_o is not None:\n current_score = best_s\n current_order = best_o\n improved = True\n \n if current_score > best_overall_score:\n best_overall_score = current_score\n best_overall_order = current_order\n \n dag_matrix = np.zeros((N, N), dtype=int)\n predecessors = set()\n for node in best_overall_order:\n candidates = neighbors[node] & predecessors\n _, best_p = get_best_score_and_parents(node, candidates)\n for p in best_p:\n dag_matrix[p, node] = 1\n predecessors.add(node)\n \n nodes = [GraphNode(f\"X{i + 1}\") for i in range(N)]\n G = GeneralGraph(nodes)\n for i in range(N):\n for j in range(N):\n if dag_matrix[i, j] == 1:\n G.add_edge(Edge(nodes[i], nodes[j], Endpoint.TAIL, Endpoint.ARROW))\n \n return dag2cpdag(G)",
5 "op": "replace",
6 "end_line": 14
7}
OK: Replaced lines 3..14 with 121 line(s) in causal-bnlearn/bench/custom_algorithm.py. Editable range: 3–123.
[Current file: causal-bnlearn/bench/custom_algorithm.py | editable: 3–123 | total: 124 lines]
1| import numpy as np
2| from causallearn.graph.GeneralGraph import GeneralGraph
3| from causallearn.graph.GraphNode import GraphNode
4|
5| def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
6|     import numpy as np
7|     from itertools import combinations
8|     import random
9|     from causallearn.graph.Edge import Edge
10|     from causallearn.graph.Endpoint import Endpoint
11|     from causallearn.search.ConstraintBased.PC import pc
12|     from causallearn.utils.cit import CIT
13|     from causallearn.score.LocalScoreFunctionClass import LocalScoreClass
14|     from causallearn.score.LocalScoreFunction import local_score_BDeu
15|     from causallearn.utils.DAG2CPDAG import dag2cpdag
16|
17|
[Test #1] (2 tests remaining; call submit(n=1) to submit this result)
### Cancer (scripts/eval_cancer.sh)
/home/bohanlyu/miniconda3/envs/mlsbench-causal-bnlearn/lib/python3.10/site-packages/pgmpy/estimators/__init__.py:4: FutureWarning: `pgmpy.estimators.StructureScore` is deprecated and will be removed in a future release. Use `pgmpy.structure_score` instead.
  from .StructureScore import (
/srv/home/bohanlyu/MLS-Bench/vendor/workspace/causal-discovery-discrete/google_gemini-3.1-pro-preview_20260402_183953/causal-bnlearn/bench/data_gen.py:45: FutureWarning: `get_example_model` is deprecated. Please use `pgmpy.example_models.load_model` instead.
  model = get_example_model(network_name)
(tqdm progress bars for nodes Pollution, Smoker, Cancer; output truncated)
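The replacement above implements an order-based search: score each topological ordering by letting every node pick its best-scoring parent set among its candidate neighbors that precede it, then hill-climb over orderings with single-node insertion moves. A toy, pure-Python sketch of that loop, where `score` is a hypothetical stand-in for the BDeu local score:

```python
from itertools import combinations

def best_parents(node, candidates, score):
    # Exhaustively pick the subset of `candidates` maximizing the local score.
    best = (score(node, frozenset()), frozenset())
    for k in range(1, len(candidates) + 1):
        for pa in combinations(sorted(candidates), k):
            s = score(node, frozenset(pa))
            if s > best[0]:
                best = (s, frozenset(pa))
    return best

def score_order(order, neighbors, score):
    # Decomposable score of an ordering: each node chooses parents
    # among its candidate neighbors that come earlier in the order.
    total, seen = 0.0, set()
    for node in order:
        total += best_parents(node, neighbors[node] & seen, score)[0]
        seen.add(node)
    return total

def hill_climb(order, neighbors, score):
    # Repeatedly apply the first improving single-node insertion move.
    cur = score_order(order, neighbors, score)
    improved = True
    while improved:
        improved = False
        for i in range(len(order)):
            for j in range(len(order)):
                if i == j:
                    continue
                cand = order.copy()
                cand.insert(j, cand.pop(i))
                s = score_order(cand, neighbors, score)
                if s > cur:
                    order, cur, improved = cand, s, True
    return order, cur

# Toy score rewarding the true parent sets of the chain 0 -> 1 -> 2.
true_parents = {0: frozenset(), 1: frozenset({0}), 2: frozenset({1})}
score = lambda node, pa: 1.0 if pa == true_parents[node] else 0.0
neighbors = {0: {1}, 1: {0, 2}, 2: {1}}
print(hill_climb([2, 1, 0], neighbors, score))  # recovers [0, 1, 2] with score 3.0
```

Caching the per-node best-parent lookups, as the real edit does, is what makes this tractable: the same (node, candidate set) pair recurs across many orderings.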
Tool Input
1{
2 "filename": "causal-bnlearn/bench/custom_algorithm.py",
3 "content": " N = X.shape[1]\n cg = pc(X, 0.05, \"chisq\", node_names=[f\"X{i+1}\" for i in range(N)])\n \n adj = cg.G.graph",
4 "end_line": 21,
5 "op": "replace",
6 "start_line": 17
7}
OK: Replaced lines 17..21 with 4 line(s) in causal-bnlearn/bench/custom_algorithm.py. Editable range: 3–122.
[Current file: causal-bnlearn/bench/custom_algorithm.py | editable: 3–122 | total: 123 lines]
1| import numpy as np
2| from causallearn.graph.GeneralGraph import GeneralGraph
3| from causallearn.graph.GraphNode import GraphNode
4|
5| def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
6|     import numpy as np
7|     from itertools import combinations
8|     import random
9|     from causallearn.graph.Edge import Edge
10|     from causallearn.graph.Endpoint import Endpoint
11|     from causallearn.search.ConstraintBased.PC import pc
12|     from causallearn.utils.cit import CIT
13|     from causallearn.score.LocalScoreFunctionClass import LocalScoreClass
14|     from causallearn.score.LocalScoreFunction import local_score_BDeu
15|     from causallearn.utils.DAG2CPDAG import dag2cpdag
16|
17|
[Test #2] (1 test remaining; call submit(n=2) to submit this result)
### Cancer (scripts/eval_cancer.sh)
Results (Cancer): SHD=4 AdjP=0.0000 AdjR=0.0000 ArrowP=0.0000 ArrowR=0.0000
### Earthquake (scripts/eval_earthquake.sh)
Results (Earthquake): SHD=4 AdjP=0.0000 AdjR=0.0000 ArrowP=0.0000 ArrowR=0.0000
### Survey (scripts/eval_survey.sh)
Results (Survey): SHD=7 AdjP=0.6667 AdjR=0.3333 ArrowP=0.0000 ArrowR=0.0000
### Asia (scripts/eval_asia.sh)
Results (Asia): SHD=8 AdjP=0.0000 AdjR=0.0000 ArrowP=0.0000 ArrowR=0.0000
### Sachs (scripts/eval_sachs.sh)
Results (Sachs): SHD=17 AdjP=0.0000 AdjR=0.0000 ArrowP=0.0000 ArrowR=0.0000
### Child (scripts/eval_child.sh)
Results (Child): SHD=25 AdjP=1.0000 AdjR=0.0400 ArrowP=0.0000 ArrowR=0.0000
### Insurance (scripts/eval_insurance.sh)
Results (Insurance): SHD=52 AdjP=0.0000 AdjR=0.0000 ArrowP=0.0000 ArrowR=0.0000
### Water (scripts/eval_water.sh)
Results (Water): SHD=67 AdjP=0.0000 AdjR=0.0000 ArrowP=0.0000
Tool Input
1{
2 "start_line": 3,
3 "op": "replace",
4 "end_line": 122,
5 "filename": "causal-bnlearn/bench/custom_algorithm.py",
6 "content": "from causallearn.graph.GraphNode import GraphNode\n\ndef run_causal_discovery(X: np.ndarray) -> GeneralGraph:\n import numpy as np\n import random\n from itertools import combinations\n from causallearn.graph.Edge import Edge\n from causallearn.graph.Endpoint import Endpoint\n from causallearn.score.LocalScoreFunctionClass import LocalScoreClass\n from causallearn.score.LocalScoreFunction import local_score_BDeu\n from causallearn.utils.DAG2CPDAG import dag2cpdag\n\n N = X.shape[1]\n \n MI_matrix = np.zeros((N, N))\n for i in range(N):\n for j in range(N):\n if i >= j: continue\n bins_i = np.arange(int(np.max(X[:, i])) + 2)\n bins_j = np.arange(int(np.max(X[:, j])) + 2)\n p_ij, _, _ = np.histogram2d(X[:, i], X[:, j], bins=[bins_i, bins_j])\n p_ij = p_ij / p_ij.sum()\n nz = p_ij > 0\n p_i = np.sum(p_ij, axis=1, keepdims=True)\n p_j = np.sum(p_ij, axis=0, keepdims=True)\n denom = (p_i * p_j)[nz]\n mi = np.sum(p_ij[nz] * np.log(p_ij[nz] / denom))\n MI_matrix[i, j] = mi\n MI_matrix[j, i] = mi\n\n K = min(N - 1, 15)\n top_neighbors = []\n for i in range(N):\n idx = np.argsort(MI_matrix[i, :])[-K:]\n s = set(idx)\n s.discard(i)\n top_neighbors.append(s)\n\n score_func = LocalScoreClass(X, local_score_BDeu)\n subset_best_cache = [dict() for _ in range(N)]\n raw_score_cache = [dict() for _ in range(N)]\n \n def get_score(node, pa_list):\n pa_tuple = tuple(sorted(pa_list))\n if pa_tuple not in raw_score_cache[node]:\n raw_score_cache[node][pa_tuple] = score_func.score(node, list(pa_tuple))\n return raw_score_cache[node][pa_tuple]\n\n def get_best_score_and_parents(node, candidates_set):\n candidates_tuple = frozenset(candidates_set)\n if candidates_tuple in subset_best_cache[node]:\n return subset_best_cache[node][candidates_tuple]\n \n best_s = -float('inf')\n best_p = tuple()\n max_pa = min(len(candidates_tuple), 4)\n \n for k in range(max_pa + 1):\n for pa in combinations(candidates_tuple, k):\n s = get_score(node, list(pa))\n if s > best_s:\n best_s = s\n 
best_p = pa\n \n subset_best_cache[node][candidates_tuple] = (best_s, best_p)\n return best_s, best_p\n\n def score_order(order):\n total_score = 0.0\n predecessors = set()\n for node in order:\n candidates = top_neighbors[node] & predecessors\n s, _ = get_best_score_and_parents(node, candidates)\n total_score += s\n predecessors.add(node)\n return total_score\n\n best_overall_score = -float('inf')\n best_overall_order = None\n NUM_RESTARTS = 6\n random.seed(42)\n \n for restart_idx in range(NUM_RESTARTS):\n current_order = list(range(N))\n random.shuffle(current_order)\n \n current_score = score_order(current_order)\n improved = True\n \n while improved:\n improved = False\n best_s = current_score\n best_o = None\n \n for i in range(N):\n for j in range(N):\n if i == j: continue\n new_order = current_order.copy()\n node = new_order.pop(i)\n new_order.insert(j, node)\n \n ns = score_order(new_order)\n if ns > best_s:\n best_s = ns\n best_o = new_order\n \n if best_o is not None:\n current_score = best_s\n current_order = best_o\n improved = True\n \n if current_score > best_overall_score:\n best_overall_score = current_score\n best_overall_order = current_order\n \n dag_matrix = np.zeros((N, N), dtype=int)\n predecessors = set()\n for node in best_overall_order:\n candidates = top_neighbors[node] & predecessors\n _, best_p = get_best_score_and_parents(node, candidates)\n for p in best_p:\n dag_matrix[p, node] = 1\n predecessors.add(node)\n \n def has_path(start, end, adj):\n visited = set([start])\n stack = [start]\n while stack:\n curr = stack.pop()\n if curr == end: return True\n for child in range(N):\n if adj[curr, child] == 1 and child not in visited:\n visited.add(child)\n stack.append(child)\n return False\n \n improved_dag = True\n while improved_dag:\n improved_dag = False\n best_delta = 1e-5\n best_op = None\n \n for i in range(N):\n for j in range(N):\n if i == j: continue\n if dag_matrix[i, j] == 1:\n pa_j = [p for p in range(N) if dag_matrix[p, j] == 1 
and p != i]\n s_del = get_score(j, pa_j) - get_score(j, pa_j + [i])\n if s_del > best_delta:\n best_delta = s_del; best_op = ('del', i, j)\n \n dag_matrix[i, j] = 0\n if not has_path(i, j, dag_matrix):\n pa_i = [p for p in range(N) if dag_matrix[p, i] == 1]\n s_rev = (get_score(i, pa_i + [j]) + get_score(j, pa_j)) - \\\n (get_score(i, pa_i) + get_score(j, pa_j + [i]))\n if s_rev > best_delta:\n best_delta = s_rev; best_op = ('rev', i, j)\n dag_matrix[i, j] = 1\n \n elif dag_matrix[j, i] == 0:\n if not has_path(j, i, dag_matrix):\n pa_j = [p for p in range(N) if dag_matrix[p, j] == 1]\n s_add = get_score(j, pa_j + [i]) - get_score(j, pa_j)\n if s_add > best_delta:\n best_delta = s_add; best_op = ('add', i, j)\n \n if best_op is not None:\n op, i, j = best_op\n if op == 'add': dag_matrix[i, j] = 1\n elif op == 'del': dag_matrix[i, j] = 0\n elif op == 'rev':\n dag_matrix[i, j] = 0\n dag_matrix[j, i] = 1\n improved_dag = True\n\n nodes = [GraphNode(f\"X{i + 1}\") for i in range(N)]\n G = GeneralGraph(nodes)\n for i in range(N):\n for j in range(N):\n if dag_matrix[i, j] == 1:\n G.add_edge(Edge(nodes[i], nodes[j], Endpoint.TAIL, Endpoint.ARROW))\n \n return dag2cpdag(G)"
7}
OK: Replaced lines 3..122 with 184 line(s) in causal-bnlearn/bench/custom_algorithm.py. Editable range: 3–186.
[Current file: causal-bnlearn/bench/custom_algorithm.py | editable: 3–186 | total: 187 lines]
1| import numpy as np
2| from causallearn.graph.GeneralGraph import GeneralGraph
3| from causallearn.graph.GraphNode import GraphNode
4|
5| def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
6|     import numpy as np
7|     import random
8|     from itertools import combinations
9|     from causallearn.graph.Edge import Edge
10|     from causallearn.graph.Endpoint import Endpoint
11|     from causallearn.score.LocalScoreFunctionClass import LocalScoreClass
12|     from causallearn.score.LocalScoreFunction import local_score_BDeu
13|     from causallearn.utils.DAG2CPDAG import dag2cpdag
14|
15|     N = X.shape[1]
16|
17|     MI_matrix = np.zeros((N, N))
18|     for i in range(N):
19|
[Test #3] (0 tests remaining)
### Cancer (scripts/eval_cancer.sh)
Results (Cancer): SHD=9 AdjP=0.2500 AdjR=0.5000 ArrowP=0.1667 ArrowR=0.2500
### Earthquake (scripts/eval_earthquake.sh)
Results (Earthquake): SHD=7 AdjP=0.0000 AdjR=0.0000 ArrowP=0.0000 ArrowR=0.0000
### Survey (scripts/eval_survey.sh)
Results (Survey): SHD=13 AdjP=0.3571 AdjR=0.8333 ArrowP=0.2500 ArrowR=0.3333
### Asia (scripts/eval_asia.sh)
Results (Asia): SHD=21 AdjP=0.0714 AdjR=0.1250 ArrowP=0.0000 ArrowR=0.0000
### Sachs (scripts/eval_sachs.sh)
Results (Sachs): SHD=49 AdjP=0.1905 AdjR=0.4706 ArrowP=0.0000 ArrowR=0.0000
### Child (scripts/eval_child.sh)
Results (Child): SHD=151 AdjP=0.0382 AdjR=0.2000 ArrowP=0.0000 ArrowR=0.0000
### Insurance (scripts/eval_insurance.sh)
Results (Insurance): SHD=281 AdjP=0.0853 AdjR=0.4231 ArrowP=0.0198 ArrowR=0.1471
### Water (scripts/eval_water.sh)
Results (Water): SHD=455 AdjP=0.0950 AdjR=0.6364 ArrowP=0.0271 ArrowR=0.1833
### Mildew (script
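The final version replaces the PC skeleton with a mutual-information prefilter: each node keeps only its top-K partners (K = min(N-1, 15)) by pairwise MI as candidate neighbors. The MI computation, done above with `np.histogram2d`, reduces to this pure-Python form:

```python
from collections import Counter
from math import log

def mutual_information(x, y):
    # Empirical mutual information (in nats) between two discrete sequences:
    # sum over observed value pairs of p(a,b) * log(p(a,b) / (p(a) * p(b))).
    n = len(x)
    pxy = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    return sum(
        (c / n) * log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )

print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # log(2): fully dependent
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0: independent
```

Note a pitfall visible in the results above: unlike the PC skeleton, a top-K MI cutoff is not a conditional-independence test, so dense or weakly coupled networks (Child, Insurance, Water) accumulate many spurious candidate edges, which the large SHD values reflect.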