Discrete Causal Graph Discovery

Studies how causal discovery algorithms recover equivalence-class graph structure from discrete observational data.

Structured & Causal Reasoningcausal-bnlearn

causal-discovery-discrete

Description

Causal Discovery on Discrete Bayesian Network Datasets (bnlearn)

Research Question

Design a causal discovery algorithm that recovers the CPDAG (Completed Partially Directed Acyclic Graph) from purely observational, integer-coded discrete data sampled from real-world Bayesian networks in the bnlearn repository.

Background

The bnlearn repository (https://www.bnlearn.com/bnrepository/) hosts a collection of well-known Bayesian network benchmarks from diverse domains (medicine, biology, meteorology, insurance, agriculture, IT). Each network has a known ground-truth DAG with discrete variables and conditional probability tables.

Under the faithfulness assumption, observational data can identify only the Markov Equivalence Class (MEC) of the true DAG, represented by a CPDAG. The challenge lies in handling discrete data with varying cardinalities, network sizes (small to >70 nodes), and edge densities, without over-specializing to a single scale or cardinality pattern.

Task

Implement a causal discovery algorithm in bench/custom_algorithm.py. The run_causal_discovery(X) function receives integer-encoded discrete observational data and must return the estimated CPDAG as a causallearn.graph.GeneralGraph.GeneralGraph object.

def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
    """
    Input:  X of shape (n_samples, n_variables), integer-encoded discrete data
    Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph
    """

Evaluation Networks

Label	Nodes	Edges	Domain
Cancer	5	4	Medical
Child	20	25	Medical
Alarm	37	46	Medical monitoring
Hailfinder	56	66	Meteorology
Win95pts	76	112	IT (Windows troubleshooting)

Each network is sampled with a fixed observational sample size; the agent must generalize across small/medium/large networks and across different cardinality patterns.

Metrics

Metrics are computed between the estimated CPDAG and the ground-truth CPDAG (converted from the true DAG via dag2cpdag):

SHD (Structural Hamming Distance): total edge errors (lower is better)
Adjacency Precision / Recall: skeleton recovery quality (higher is better)
Arrow Precision / Recall: edge orientation accuracy (higher is better)

Reference baselines

The benchmark ships several classical baselines for comparison. Citations are provided so the agent can study the prior art; default hyperparameters are the ones recommended in the cited papers (e.g., chi-squared CI test for PC, BDeu score for the score-based methods).

pc: Peter-Clark algorithm with chi-squared CI test. Constraint-based. Spirtes, Glymour & Scheines, Causation, Prediction, and Search (MIT Press, 2nd ed., 2000).
ges: Greedy Equivalence Search with BDeu score. Score-based. Chickering, "Optimal Structure Identification With Greedy Search," JMLR 3, 2002.
grasp: Greedy Relaxations of the Sparsest Permutation with BDeu score. Permutation-based. Lam, Andrews & Ramsey, "Greedy Relaxations of the Sparsest Permutation Algorithm," UAI 2022 (arXiv:2206.05421).
boss: Best Order Score Search with BDeu score. Permutation-based. Andrews et al., "Fast Scalable and Accurate Discovery of DAGs Using the Best Order Score Search and Grow-Shrink Trees," NeurIPS 2023 (arXiv:2310.17679).
hc: Hill-Climbing search with BDeu score. Score-based, classical local search baseline.

The contribution should be a modular causal discovery procedure for discrete observational data, such as a constraint-based, score-based, permutation-based, hybrid, or otherwise principled alternative, while staying within the provided causal graph interface.

Code

custom_algorithm.py

EditableRead-only

1import numpy as np
2from causallearn.graph.GeneralGraph import GeneralGraph
3from causallearn.graph.GraphNode import GraphNode
4
5# =====================================================================
6# EDITABLE: implement run_causal_discovery below
7# =====================================================================
8def run_causal_discovery(X: np.ndarray) -> GeneralGraph:
9    """
10    Input:  X of shape (n_samples, n_variables), integer-encoded discrete data
11    Output: estimated CPDAG as causallearn.graph.GeneralGraph.GeneralGraph
12    """
13    nodes = [GraphNode(f"X{i + 1}") for i in range(X.shape[1])]
14    return GeneralGraph(nodes)
15# =====================================================================

run_eval.py

EditableRead-only

1"""Evaluation harness for the causal-discovery-discrete task."""
2import argparse
3import os
4import sys
5
6sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
7
8from custom_algorithm import run_causal_discovery
9from data_gen import load_and_sample
10from metrics import compute_metrics
11
12
13def main():
14    parser = argparse.ArgumentParser(
15        description="Evaluate CPDAG recovery on bnlearn discrete data."

data_gen.py

EditableRead-only

1"""Data loader for bnlearn Bayesian network benchmarks.
2
3Uses pgmpy's bundled BIF files (no network access needed) to load
4real-world Bayesian networks, sample discrete observational data,
5and extract the ground-truth DAG.
6"""
7import numpy as np
8import pandas as pd
9
10from causallearn.graph.Dag import Dag
11from causallearn.graph.GraphNode import GraphNode
12
13# All discrete bnlearn networks supported by pgmpy
14SUPPORTED_NETWORKS = [
15    "cancer", "earthquake", "survey", "asia", "sachs",

metrics.py

EditableRead-only

1"""Evaluation metrics for CPDAG recovery on bnlearn discrete data."""
2from causallearn.graph.AdjacencyConfusion import AdjacencyConfusion
3from causallearn.graph.ArrowConfusion import ArrowConfusion
4from causallearn.graph.Graph import Graph
5from causallearn.graph.SHD import SHD
6from causallearn.utils.DAG2CPDAG import dag2cpdag
7
8
9def _safe_div(numerator, denominator):
10    return numerator / denominator if denominator > 0 else 0.0
11
12
13def _normalize_graph_output(graph_output):
14    """Normalize algorithm output to a causallearn Graph object."""
15    if isinstance(graph_output, dict) and "G" in graph_output:

Method Summary

Auto-summarized from each method's code by an LLM reviewer — not the model's original output. Browse via the picker below; the Code section is independent.

Baselines

Agents

Claude Opus 4.6·Pseudocodehigh

GES + size-gated G-test pruning

Run GES with BDeu, then on graphs with more than 20 variables drop edges that pass a chi-squared CI test conditioned on safe-cardinality neighbour subsets.

1.  $G \leftarrow \mathrm{GES}(X,\,\text{BDeu})$ 
2. if  $p \le 20$  return  $G$ 
3. for each edge  $(i,j) \in G$ :
4.    $\mathcal{N} \leftarrow \mathrm{adj}(i) \cup \mathrm{adj}(j) \setminus \{i,j\}$ 
5.   build candidate sep-sets  $\mathcal{S} = \{\emptyset, \{k\}_{k\in\mathcal{N}}, \mathcal{N}\}$ 
6.   keep only  $S$  with  $r_i\,r_j\,\prod_{k\in S} r_k \le n/5$ 
7.   if  $\exists S$  with G-test p-value  $> 0.01$ , remove edge  $(i,j)$ 
8. return  $G$

Δ vs. baselineStarts from the GES baseline but adds a post-hoc G-test (gsq) edge-pruning pass restricted to neighbourhood-only conditioning sets and only triggered for large graphs.

size_gate=20alpha=0.01card_budget=n/5↻Recovers Plain GES on small graphs

Discrete Causal Graph Discovery

Description

Causal Discovery on Discrete Bayesian Network Datasets (bnlearn)

Research Question

Background

Task

Evaluation Networks

Metrics

Reference baselines

Code

Method Summary

GES + size-gated G-test pruning

Results