causal-observational-linear-non-gaussian

Causal Inference, causal-learn, rigorous codebase

Description

Causal Discovery: Observational Linear Non-Gaussian Data

Objective

Implement a causal discovery algorithm that recovers the DAG structure from purely observational data generated by a Linear Non-Gaussian Acyclic Model (LiNGAM). Your code goes in bench/custom_algorithm.py.

Background

LiNGAM-based methods exploit non-Gaussian noise to identify the full DAG from observational data alone. Constraint-based methods such as PC and score-based methods such as GES can, in general, recover the graph only up to its Markov equivalence class.
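To make the model concrete, here is a minimal sketch of how LiNGAM data arise: each sample satisfies x = Bx + e with independent non-Gaussian noise e, so x = (I - B)^{-1} e. The three-variable chain, the edge weights, and the Laplace noise below are illustrative assumptions and are not taken from the benchmark's data_gen.py.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, d = 500, 3

# B[i, j] != 0 means j -> i (the convention used by custom_algorithm.py).
B = np.zeros((d, d))
B[1, 0] = 0.8    # x0 -> x1 (illustrative weight)
B[2, 1] = -1.5   # x1 -> x2 (illustrative weight)

# Independent non-Gaussian noise, one column per variable.
E = rng.laplace(size=(n_samples, d))

# Per sample: x = B x + e, hence x = (I - B)^{-1} e.
# With samples stored as rows this becomes X = E (I - B)^{-T}.
X = E @ np.linalg.inv(np.eye(d) - B).T
```

Because B is strictly lower triangular here, the variable order 0, 1, 2 is a valid causal order; a discovery algorithm sees only X and has to recover B.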

Evaluation Scenarios

Label	Graph type	Nodes	Samples	Noise
ER10	Erdos-Renyi	10	250	Exponential
ER15	Erdos-Renyi	15	500	Laplace
SF12	Scale-Free (BA)	12	300	Uniform
ER30	Erdos-Renyi	30	1000	Laplace
ER50	Erdos-Renyi	50	2000	Exponential
ER50-LowSample	Erdos-Renyi	50	250	Exponential
SF100	Scale-Free (BA)	100	1000	Uniform
ER20-Dense	Erdos-Renyi	20	500	Laplace

Metrics

All metrics are computed on the directed edge set, so an edge counts as correct only if both its presence and its direction match the ground truth:

F1 (primary ranking metric), SHD, Precision, Recall
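As an illustration of scoring on the directed edge set, the sketch below computes precision, recall, and F1 from two adjacency matrices in the B[i, j] != 0 means j -> i convention. The function name directed_edge_prf is hypothetical, and applying the threshold to absolute estimated weights is an assumption; the benchmark's own metrics.py may differ in detail.

```python
import numpy as np

def directed_edge_prf(B_est, B_true, threshold=0.01):
    """Precision, recall and F1 over directed edges (B[i, j] != 0 means j -> i)."""
    est = np.abs(np.asarray(B_est)) > threshold   # assumed thresholding rule
    true = np.asarray(B_true) != 0
    tp = int(np.sum(est & true))                  # same edge, same direction
    n_est, n_true = int(est.sum()), int(true.sum())
    precision = tp / n_est if n_est else 0.0
    recall = tp / n_true if n_true else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

# Truth: 0 -> 1 and 1 -> 2.
B_true = np.zeros((3, 3))
B_true[1, 0] = 1.0
B_true[2, 1] = 1.0

# Estimate: 0 -> 1 (correct), 2 -> 1 (reversed), 0 -> 2 (extra).
B_est = np.zeros((3, 3))
B_est[1, 0] = 0.9
B_est[1, 2] = 0.7
B_est[2, 0] = 0.3

precision, recall, f1 = directed_edge_prf(B_est, B_true)
```

In this toy case the reversed edge is penalized twice: it is a false positive (2 -> 1 is not in the truth) and it leaves 1 -> 2 unrecovered, giving precision 1/3, recall 1/2, and F1 0.4.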

Baselines

icalingam: ICA-based LiNGAM (Shimizu 2006)
directlingam: DirectLiNGAM (Shimizu 2011)

Code

custom_algorithm.py


import numpy as np

# =====================================================================
# EDITABLE: implement run_causal_discovery below
# =====================================================================
def run_causal_discovery(X: np.ndarray) -> np.ndarray:
    """
    Input:  X of shape (n_samples, n_variables)
    Output: adjacency matrix B of shape (n_variables, n_variables)
            B[i, j] != 0  means j -> i  (follows causal-learn convention)
    """
    n = X.shape[1]
    return np.zeros((n, n))
# =====================================================================

run_eval.py


"""Evaluation harness for the causal-observational-linear-non-gaussian task."""
import argparse
import os
import sys

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))

from data_gen import simulate_lingam
from metrics import compute_metrics
from custom_algorithm import run_causal_discovery


def main():
    parser = argparse.ArgumentParser(
        description="Evaluate a causal discovery algorithm on synthetic LiNGAM data."
data_gen.py


"""Synthetic linear non-Gaussian DAG data generator for LiNGAM benchmarking."""
import numpy as np
import networkx as nx


def simulate_dag(n_nodes, graph_type, seed, er_prob=0.5, sf_m=2):
    """Return a binary adjacency matrix for a random DAG.

    Convention: adj[i, j] = 1 means i -> j  (i is a parent of j).
    The DAG is enforced by keeping only edges i -> j with i < j, imposing
    a topological ordering by node index.
    """
    rng = np.random.default_rng(seed)
    graph_seed = int(rng.integers(0, 2**31 - 1))
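Note that the generator's convention (adj[i, j] = 1 means i -> j) is the transpose of the one used by custom_algorithm.py and metrics.py (B[i, j] != 0 means j -> i). The sketch below illustrates both the "keep only i < j" trick from the docstring and the conversion between conventions. It uses plain NumPy rather than networkx, and the function names are hypothetical, so it should not be read as the actual body of simulate_dag.

```python
import numpy as np

def er_dag_sketch(n_nodes, edge_prob, seed):
    """Random DAG in the parent-to-child convention: adj[i, j] = 1 means i -> j."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n_nodes, n_nodes)) < edge_prob
    # Keeping only entries above the diagonal allows edges i -> j with i < j,
    # so node index order is a topological order and no cycle can form.
    return np.triu(mask, k=1).astype(int)

def to_child_parent_convention(adj):
    """Convert adj[i, j] (i -> j) into B[i, j] (j -> i) by transposing."""
    return adj.T

adj = er_dag_sketch(5, 0.5, seed=0)
B_mask = to_child_parent_convention(adj)
```

Mixing up the two conventions reverses every edge, which under directed-edge scoring turns a perfect skeleton into zero true positives, so it is worth checking the orientation once on a tiny hand-built graph.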

metrics.py


"""Evaluation metrics for directed causal graph recovery."""
import numpy as np


def compute_metrics(B_est, B_true, threshold=0.01):
    """Compute SHD, F1, precision, and recall for directed edge recovery.

    Convention: B[i, j] != 0 means j -> i.

    SHD definition (each type counts as exactly 1 error):
        - Reversed edge : correct skeleton edge but wrong direction
        - Extra edge    : present in estimate but absent in truth (non-reversal)
        - Missing edge  : present in truth but absent in estimate (non-reversal)

    F1 / precision / recall are computed on the directed edge set
Results


Model	Type	shd SF100 ↓	f1 SF100 ↑	precision SF100 ↑	recall SF100 ↑	shd ER10 ↓	f1 ER10 ↑	precision ER10 ↑	recall ER10 ↑	shd SF12 ↓	f1 SF12 ↑	precision SF12 ↑	recall SF12 ↑	shd ER30 ↓	f1 ER30 ↑	precision ER30 ↑	recall ER30 ↑	shd ER50 ↓	f1 ER50 ↑	precision ER50 ↑	recall ER50 ↑
directlingam	baseline	7.333	0.988	0.975	1.000	0.667	0.976	0.956	1.000	0.667	0.984	0.968	1.000	2.333	0.990	0.979	1.000	3.000	0.994	0.988	1.000
icalingam	baseline	120.333	0.803	0.701	0.940	1.333	0.954	0.914	1.000	1.000	0.976	0.954	1.000	2.333	0.989	0.979	1.000	3.000	0.994	0.988	1.000
rcd	baseline	379.000	0.107	0.181	0.078	4.333	0.810	0.917	0.742	18.000	0.179	1.000	0.100	104.667	0.350	0.517	0.270	-	-	-	-
anthropic/claude-opus-4.6	vanilla	712.000	0.443	0.287	0.973	6.000	0.824	0.700	1.000	19.000	0.678	0.513	1.000	61.000	0.777	0.635	1.000	167.000	0.750	0.601	1.000
google/gemini-3.1-pro-preview	vanilla	1120.000	0.244	0.149	0.667	7.000	0.800	0.667	1.000	9.000	0.816	0.690	1.000	278.000	0.256	0.169	0.528	269.000	0.651	0.483	1.000
gpt-5.4-pro	vanilla	215.000	0.641	0.544	0.780	2.000	0.897	0.867	0.929	3.000	0.930	0.870	1.000	19.000	0.899	0.913	0.887	92.000	0.802	0.830	0.777
anthropic/claude-opus-4.6	agent	0.000	1.000	1.000	1.000	0.000	1.000	1.000	1.000	1.000	0.976	0.952	1.000	1.000	0.995	0.991	1.000	1.000	0.998	0.996	1.000
google/gemini-3.1-pro-preview	agent	392.000	0.476	0.358	0.708	0.000	1.000	1.000	1.000	1.000	0.976	0.952	1.000	1.000	0.995	0.991	1.000	3.000	0.994	0.988	1.000
gpt-5.4-pro	agent	17.000	0.972	0.948	0.997	0.000	1.000	1.000	1.000	1.000	0.976	0.952	1.000	1.000	0.995	0.991	1.000	21.000	0.957	0.987	0.928

Agent Conversations

anthropic/claude-opus-4.6: 5 steps
google/gemini-3.1-pro-preview: 7 steps
gpt-5.4-pro: 5 steps