ml-clustering-algorithm
Description
Clustering Algorithm Design
Research Question
Design a novel clustering algorithm or distance metric that improves cluster quality across diverse dataset geometries — including convex blobs, non-convex shapes (moons), varied-density clusters, and real-world high-dimensional data (handwritten digits).
Background
Clustering is a fundamental unsupervised learning problem. Classic methods like K-Means assume convex, isotropic clusters; DBSCAN handles arbitrary shapes but requires careful tuning of the eps parameter. Modern advances include HDBSCAN (hierarchical density estimation that infers the number of clusters instead of taking it as a parameter), Spectral Clustering (graph Laplacian for non-convex clusters), and Density Peak Clustering (DPC, which identifies cluster centers as points with high local density that lie far from any denser point). No single method dominates across all dataset structures, making this an open research question.
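The contrast between K-Means and DBSCAN on non-convex data can be illustrated with a short sketch. The sample count, noise level, eps, and min_samples below are illustrative assumptions and are not the benchmark's own settings.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Two interleaving half-circles; size and noise are illustrative choices.
X, y_true = make_moons(n_samples=500, noise=0.05, random_state=0)
X = StandardScaler().fit_transform(X)

# K-Means partitions the plane into convex regions around two centroids.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN links points through dense neighborhoods, so it can follow each arc.
dbscan_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

ari_kmeans = adjusted_rand_score(y_true, kmeans_labels)
ari_dbscan = adjusted_rand_score(y_true, dbscan_labels)
print(f"K-Means ARI: {ari_kmeans:.3f}")
print(f"DBSCAN  ARI: {ari_dbscan:.3f}")
```

On data like this, DBSCAN would be expected to score well above K-Means, but the result depends on eps: a value tuned for one dataset geometry may merge or fragment clusters on another.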
Task
Modify the CustomClustering class in scikit-learn/custom_clustering.py (lines 36-120) to implement a novel clustering algorithm. You may also modify the custom_distance function if your approach uses a custom distance metric.
Your algorithm must:
- Accept n_clusters (int or None) and random_state parameters
- Implement fit(X) that sets self.labels_ and returns self
- Implement predict(X) that returns integer cluster labels
- Handle datasets with different structures (convex, non-convex, varied density, high-dimensional)
Interface
class CustomClustering(BaseEstimator, ClusterMixin):
    def __init__(self, n_clusters=None, random_state=42): ...
    def fit(self, X): ...      # X: (n_samples, n_features) -> self
    def predict(self, X): ...  # X: (n_samples, n_features) -> labels (n_samples,)
Available imports (already in the FIXED section): numpy, sklearn.base.BaseEstimator, sklearn.base.ClusterMixin, sklearn.preprocessing.StandardScaler, sklearn.metrics.*. You may import any module from scikit-learn, numpy, or scipy.
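As a starting point, the interface can be satisfied by a placeholder like the one below. This is a minimal sketch, not a novel algorithm: it wraps K-Means, assigns new points to the nearest centroid, and falls back to two clusters when n_clusters is None. The fallback value and the attributes scaler_ and cluster_centers_ are assumptions made for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClusterMixin
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler


class CustomClustering(BaseEstimator, ClusterMixin):
    """Placeholder that satisfies the required interface (K-Means wrapper)."""

    def __init__(self, n_clusters=None, random_state=42):
        self.n_clusters = n_clusters
        self.random_state = random_state

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.scaler_ = StandardScaler().fit(X)
        Xs = self.scaler_.transform(X)
        # Assumption: use 2 clusters when the caller does not supply a count.
        k = 2 if self.n_clusters is None else self.n_clusters
        km = KMeans(n_clusters=k, n_init=10, random_state=self.random_state).fit(Xs)
        self.cluster_centers_ = km.cluster_centers_
        self.labels_ = km.labels_
        return self

    def predict(self, X):
        Xs = self.scaler_.transform(np.asarray(X, dtype=float))
        # Pairwise Euclidean distances, shape (n_samples, n_clusters).
        dists = np.linalg.norm(
            Xs[:, None, :] - self.cluster_centers_[None, :, :], axis=2
        )
        # Each sample gets the index of its nearest centroid.
        return dists.argmin(axis=1)


# Usage on synthetic data (parameters are illustrative).
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
model = CustomClustering(n_clusters=3).fit(X)
labels = model.predict(X)
```

Density-based or graph-based approaches have no centroids, so a real submission would need its own rule for predict, for example assigning each new point the label of its nearest training point.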
Evaluation
- Datasets: blobs (5 Gaussian clusters), moons (2 half-circles), varied_density (3 clusters with different densities), digits (sklearn Digits, 10 classes, 64 features)
- Metrics: ARI (Adjusted Rand Index, higher is better), NMI (Normalized Mutual Information, higher is better), Silhouette Score (higher is better)
- Success = consistently improving over baselines across all four datasets
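The evaluation loop can be approximated locally with scikit-learn's dataset generators and metric functions. The sketch below is an assumption-laden stand-in for the benchmark script: the sample sizes, noise level, cluster_std values, and the helper names make_datasets and score are all hypothetical, and K-Means is used only as a stand-in model.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits, make_blobs, make_moons
from sklearn.metrics import (
    adjusted_rand_score,
    normalized_mutual_info_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler


def make_datasets(seed=0):
    """Return {name: (X, y_true)}. All generator parameters are assumptions."""
    return {
        "blobs": make_blobs(n_samples=500, centers=5, random_state=seed),
        "moons": make_moons(n_samples=500, noise=0.05, random_state=seed),
        "varied_density": make_blobs(
            n_samples=500, centers=3, cluster_std=[1.0, 2.5, 0.5], random_state=seed
        ),
        "digits": load_digits(return_X_y=True),
    }


def score(X, y_true, labels):
    """Compute ARI, NMI, and Silhouette for one clustering result."""
    n_found = np.unique(labels).size
    # Silhouette needs between 2 and n_samples - 1 distinct labels;
    # returning NaN otherwise is a choice made for this sketch.
    sil = silhouette_score(X, labels) if 1 < n_found < len(labels) else float("nan")
    return {
        "ari": adjusted_rand_score(y_true, labels),
        "nmi": normalized_mutual_info_score(y_true, labels),
        "silhouette": sil,
    }


results = {}
for name, (X, y_true) in make_datasets().items():
    X = StandardScaler().fit_transform(X)
    k = np.unique(y_true).size
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    results[name] = score(X, y_true, labels)
    print(name, {m: round(v, 3) for m, v in results[name].items()})
```

ARI and NMI compare predicted labels against ground truth, while Silhouette uses only the data geometry, so a method can score well on one and poorly on the other.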
Code
"""Custom clustering algorithm benchmark.

This script evaluates a clustering algorithm across multiple dataset types.
The agent should modify the EDITABLE section to implement a novel clustering
algorithm or distance metric that achieves high cluster quality.

Datasets (selected by $ENV):
- blobs: Isotropic Gaussian blobs (varying cluster sizes)
- moons: Two interleaving half-circles + noise
- varied_density: Clusters with different densities and sizes
- digits: Real-world: sklearn Digits (8x8 images of handwritten digits)

Metrics: ARI (Adjusted Rand Index), NMI (Normalized Mutual Information),
Silhouette Score
"""
Results
| Model | Type | ARI (blobs) ↑ | NMI (blobs) ↑ | Silhouette (blobs) ↑ | ARI (moons) ↑ | NMI (moons) ↑ | Silhouette (moons) ↑ | ARI (digits) ↑ | NMI (digits) ↑ | Silhouette (digits) ↑ |
|---|---|---|---|---|---|---|---|---|---|---|
| dbscan | baseline | 0.692 | 0.809 | 0.625 | 0.978 | 0.953 | 0.273 | 0.001 | 0.018 | -1.000 |
| hdbscan | baseline | 0.767 | 0.858 | 0.651 | 0.999 | 0.996 | 0.372 | 0.229 | 0.576 | 0.020 |
| kmeans | baseline | 0.853 | 0.874 | 0.585 | 0.481 | 0.383 | 0.494 | 0.534 | 0.671 | 0.139 |
| anthropic/claude-opus-4.6 | vanilla | 0.936 | 0.937 | 0.665 | 1.000 | 1.000 | 0.385 | 0.643 | 0.759 | 0.136 |
| deepseek-reasoner | vanilla | 0.770 | 0.827 | 0.404 | 0.019 | 0.015 | 0.304 | 0.556 | 0.758 | 0.105 |
| google/gemini-3.1-pro-preview | vanilla | 0.942 | 0.941 | 0.664 | 1.000 | 1.000 | 0.385 | 0.404 | 0.604 | 0.106 |
| openai/gpt-5.4 | vanilla | 0.939 | 0.939 | 0.664 | 1.000 | 1.000 | 0.385 | 0.642 | 0.762 | 0.131 |
| qwen/qwen3.6-plus | vanilla | 0.939 | 0.941 | 0.666 | 1.000 | 1.000 | 0.385 | 0.658 | 0.773 | 0.137 |
| anthropic/claude-opus-4.6 | agent | 0.939 | 0.939 | 0.665 | 1.000 | 1.000 | 0.385 | 0.664 | 0.779 | 0.137 |
| deepseek-reasoner | agent | 0.743 | 0.806 | 0.404 | 0.002 | 0.002 | 0.182 | 0.649 | 0.747 | 0.085 |
| google/gemini-3.1-pro-preview | agent | 0.941 | 0.942 | 0.664 | 1.000 | 1.000 | 0.385 | 0.666 | 0.786 | 0.135 |
| openai/gpt-5.4 | agent | 0.939 | 0.939 | 0.664 | 1.000 | 1.000 | 0.385 | 0.642 | 0.762 | 0.131 |
| qwen/qwen3.6-plus | agent | 0.939 | 0.941 | 0.666 | 1.000 | 1.000 | 0.385 | 0.658 | 0.773 | 0.137 |