Research Paper (Researchia:202605.05019)

A Closed-Form Persistence-Landmark Pipeline for Certified Point-Cloud and Graph Classification

Sushovan Majhi


Submitted: May 5, 2026 | Subjects: Machine Learning; Data Science

Description / Details

We introduce PLACE (Persistence-Landmark Analytic Classification Engine), a closed-form pipeline for classifying point clouds and graphs through their persistent-homology signatures. Three quantitative guarantees -- a margin-based excess-risk rate, a closed-form descriptor-selection rule, and a per-prediction certificate -- are derived from training labels alone, with no learned weights or held-out calibration. The embedding sums Mitra-Virk single-point coordinate functions over a sparse landmark grid; closed-form weights maximize a structural distortion constant λ(ν) (a Lipschitz lower bound on D_n under non-interference). The guarantees are: (i) an O(kR/(Δ√m_min)) margin bound, driven by the class-mean separation Δ and the embedding radius R, matched by a sample-starved minimax lower bound; (ii) the Mahalanobis margin under a Ledoit-Wolf-shrunk covariance is the strongest closed-form descriptor selector on a heterogeneous 64-descriptor chemical-graph pool (mean Spearman ρ ≈ +0.54 across 10 benchmarks, positive on 9 of 10), while the isotropic surrogate Δ/√ℓ admits a closed-form selection-consistency rate on homogeneous (14-15 descriptor) protein/social pools; (iii) a certificate decided at training time, with no per-prediction overhead, in non-asymptotic (Pinelis) and asymptotic (Gaussian plug-in) forms. Empirically, PLACE is the strongest diagram-based method on Orbit5k and matches the strongest topology-based baseline within statistical noise on MUTAG and COX2. The remaining gaps fall into two diagnosable regimes: descriptor blindness on NCI1/NCI109, and pool-coverage limits elsewhere. Both certificate radii exceed the firing threshold Δ̂/2 on every benchmark at our training-set sizes, dominated by the √ℓ scaling of the multivariate-norm bound; the per-prediction certificate is constructive but not yet operational at these sizes.
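The listing gives only the abstract, but the two closed-form descriptor-selection scores it names are concrete enough to sketch. The snippet below is a minimal illustration, not the paper's implementation: the function names, the pooled-centering step, and the use of scikit-learn's LedoitWolf estimator are assumptions; only the quantities themselves (the Mahalanobis class-mean margin under a Ledoit-Wolf-shrunk covariance, and the isotropic surrogate Δ/√ℓ, where ℓ is the embedding dimension) come from the abstract.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

def mahalanobis_margin(X, y):
    """Mahalanobis distance between the two class means, measured under a
    Ledoit-Wolf-shrunk pooled covariance (hypothetical selection score)."""
    X0, X1 = X[y == 0], X[y == 1]
    # Center each class at its own mean, then pool before shrinkage estimation.
    centered = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
    cov = LedoitWolf().fit(centered).covariance_
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    # sqrt(diff^T Sigma^{-1} diff): larger means better-separated descriptors.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

def isotropic_surrogate(X, y):
    """Isotropic surrogate Δ/√ℓ: Euclidean gap between class means divided by
    the square root of the embedding dimension ℓ."""
    delta = np.linalg.norm(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
    return float(delta / np.sqrt(X.shape[1]))
```

In a selection loop, each candidate descriptor's embedding would be scored this way on training data alone, and the descriptor with the largest score kept; no held-out calibration set is involved.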


Source: arXiv:2605.02836v1 - http://arxiv.org/abs/2605.02836v1
PDF: https://arxiv.org/pdf/2605.02836v1

