Research Paper (Researchia:202603.26028)

Detection of local geometry in random graphs: information-theoretic and computational limits

Jinho Bok


Submitted: March 26, 2026
Subjects: Statistics; Data Science

Description / Details

We study the problem of detecting local geometry in random graphs. We introduce a model $\mathcal{G}(n, p, d, k)$, where a hidden community of average size $k$ has edges drawn as a random geometric graph on $\mathbb{S}^{d-1}$, while all remaining edges follow the Erdős--Rényi model $\mathcal{G}(n, p)$. The random geometric graph is generated by thresholding inner products of latent vectors on $\mathbb{S}^{d-1}$, with each edge having marginal probability equal to $p$. This implies that $\mathcal{G}(n, p, d, k)$ and $\mathcal{G}(n, p)$ are indistinguishable at the level of the marginals, and the signal lies entirely in the edge dependencies induced by the local geometry. We investigate both the information-theoretic and computational limits of detection. On the information-theoretic side, our upper bounds follow from three tests based on signed triangle counts: a global test, a scan test, and a constrained scan test; our lower bounds follow from two complementary methods: truncated second moment via Wishart--GOE comparison, and tensorization of KL divergence. These results together settle the detection threshold at $d = \widetilde{\Theta}(k^2 \vee k^6/n^3)$ for fixed $p$, and extend the state-of-the-art bounds from the full model (i.e., $k = n$) for vanishing $p$. On the computational side, we identify a computational--statistical gap and provide evidence via the low-degree polynomial framework, as well as the suboptimality of signed cycle counts of length $\ell \geq 4$.
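To make the model and the global test concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It assumes that each vertex joins the hidden community independently with probability $k/n$ (one natural reading of "average size $k$"), that the threshold on inner products is calibrated by Monte Carlo rather than in closed form, and that the signed triangle count takes its standard form $\sum_{i<j<l} (A_{ij}-p)(A_{jl}-p)(A_{il}-p)$; the paper's exact normalization may differ. All function names are hypothetical, and `detection_scale` only evaluates $k^2 \vee k^6/n^3$ with constants and logarithmic factors ignored.

```python
import math
import random
from itertools import combinations


def random_unit_vector(d, rng):
    """Uniform point on S^{d-1}: normalize a standard Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def calibrate_threshold(p, d, rng, samples=20000):
    """Approximate tau such that P(<x, y> >= tau) = p (assumption: Monte Carlo)."""
    if p == 0.5:
        return 0.0  # exact by symmetry of the inner product
    vals = sorted(
        inner(random_unit_vector(d, rng), random_unit_vector(d, rng))
        for _ in range(samples)
    )
    return vals[min(samples - 1, int((1.0 - p) * samples))]


def sample_graph(n, p, d, k, rng):
    """Sample an adjacency matrix and community labels from the sketched model."""
    # Assumption: independent membership with probability k/n.
    in_community = [rng.random() < k / n for _ in range(n)]
    latent = [random_unit_vector(d, rng) if c else None for c in in_community]
    tau = calibrate_threshold(p, d, rng)
    adj = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if in_community[i] and in_community[j]:
            # Geometric edge: threshold the inner product of latent vectors.
            edge = inner(latent[i], latent[j]) >= tau
        else:
            # All remaining edges are independent Bernoulli(p).
            edge = rng.random() < p
        adj[i][j] = adj[j][i] = int(edge)
    return adj, in_community


def signed_triangle_count(adj, p):
    """Global statistic: sum over vertex triples of centered edge products."""
    total = 0.0
    for i, j, l in combinations(range(len(adj)), 3):
        total += (adj[i][j] - p) * (adj[j][l] - p) * (adj[i][l] - p)
    return total


def detection_scale(n, k):
    """Evaluate k^2 v k^6 / n^3, ignoring constants and log factors."""
    return max(k ** 2, k ** 6 / n ** 3)


if __name__ == "__main__":
    rng = random.Random(0)
    planted, _ = sample_graph(60, 0.5, 2, 60, rng)  # every vertex in the community
    null, _ = sample_graph(60, 0.5, 2, 0, rng)      # k = 0 gives pure Erdős-Rényi
    print("planted:", signed_triangle_count(planted, 0.5))
    print("null:   ", signed_triangle_count(null, 0.5))
```

Because every edge has marginal probability $p$ in both models, edge counts carry no signal; the centered triangle products are the simplest statistic that picks up the dependence among edges inside the community. The two terms of `detection_scale` coincide at $k = n^{3/4}$, so $k^2$ dominates for smaller communities and $k^6/n^3$ for larger ones. The cubic loop is only meant for small $n$.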


Source: arXiv:2603.24545v1 (http://arxiv.org/abs/2603.24545v1)
PDF: https://arxiv.org/pdf/2603.24545v1
