ExplorerComputer ScienceCybersecurity
Research PaperResearchia:202605.20014

SAGE: Scalable Automatic Gating Ensemble for Confident Negative Harvesting in Fraud Detection

Sudheer Tubati

Abstract

Music streaming fraud, where bad actors artificially inflate stream counts to manipulate chart rankings and royalty payments, poses a significant threat to streaming services and legitimate content creators. Traditional fraud detection approaches struggle with a critical challenge: many legitimate edge cases, including super-fans and sleep-music sessions, exhibit activity patterns that closely mimic those of coordinated fraud. We present SAGE, a novel counterfactual-aware negative harvesting app...

Submitted: May 20, 2026Subjects: Cybersecurity; Computer Science

Description / Details

Music streaming fraud, where bad actors artificially inflate stream counts to manipulate chart rankings and royalty payments, poses a significant threat to streaming services and legitimate content creators. Traditional fraud detection approaches struggle with a critical challenge: many legitimate edge cases, including super-fans and sleep-music sessions, exhibit activity patterns that closely mimic those of coordinated fraud. We present SAGE, a novel counterfactual-aware negative harvesting approach that combines SimHash-based stratified sampling with a modular gating ensemble for confident negative identification from unlabeled data. Our ensemble architecture employs pluggable statistical gates (currently instantiated with Mahalanobis distance and k-NN density) with configurable voting thresholds enabling adaptive precision-recall trade-offs. This addresses the representation bias problem in Positive-Unlabeled learning by ensuring comprehensive coverage of rare behavioral cohorts through floor-constrained sampling. Evaluation demonstrates strong precision and recall on held-out data. The approach generalizes across fraud detection domains, achieving strong performance on both customer-level and artist-level fraud without modification to the core methodology.


Source: arXiv:2605.20157v1 - http://arxiv.org/abs/2605.20157v1 PDF: https://arxiv.org/pdf/2605.20157v1 Original Link: http://arxiv.org/abs/2605.20157v1

Please sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Access Paper
View Source PDF
Submission Info
Date:
May 20, 2026
Topic:
Computer Science
Area:
Cybersecurity
Comments:
0
Bookmark
SAGE: Scalable Automatic Gating Ensemble for Confident Negative Harvesting in Fraud Detection | Researchia