Research Paper | Researchia:202604.06011

Reliability Gated Multi-Teacher Distillation for Low Resource Abstractive Summarization

Dipto Sumit


Submitted: April 6, 2026
Subjects: NLP; Computational Linguistics

Description / Details

We study multi-teacher knowledge distillation (KD) for low-resource abstractive summarization from a reliability-aware perspective. We introduce EWAD (Entropy-Weighted Agreement-Aware Distillation), a token-level mechanism that routes supervision between teacher distillation and gold supervision based on inter-teacher agreement, and CPDP (Capacity-Proportional Divergence Preservation), a geometric constraint on the student's position relative to heterogeneous teachers. Across two Bangla datasets, 13 BanglaT5 ablations, and eight Qwen2.5 experiments, we find that logit-level KD provides the most reliable gains, while more complex distillation improves semantic similarity for short summaries but degrades longer outputs. Cross-lingual pseudo-label KD across ten languages retains 71-122 percent of teacher ROUGE-L at 3.2x compression. A human-validated multi-judge LLM evaluation further reveals calibration bias in single-judge pipelines. Overall, our results show that reliability-aware distillation helps characterize when multi-teacher supervision improves summarization and when data scaling outweighs loss engineering.
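
To make the idea of token-level routing between teacher distillation and gold supervision more concrete, here is a minimal Python sketch for a single token position. It is not the paper's EWAD formulation, which the abstract does not spell out. Everything specific is an assumption made for illustration: the function name gated_token_loss, the use of mean pairwise Jensen-Shannon divergence as the agreement measure, the soft gate, and the exp(-entropy) teacher weighting.

```python
import math
from itertools import combinations


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-12):
    """KL(p || q) in nats, smoothed with eps to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def entropy(p, eps=1e-12):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi + eps) for pi in p)


def js(p, q):
    """Jensen-Shannon divergence in nats, bounded above by ln 2."""
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def gated_token_loss(student_logits, teacher_logits, gold_index, temperature=1.0):
    """Hypothetical agreement-gated loss for one token position.

    Assumed gate: 1 - (mean pairwise JS between teachers) / ln 2.
    Gate near 1 (teachers agree): mostly distill from the teacher mixture.
    Gate near 0 (teachers disagree): mostly cross-entropy on the gold token.
    """
    teachers = [softmax(t, temperature) for t in teacher_logits]
    pairs = list(combinations(teachers, 2))
    mean_js = sum(js(p, q) for p, q in pairs) / len(pairs) if pairs else 0.0
    gate = min(1.0, max(0.0, 1.0 - mean_js / math.log(2)))

    # Assumed entropy weighting: lower-entropy (more confident) teachers count more.
    raw = [math.exp(-entropy(p)) for p in teachers]
    norm = sum(raw)
    mixture = [
        sum(w / norm * p[i] for w, p in zip(raw, teachers))
        for i in range(len(student_logits))
    ]

    # Standard temperature-scaled logit-level KD term and gold cross-entropy.
    kd = temperature ** 2 * kl(mixture, softmax(student_logits, temperature))
    ce = -math.log(softmax(student_logits)[gold_index])
    return gate * kd + (1.0 - gate) * ce, gate


if __name__ == "__main__":
    student = [1.0, 0.5, -0.5]
    print(gated_token_loss(student, [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]], 0))
    print(gated_token_loss(student, [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]], 0))
```

In this sketch, identical teachers give a gate of 1, so the loss reduces to plain logit-level KD, while teachers that put their mass on different tokens push the gate toward 0 and the loss toward gold cross-entropy. A hard threshold on the agreement score would be an equally plausible reading of "routes supervision".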


Source: arXiv:2604.03192v1 (http://arxiv.org/abs/2604.03192v1)
PDF: https://arxiv.org/pdf/2604.03192v1
