Research Paper (Researchia:202603.13005)

Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models

Samy Jelassi

Submitted: March 13, 2026
Subjects: Machine Learning; Data Science

Description / Details

Cross-entropy (CE) training provides dense and scalable supervision for language models, but it optimizes next-token prediction under teacher forcing rather than sequence-level behavior under model rollouts. We introduce a feature-matching objective for language-model fine-tuning that targets sequence-level statistics of the completion distribution, providing dense semantic feedback without requiring a task-specific verifier or preference model. To optimize this objective efficiently, we propose energy-based fine-tuning (EBFT), which uses strided block-parallel sampling to generate multiple rollouts from nested prefixes concurrently, batches feature extraction over these rollouts, and uses the resulting embeddings to perform an on-policy policy-gradient update. We present a theoretical perspective connecting EBFT to KL-regularized feature-matching and energy-based modeling. Empirically, across Q&A coding, unstructured coding, and translation, EBFT matches RLVR and outperforms SFT on downstream accuracy while achieving a lower validation cross-entropy than both methods.
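
To make the final step of the pipeline described above more concrete, here is a minimal, framework-free sketch of turning rollout embeddings into an on-policy policy-gradient signal. It is an illustration under stated assumptions, not the paper's implementation: the reward (negative squared distance to the reference completion's features), the leave-one-out baseline, and the function names `feature_matching_rewards`, `leave_one_out_advantages`, and `policy_gradient_surrogate` are all hypothetical choices made for this example. The abstract does not specify the exact reward, baseline, or feature extractor.

```python
from typing import List, Sequence


def feature_matching_rewards(
    rollout_feats: Sequence[Sequence[float]],
    ref_feat: Sequence[float],
) -> List[float]:
    """Reward each rollout by negative squared distance to the reference features.

    ASSUMPTION: squared Euclidean distance is an illustrative choice,
    not necessarily the objective used in the paper.
    """
    return [
        -sum((f - r) ** 2 for f, r in zip(feat, ref_feat))
        for feat in rollout_feats
    ]


def leave_one_out_advantages(rewards: Sequence[float]) -> List[float]:
    """Subtract from each reward the mean reward of the other rollouts.

    ASSUMPTION: a leave-one-out baseline is a common variance-reduction
    device; the abstract does not say which baseline EBFT uses.
    """
    n = len(rewards)
    if n < 2:
        raise ValueError("need at least two rollouts for a leave-one-out baseline")
    total = sum(rewards)
    return [r - (total - r) / (n - 1) for r in rewards]


def policy_gradient_surrogate(
    log_probs: Sequence[float],
    advantages: Sequence[float],
) -> float:
    """REINFORCE-style surrogate loss: -mean(advantage * log-prob).

    In a real training loop, log_probs would be differentiable tensors
    (sequence log-likelihoods under the current policy) and the
    advantages would be treated as constants.
    """
    n = len(log_probs)
    return -sum(a * lp for a, lp in zip(advantages, log_probs)) / n


# Toy example: three rollouts from one prefix, with 2-dimensional features.
rollout_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ref_feat = [1.0, 1.0]
rewards = feature_matching_rewards(rollout_feats, ref_feat)
advantages = leave_one_out_advantages(rewards)
loss = policy_gradient_surrogate([-1.0, -2.0, -3.0], advantages)
```

In this toy case the third rollout matches the reference features exactly, so it receives the only positive advantage, and minimizing the surrogate would raise its log-probability relative to the other two. Because the leave-one-out advantages sum to zero within a group, the update only redistributes probability among the rollouts sampled from the same prefix.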


Source: arXiv:2603.12248v1 (http://arxiv.org/abs/2603.12248v1)
PDF: https://arxiv.org/pdf/2603.12248v1

