
Benchmarking Language Modeling for Lossless Compression of Full-Fidelity Audio

Phillip Long

Abstract

Autoregressive "language" models (LMs) trained on raw waveforms can be repurposed for lossless audio compression, but prior work is limited to 8-bit audio, leaving open whether such approaches work in practical settings (16/24-bit) and can compete with existing codecs. We benchmark LM-based compression on full-fidelity audio across diverse domains (music, speech, bioacoustics), sampling rates (16 kHz-48 kHz), and bit depths (8, 16, 24-bit). Standard sample-level tokenization becomes intractable at higher bit depths due to vocabulary size (65K for 16-bit; 16.7M for 24-bit). We propose Trilobyte, a byte-level tokenization schema for full-resolution audio, improving vocabulary scaling from O(2^b) to O(1) and enabling the first tractable 24-bit LM-based lossless compression. While LMs consistently outperform FLAC and yield state-of-the-art compression at 8-bit and 16-bit, we observe that compression gains become more modest as bit depth increases beyond 8-bit.
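The abstract does not describe Trilobyte's exact schema, but the vocabulary-scaling claim rests on a simple idea: instead of treating each b-bit sample as one token (a 2^b-symbol vocabulary), split each sample into individual bytes, so the vocabulary stays at 256 symbols regardless of bit depth. A minimal sketch of that byte-splitting step, assuming two's-complement samples and big-endian byte order (function names are hypothetical, not the paper's):

```python
# Hypothetical sketch of byte-level tokenization for full-resolution audio.
# Splitting each b-bit sample into b/8 byte tokens keeps the vocabulary
# fixed at 256 symbols (O(1)) instead of 2^b (O(2^b)).

def tokenize_bytes(samples, bit_depth=24):
    """Map signed integer samples to a flat sequence of byte tokens (0-255)."""
    n_bytes = bit_depth // 8
    tokens = []
    for s in samples:
        u = s & ((1 << bit_depth) - 1)  # two's-complement -> unsigned
        for shift in range((n_bytes - 1) * 8, -1, -8):  # most-significant byte first
            tokens.append((u >> shift) & 0xFF)
    return tokens

def detokenize_bytes(tokens, bit_depth=24):
    """Invert tokenize_bytes, recovering the signed samples losslessly."""
    n_bytes = bit_depth // 8
    samples = []
    for i in range(0, len(tokens), n_bytes):
        u = 0
        for b in tokens[i:i + n_bytes]:
            u = (u << 8) | b
        if u >= 1 << (bit_depth - 1):  # restore the sign bit
            u -= 1 << bit_depth
        samples.append(u)
    return samples
```

Because the mapping is a bijection, an LM modeling the byte stream can drive an entropy coder without any loss: perfect reconstruction of the original 24-bit samples is guaranteed by the round trip above.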


Source: arXiv:2603.08683v1 - http://arxiv.org/abs/2603.08683v1
PDF: https://arxiv.org/pdf/2603.08683v1
Submission: 3/11/2026
Subjects: Artificial Intelligence (AI)