Research Paper | Researchia:202602.11050 [Data Science > Machine Learning]

Evaluating Disentangled Representations for Controllable Music Generation

Laura Ibáñez-Martínez

Abstract

Recent approaches in music generation rely on disentangled representations, often labeled as structure and timbre or local and global, to enable controllable synthesis. Yet the underlying properties of these embeddings remain underexplored. In this work, we evaluate such disentangled representations in a set of music audio models for controllable generation using a probing-based framework that goes beyond standard downstream tasks. The selected models reflect diverse unsupervised disentanglement strategies, including inductive biases, data augmentations, adversarial objectives, and staged training procedures. We further isolate specific strategies to analyze their effect. Our analysis spans four key axes: informativeness, equivariance, invariance, and disentanglement, which are assessed across datasets, tasks, and controlled transformations. Our findings reveal inconsistencies between intended and actual semantics of the embeddings, suggesting that current strategies fall short of producing truly disentangled representations, and prompting a re-examination of how controllability is approached in music generation.
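To make two of the four axes concrete, the sketch below measures informativeness and invariance on synthetic data. It is an illustration under stated assumptions, not the paper's evaluation code: the abstract does not specify the probe or the metrics, so a nearest-centroid classifier and a mean cosine similarity are swapped in as simple stand-ins, and `toy_timbre_embedding`, its `leak` parameter, and the pitch-shift transformation are hypothetical names invented for this example.

```python
import math
import random


def nearest_centroid_probe(train_x, train_y, test_x, test_y):
    """Fit one centroid per class on the train embeddings and return test accuracy."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}
    correct = 0
    for x, y in zip(test_x, test_y):
        pred = min(centroids, key=lambda c: math.dist(x, centroids[c]))
        correct += pred == y
    return correct / len(test_y)


def mean_cosine(a_list, b_list):
    """Average cosine similarity between paired embeddings (1.0 means unchanged direction)."""
    total = 0.0
    for a, b in zip(a_list, b_list):
        dot = sum(p * q for p, q in zip(a, b))
        total += dot / (math.hypot(*a) * math.hypot(*b))
    return total / len(a_list)


def toy_timbre_embedding(item, pitch_shift=0.0, leak=0.0):
    """Hypothetical 'timbre' encoder: a class-dependent centre plus fixed per-item noise.

    `leak` controls how much of the pitch shift contaminates the first dimension;
    leak=0.0 models an embedding that ignores the transformation entirely.
    """
    label, noise = item
    centre = (1.0, 0.0) if label == 0 else (0.0, 1.0)
    return [centre[0] + noise[0] + leak * pitch_shift, centre[1] + noise[1]]


# Synthetic items: (instrument label, per-item noise). Not real audio data.
rng = random.Random(0)
items = [(i % 2, (rng.gauss(0, 0.1), rng.gauss(0, 0.1))) for i in range(200)]
labels = [label for label, _ in items]

# Informativeness: can the instrument label be predicted from the embedding?
clean = [toy_timbre_embedding(it) for it in items]
informativeness = nearest_centroid_probe(
    clean[:100], labels[:100], clean[100:], labels[100:]
)

# Invariance: does the embedding stay put under a transformation it should ignore?
shifted_ideal = [toy_timbre_embedding(it, pitch_shift=2.0, leak=0.0) for it in items]
shifted_leaky = [toy_timbre_embedding(it, pitch_shift=2.0, leak=0.5) for it in items]
invariance_ideal = mean_cosine(clean, shifted_ideal)
invariance_leaky = mean_cosine(clean, shifted_leaky)

print("informativeness (probe accuracy):", informativeness)
print("invariance, no leak:", invariance_ideal)
print("invariance, leaky encoder:", invariance_leaky)
```

In this toy setup the leaky encoder remains fully informative about the instrument while losing invariance to the pitch shift, which is the kind of gap between intended and actual semantics the abstract describes.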


Source: arXiv:2602.10058v1 - http://arxiv.org/abs/2602.10058v1
PDF: https://arxiv.org/pdf/2602.10058v1

Submission: 2/11/2026
Subjects: Machine Learning; Data Science