Research Paper | Researchia:202603.30013

Machine Learning Transferability for Malware Detection

César Vieira

Submitted: March 30, 2026
Subjects: Cybersecurity; Computer Science

Description / Details

Malware remains a predominant operational risk for organizations, especially when obfuscation techniques are used to evade detection. Despite ongoing efforts to develop Machine Learning (ML) detection approaches, public datasets still lack feature compatibility. This limits generalization under distribution shift, as well as transferability across datasets. This study evaluates the suitability of different data preprocessing approaches for ML-based malware detection in Portable Executable (PE) files. The preprocessing pipeline unifies datasets built on EMBERv2 (2,381-dimensional) features and trains paired models under two training setups: EMBER + BODMAS and EMBER + BODMAS + ERMDS. For evaluation, the models from both setups are tested against TRITIUM, INFERNO and SOREL-20M. ERMDS is also used as a test set for the EMBER + BODMAS setup.
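
The train/test layout described above can be sketched as a small evaluation plan. This is only an illustrative sketch, not the authors' pipeline: the helper names (`evaluation_plan`, `check_feature_dim`) and the rule "test on every pooled dataset not used for training" are assumptions chosen to reproduce the setups named in the abstract. Only the dataset names and the 2,381 feature dimension come from the source.

```python
# Feature dimensionality of EMBERv2 vectors, as stated in the abstract.
EMBER_V2_DIM = 2381

# Training setups named in the abstract (dictionary layout is hypothetical).
TRAIN_SETUPS = {
    "EMBER+BODMAS": ("EMBER", "BODMAS"),
    "EMBER+BODMAS+ERMDS": ("EMBER", "BODMAS", "ERMDS"),
}

# Datasets that appear as test sets in the abstract.
TEST_POOL = ("TRITIUM", "INFERNO", "SOREL-20M", "ERMDS")


def evaluation_plan(train_setups, test_pool):
    """Map each training setup to the pooled datasets it was not trained on."""
    return {
        name: [ds for ds in test_pool if ds not in train_sets]
        for name, train_sets in train_setups.items()
    }


def check_feature_dim(vectors, expected_dim=EMBER_V2_DIM):
    """Return True if every feature vector has the expected length."""
    return all(len(v) == expected_dim for v in vectors)


plan = evaluation_plan(TRAIN_SETUPS, TEST_POOL)
for setup, tests in plan.items():
    print(f"{setup}: test on {', '.join(tests)}")
```

Under this rule, ERMDS lands in the test list only for the EMBER + BODMAS setup, which matches the abstract. A dimension check such as `check_feature_dim` is one simple way to confirm that vectors from different datasets are compatible before they are merged.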


Source: arXiv:2603.26632v1 - http://arxiv.org/abs/2603.26632v1
PDF: https://arxiv.org/pdf/2603.26632v1
