Research Paper · Researchia:202601.09542646

A Hybrid Unsupervised Methodology on Artificial Intelligence Filtering for automatically processing cellular DNA-Encoded Library (DEL) Datasets

baburajanish@gmail.com

Submitted: January 9, 2026
Subjects: Bioinformatics; Research Paper

Description / Details

Results: Herein we report an automated method that identifies the most promising hits from large cell-based DEL datasets with improved accuracy and efficiency. The processing workflow is built on an unsupervised algorithm that combines data pre-processing, feature extraction and outlier filtering, descriptor-based classification, similarity score ranking, and active compound prediction. We developed the methodology with two DEL selection datasets targeting the insulin receptor (INSR) on live cells, drawn from libraries of ~30 million and 1.033 billion members. The automated scheme showed high consistency with experimental results and adapted to on-cell DEL datasets of varied library scales. Applying the methodology to the cellular thrombopoietin receptor (TPOR) further supported the algorithm's ability to generalize across target proteins. This approach can therefore serve as a widely applicable workflow for automatically differentiating hit compounds, thereby facilitating drug development starting from candidate discovery.

Availability and Implementation: The complete datasets, source code, and pre-trained models are available at https://doi.org/10.5281/zenodo.17452392 and https://doi.org/10.5281/zenodo.17569557.
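
To make two of the workflow stages named in the abstract more concrete (outlier filtering and similarity score ranking), here is a minimal Python sketch on toy data. It is an illustration under stated assumptions, not the authors' implementation, which is in the Zenodo archives above. The enrichment ratio, the median/MAD outlier rule, the cutoff of 3.5, the set-based fingerprints, and all function and field names are hypothetical choices made for this example.

```python
from statistics import median

def enrichment(target_count, control_count, pseudocount=1.0):
    """Ratio of target-selection reads to control reads (pseudocount avoids division by zero)."""
    return (target_count + pseudocount) / (control_count + pseudocount)

def robust_z_scores(values):
    """Median/MAD-based z-scores, a common robust alternative to mean/std for count data."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_hits(compounds, reference_bits, z_cutoff=3.5):
    """Keep strongly enriched compounds, then rank them by similarity to a reference."""
    ratios = [enrichment(c["target"], c["control"]) for c in compounds]
    scores = robust_z_scores(ratios)
    outliers = [c for c, z in zip(compounds, scores) if z > z_cutoff]
    return sorted(outliers, key=lambda c: tanimoto(c["bits"], reference_bits), reverse=True)

# Toy data (invented for illustration): read counts and fingerprint bit sets.
compounds = [
    {"id": "A", "target": 120, "control": 2, "bits": {1, 2, 3, 4}},
    {"id": "B", "target": 5, "control": 4, "bits": {7, 8}},
    {"id": "C", "target": 3, "control": 3, "bits": {1, 9}},
    {"id": "D", "target": 90, "control": 1, "bits": {1, 2, 8}},
    {"id": "E", "target": 4, "control": 5, "bits": {5, 6}},
    {"id": "F", "target": 6, "control": 5, "bits": {2, 3}},
    {"id": "G", "target": 2, "control": 2, "bits": {4, 5}},
]

ranked = rank_hits(compounds, reference_bits={1, 2, 3})
print([c["id"] for c in ranked])
```

In this toy run only the two strongly enriched compounds pass the outlier filter, and they are then ordered by their similarity to the reference fingerprint. A real DEL pipeline would compute descriptors and fingerprints with a cheminformatics toolkit and would need library-scale normalization, which this sketch leaves out.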

Issue Section: Original Paper; Associate Editor: Jonathan Wren; Collection: Bioinformatics Journals
