Research Paper | Researchia:202602.10068 [Pharmaceutical Research > Biochemistry]

SEISMO: Increasing Sample Efficiency in Molecular Optimization with a Trajectory-Aware LLM Agent

Fabian P. Krüger

Abstract

Optimizing the structure of molecules to achieve desired properties is a central bottleneck across the chemical sciences, particularly in the pharmaceutical industry, where it underlies the discovery of new drugs. Since molecular property evaluation often relies on costly and rate-limited oracles, such as experimental assays, molecular optimization must be highly sample-efficient. To address this, we introduce SEISMO, an LLM agent that performs strictly online, inference-time molecular optimization, updating after every oracle call without the need for population-based or batched learning. SEISMO conditions each proposal on the full optimization trajectory, combining natural-language task descriptions with scalar scores and, when available, structured explanatory feedback. Across the Practical Molecular Optimization benchmark of 23 tasks, SEISMO achieves a 2-3 times higher area under the optimization curve than prior methods, often reaching near-maximal task scores within 50 oracle calls. Our additional medicinal-chemistry tasks show that providing explanatory feedback further improves efficiency, demonstrating that leveraging domain knowledge and structured information is key to sample-efficient molecular optimization.
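
The strictly online loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: `propose` is a hypothetical stand-in for the LLM agent, `toy_oracle` is a toy scoring function over strings rather than molecules, and `best_so_far_auc` assumes a simple definition of the area under the optimization curve (the mean of the running-best score over oracle calls), which may differ from the benchmark's exact metric.

```python
from typing import Callable, List, Tuple

# One entry per oracle call, in call order: (candidate, score).
Trajectory = List[Tuple[str, float]]


def optimize_online(
    propose: Callable[[str, Trajectory], str],
    oracle: Callable[[str], float],
    task: str,
    budget: int,
) -> Trajectory:
    """Make one proposal per oracle call, feeding the full history back in."""
    trajectory: Trajectory = []
    for _ in range(budget):
        candidate = propose(task, trajectory)  # sees every earlier (candidate, score)
        score = oracle(candidate)              # exactly one oracle call per step
        trajectory.append((candidate, score))  # update immediately, no batching
    return trajectory


def best_so_far_auc(trajectory: Trajectory) -> float:
    """Mean of the running-best score over oracle calls (assumed metric)."""
    if not trajectory:
        return 0.0
    best, total = 0.0, 0.0
    for _, score in trajectory:
        best = max(best, score)
        total += best
    return total / len(trajectory)


# Toy stand-ins (hypothetical): candidates are strings of 'C' and 'N',
# and the score is the fraction of 'N' characters.
def toy_oracle(candidate: str) -> float:
    return candidate.count("N") / len(candidate)


def toy_propose(task: str, trajectory: Trajectory) -> str:
    if not trajectory:
        return "CCCC"
    best_candidate, _ = max(trajectory, key=lambda item: item[1])
    # Swap the first 'C' of the best candidate so far; unchanged if none is left.
    return best_candidate.replace("C", "N", 1)


run = optimize_online(toy_propose, toy_oracle, "maximize N fraction", budget=6)
auc = best_so_far_auc(run)
```

A metric of this kind rewards reaching high scores early: two runs that end at the same best score differ in area if one gets there in fewer oracle calls, which is why it is a natural measure of sample efficiency.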


Source: arXiv:2602.00663v1 (http://arxiv.org/abs/2602.00663v1)
PDF: https://arxiv.org/pdf/2602.00663v1

Submission: 2/10/2026
Subjects: Biochemistry; Pharmaceutical Research