Back to Explorer
Research PaperResearchia:202602.12046[Artificial Intelligence > AI]

CODE-SHARP: Continuous Open-ended Discovery and Evolution of Skills as Hierarchical Reward Programs

Richard Bornemann

Abstract

Developing agents capable of open-endedly discovering and learning novel skills is a grand challenge in Artificial Intelligence. While reinforcement learning offers a powerful framework for training agents to master complex skills, it typically relies on hand-designed reward functions. This is infeasible for open-ended skill discovery, where the set of meaningful skills is not known a priori. While recent methods have shown promising results towards automating reward function design, they remain limited to refining rewards for pre-defined tasks. To address this limitation, we introduce Continuous Open-ended Discovery and Evolution of Skills as Hierarchical Reward Programs (CODE-SHARP), a novel framework leveraging Foundation Models (FM) to open-endedly expand and refine a hierarchical skill archive, structured as a directed graph of executable reward functions in code. We show that a goal-conditioned agent trained exclusively on the rewards generated by the discovered SHARP skills learns to solve increasingly long-horizon goals in the Craftax environment. When composed by a high-level FM-based planner, the discovered skills enable a single goal-conditioned agent to solve complex, long-horizon tasks, outperforming both pretrained agents and task-specific expert policies by over 134134% on average. We will open-source our code and provide additional videos \href\href{https://sites.google.com/view/code-sharp/homepage}{here}.


Source: arXiv:2602.10085v1 - http://arxiv.org/abs/2602.10085v1 PDF: https://arxiv.org/pdf/2602.10085v1 Original Link: http://arxiv.org/abs/2602.10085v1

Submission:2/12/2026
Comments:0 comments
Subjects:AI; Artificial Intelligence
Original Source:
View Original PDF
arXiv: This paper is hosted on arXiv, an open-access repository
Was this helpful?

Discussion (0)

Please sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

CODE-SHARP: Continuous Open-ended Discovery and Evolution of Skills as Hierarchical Reward Programs | Researchia | Researchia