ExplorerArtificial IntelligenceAI
Research PaperResearchia:202607.01047

AxDafny: Agentic Verified Code Generation in Dafny

Benjamin Breen

Abstract

We study agentic code generation in Dafny, where a model must generate both executable code and the proof artifacts for verification. We present AxDafny, a verifier-guided repair framework that iteratively generates implementations, invariants, assertions, and termination arguments. We also introduce LiveCodeBench-Pro-Dafny (LCB-Pro-Dafny), a benchmark of 250 competition-style programming problems translated into Dafny with formal specifications and a verifier-based evaluation harness. On LCB-Pr...

Submitted: July 1, 2026Subjects: AI; Artificial Intelligence

Description / Details

We study agentic code generation in Dafny, where a model must generate both executable code and the proof artifacts for verification. We present AxDafny, a verifier-guided repair framework that iteratively generates implementations, invariants, assertions, and termination arguments. We also introduce LiveCodeBench-Pro-Dafny (LCB-Pro-Dafny), a benchmark of 250 competition-style programming problems translated into Dafny with formal specifications and a verifier-based evaluation harness. On LCB-Pro-Dafny, AxDafny substantially improves verification success over baseline GPT-5.5 performance. On DafnyBench, AxDafny achieves 92.7% verification success, outperforming the strongest previously reported proof-hint baseline by 6.5 percentage points. Lastly, we show that verification success and runtime test performance measure different aspects of generated code.


Source: arXiv:2606.32007v1 - http://arxiv.org/abs/2606.32007v1 PDF: https://arxiv.org/pdf/2606.32007v1 Original Link: http://arxiv.org/abs/2606.32007v1

Please sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Access Paper
View Source PDF
Submission Info
Date:
Jul 1, 2026
Topic:
Artificial Intelligence
Area:
AI
Comments:
0
Bookmark