
RewardFlow: Generate Images by Optimizing What You Reward

Onkar Susladkar

Abstract

We introduce RewardFlow, an inversion-free framework that steers pretrained diffusion and flow-matching models at inference time through multi-reward Langevin dynamics. RewardFlow unifies complementary differentiable rewards for semantic alignment, perceptual fidelity, localized grounding, object consistency, and human preference, and further introduces a differentiable VQA-based reward that provides fine-grained semantic supervision through language-vision reasoning. To coordinate these heterogeneous objectives, we design a prompt-aware adaptive policy that extracts semantic primitives from the instruction, infers edit intent, and dynamically modulates reward weights and step sizes throughout sampling. Across several image editing and compositional generation benchmarks, RewardFlow delivers state-of-the-art edit fidelity and compositional alignment.
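The abstract does not spell out the exact update rule, but multi-reward Langevin dynamics generally takes the form x ← x + (η/2) ∇ Σᵢ wᵢ rᵢ(x) + √η z, where the rᵢ are the differentiable rewards, the wᵢ are the (here, prompt-adaptive) weights, η is the step size, and z is Gaussian noise. Below is a minimal PyTorch sketch of one such guidance step. Everything in it is illustrative: `langevin_guidance_step`, the toy rewards, and how the step interleaves with the pretrained sampler's denoising trajectory are our assumptions, not the paper's stated implementation.

```python
import torch

def langevin_guidance_step(x, rewards, weights, step_size):
    """One hypothetical multi-reward Langevin update on a latent x:
    x <- x + (eta/2) * grad(sum_i w_i * r_i(x)) + sqrt(eta) * z.
    `rewards` are callables mapping a latent to a scalar; `weights`
    would come from the paper's prompt-aware adaptive policy."""
    x = x.detach().requires_grad_(True)
    # Weighted sum of all reward terms (a single scalar objective).
    total_reward = sum(w * r(x) for r, w in zip(rewards, weights))
    (grad,) = torch.autograd.grad(total_reward, x)
    with torch.no_grad():
        noise = torch.randn_like(x)
        x = x + 0.5 * step_size * grad + (step_size ** 0.5) * noise
    return x.detach()

# Toy usage with two stand-in rewards on a dummy latent; in the actual
# framework these would be the semantic-alignment, fidelity, grounding,
# consistency, preference, and VQA rewards named in the abstract.
x = torch.randn(1, 4, 64, 64)
rewards = [
    lambda z: -z.pow(2).mean(),          # stand-in "fidelity" term
    lambda z: -(z - 1.0).abs().mean(),   # stand-in "alignment" term
]
weights = [0.7, 0.3]
for _ in range(10):
    x = langevin_guidance_step(x, rewards, weights, step_size=1e-2)
```

In practice such a step would be interleaved with the pretrained model's denoising or flow-matching updates, with the policy re-weighting `weights` and `step_size` at each sampling step; that scheduling is the paper's contribution and is not reproduced here.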


Source: arXiv:2604.08536v1 (http://arxiv.org/abs/2604.08536v1)
PDF: https://arxiv.org/pdf/2604.08536v1

Submission: April 11, 2026
Subjects: Artificial Intelligence (AI)