Research Paper - Researchia:202604.17001

MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation

Yan Li

Submitted: April 17, 2026
Subjects: AI; Artificial Intelligence

Description / Details

The rapid progress of Artificial Intelligence Generated Content (AIGC) tools enables images, videos, and visualizations to be created on demand for webpage design, offering a flexible and increasingly adopted paradigm for modern UI/UX. However, directly integrating such tools into automated webpage generation often leads to style inconsistency and poor global coherence, as elements are generated in isolation. We propose MM-WebAgent, a hierarchical agentic framework for multimodal webpage generation that coordinates AIGC-based element generation through hierarchical planning and iterative self-reflection. MM-WebAgent jointly optimizes global layout, local multimodal content, and their integration, producing coherent and visually consistent webpages. We further introduce a benchmark for multimodal webpage generation and a multi-level evaluation protocol for systematic assessment. Experiments demonstrate that MM-WebAgent outperforms code-generation and agent-based baselines, especially on multimodal element generation and integration. Code & Data: https://aka.ms/mm-webagent.
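
The abstract names hierarchical planning and iterative self-reflection but does not specify any interfaces, so the following is only an illustrative sketch of how such a loop might be organized, not the authors' implementation. Every name here (`ElementSpec`, `PagePlan`, `plan_page`, `generate_page`, and the `generate` and `critique` callbacks) is hypothetical. The sketch assumes a global plan fixes a shared style first, per-element prompts inherit that style, and a critic then flags elements to regenerate.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ElementSpec:
    """One planned multimodal element (hypothetical structure)."""
    name: str    # e.g. "hero"
    kind: str    # e.g. "image", "video", "chart"
    prompt: str  # generation prompt, including the shared style hint


@dataclass
class PagePlan:
    """Global plan: a shared style plus the elements to place on the page."""
    style: str
    elements: List[ElementSpec] = field(default_factory=list)


def plan_page(request: str, style: str, wanted: Dict[str, str]) -> PagePlan:
    """Top level: fix a global style, then derive per-element prompts from it."""
    elements = [
        ElementSpec(name, kind, f"{kind} for '{request}' in {style} style")
        for name, kind in wanted.items()
    ]
    return PagePlan(style=style, elements=elements)


def generate_page(
    plan: PagePlan,
    generate: Callable[[ElementSpec], str],
    critique: Callable[[PagePlan, Dict[str, str]], Dict[str, str]],
    max_rounds: int = 3,
) -> Dict[str, str]:
    """Generate every element, then regenerate the ones the critic flags."""
    assets = {e.name: generate(e) for e in plan.elements}
    for _ in range(max_rounds):
        # The critic returns a mapping: element name -> feedback text.
        issues = critique(plan, assets)
        if not issues:
            break
        for e in plan.elements:
            if e.name in issues:
                # Fold the feedback into the prompt and try again.
                e.prompt = f"{e.prompt}; fix: {issues[e.name]}"
                assets[e.name] = generate(e)
    return assets
```

In practice `generate` would wrap an AIGC tool call and `critique` would wrap a model that inspects the rendered page; here both are left as plain callables so the control flow can be exercised with stubs.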


Source: arXiv:2604.15309v1 - http://arxiv.org/abs/2604.15309v1
PDF: https://arxiv.org/pdf/2604.15309v1
