Research Paper
Researchia:202603.06006 [Computer Vision > Computer Vision]

FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning

Weijie Lyu

Abstract

We introduce FaceCam, a system that takes a monocular human portrait video as input and generates video under customizable camera trajectories. Recent camera-control approaches based on large video-generation models have shown promising progress but often exhibit geometric distortions and visual artifacts on portrait videos due to scale-ambiguous camera representations or 3D reconstruction errors. To overcome these limitations, we propose a face-tailored, scale-aware representation for camera transformations that provides deterministic conditioning without relying on 3D priors. We train a video generation model on both multi-view studio captures and in-the-wild monocular videos, and introduce two camera-control data generation strategies, synthetic camera motion and multi-shot stitching, to exploit stationary training cameras while generalizing to dynamic, continuous camera trajectories at inference time. Experiments on the Ava-256 dataset and diverse in-the-wild videos demonstrate that FaceCam achieves superior performance in camera controllability, visual quality, and identity and motion preservation.


Source: arXiv:2603.05506v1 - http://arxiv.org/abs/2603.05506v1
PDF: https://arxiv.org/pdf/2603.05506v1

Submission:3/6/2026
Subjects: Computer Vision
