FAST: Flexibly Controllable Arbitrary Style Transfer via Latent Diffusion Models


METADATA ONLY

Date

2025-09

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

The goal of Arbitrary Style Transfer (AST) is to inject the artistic features of a style reference into a given image or video. Existing methods usually balance style and content by adjusting a general, coarse-level stylization strength, which leads to unsatisfactory results and hinders practical application. To address this issue, a novel AST approach, Flexibly Controllable Arbitrary Style Transfer (FAST), is proposed, which can explicitly customize the stylization results according to various sources of semantic clues. Specifically, our model is built on a Latent Diffusion Model (LDM) and designed to take content and style instances as conditions of the LDM. It introduces a Style-Adapter, which lets users flexibly manipulate the stylization results by aligning multi-level style control information with the intrinsic knowledge in the LDM, while also improving the model's capacity to harmonize content detail retention and stylization strength. Lastly, our model is extended to the video AST task. A novel learning objective is used for video diffusion model training, which considerably improves cross-frame temporal consistency while maintaining stylization strength. Qualitative and quantitative comparisons, as well as user studies, demonstrate that our approach outperforms existing state-of-the-art (SoTA) methods in generating visually plausible stylization results. The project homepage for the article is available at: https://fast-ldm.github.io/.

Publication status

published

Volume

21 (9)

Pages / Article No.

268

Publisher

Association for Computing Machinery

Subject

Arbitrary style transfer; diffusion model; style-adapter
