Diffusion models (DMs) synthesize high-quality images in various domains. However, controlling the generative process of pre-trained vanilla DMs is not as easy as it is for generative adversarial networks because they lack a semantic latent space. Recently, Asyrp discovered a hidden latent space of pre-trained vanilla DMs, called $ h $-space, where changes along certain directions result in attribute changes. In this paper, we dive deeper into Asyrp to discover a secret recipe for content injection and style transfer: 1) cumulatively adding $ h_t $ from different samples along the timesteps leads to content replacement, 2) the distribution of $ h_t $ should be maintained by normalization, 3) the distribution of $ x_t $ should be maintained by calibration, and 4) the content can be harmonized into arbitrary style images, even ones unseen during training. To the best of our knowledge, our method is the first training-free feed-forward style transfer that relies only on an unconditional pre-trained frozen generative network. The code will be available online.
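To make the recipe concrete, below is a minimal sketch of one reverse step with content injection in the U-Net bottleneck ($ h $-space). It assumes a hypothetical `unet_with_h` wrapper that returns the noise prediction together with $ h_t $ and accepts an optional replacement feature; the linear blending weight `gamma`, the channel-wise re-normalization, and the omission of the $ x_t $ calibration step are illustrative simplifications, not the exact implementation.

```python
# Hypothetical sketch of h-space content injection in one DDIM reverse step.
# `unet_with_h`, `h_replace`, and `gamma` are assumed names, not the authors' API.
import torch

@torch.no_grad()
def inject_content_step(x_t, t, h_content, unet_with_h, alphas_cumprod, gamma=0.3):
    """One reverse step where the bottleneck feature h_t is blended with a
    content feature and re-normalized to keep its original statistics."""
    # Assumed to return (predicted noise, bottleneck feature h_t).
    eps_orig, h_t = unet_with_h(x_t, t)

    # Blend the sample's own h_t with the content image's h_t.
    h_mix = (1.0 - gamma) * h_t + gamma * h_content

    # Maintain the distribution of h_t: match channel-wise mean and std
    # of the original feature (the "normalization" step of the recipe).
    mu = h_t.mean(dim=(2, 3), keepdim=True)
    std = h_t.std(dim=(2, 3), keepdim=True)
    h_mix = (h_mix - h_mix.mean(dim=(2, 3), keepdim=True)) / (
        h_mix.std(dim=(2, 3), keepdim=True) + 1e-6)
    h_mix = h_mix * std + mu

    # Asymmetric reverse process: the modified h drives the predicted x0,
    # while the direction term keeps the original noise prediction.
    eps_mod, _ = unet_with_h(x_t, t, h_replace=h_mix)

    a_t = alphas_cumprod[t]
    a_prev = alphas_cumprod[max(t - 1, 0)]
    x0_pred = (x_t - (1 - a_t).sqrt() * eps_mod) / a_t.sqrt()
    x_prev = a_prev.sqrt() * x0_pred + (1 - a_prev).sqrt() * eps_orig
    return x_prev
```

In this sketch the step would be applied recursively over the editing interval; the calibration of $ x_t $ described in the abstract is left out for brevity.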
(a) During the editing interval, target content is recursively injected into the asymmetric reverse process. (b) illustrates the green box in (a) in detail. (c) Our method also enables local content injection.