Abstract

We present a new method for synthesizing high-resolution, photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but their results are often limited to low resolution and still far from realistic. In this work, we generate visually appealing 2048 × 1024 results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing/adding objects and changing the object category. Second, we propose a method to generate diverse results from the same input, allowing users to edit the object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
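The multi-scale discriminator design mentioned in the abstract amounts to scoring the same image at several resolutions of an image pyramid. The sketch below illustrates only that pyramid idea in plain NumPy: the 2×2 average-pooling downsampler, the `toy_disc` scorer, and the three-scale default are illustrative assumptions, not the paper's actual convolutional architecture.

```python
import numpy as np

def downsample(img):
    """Halve spatial resolution with 2x2 average pooling.

    Multi-scale discriminators operate on such a pyramid: the original
    image, a 2x-downsampled copy, a 4x-downsampled copy, and so on.
    """
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2  # crop odd edges
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def multiscale_scores(img, score_fn, num_scales=3):
    """Apply the same scoring function at each pyramid level and
    return one realism score per scale (coarse scales see global
    structure, fine scales see local detail)."""
    scores = []
    for _ in range(num_scales):
        scores.append(score_fn(img))
        img = downsample(img)
    return scores

# Hypothetical stand-in for a learned discriminator: mean activation.
toy_disc = lambda x: float(x.mean())

img = np.random.rand(256, 512)  # grayscale stand-in for a label map
scores = multiscale_scores(img, toy_disc)
print(len(scores))  # one score per scale
```

In the paper's setting each pyramid level would instead be fed to its own patch-based discriminator network; the point here is only that all scales share one input image.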

Keywords

Computer science, Discriminator, Generator, Artificial intelligence, Object manipulation, Generative adversarial networks, Image synthesis, Segmentation, Semantics, Resolution, Computer vision, Pattern recognition

Publication Info

Year: 2018
Type: article
Pages: 8798–8807
Citations: 4266
Access: Closed

Citation Metrics

4266 citations (source: OpenAlex)
Cite This

Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu et al. (2018). High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 8798–8807. https://doi.org/10.1109/cvpr.2018.00917

Identifiers

DOI: 10.1109/cvpr.2018.00917