Abstract

Unsupervised image-to-image translation is an important and challenging problem in computer vision. Given an image in the source domain, the goal is to learn the conditional distribution of corresponding images in the target domain, without seeing any pairs of corresponding images. While this conditional distribution is inherently multimodal, existing approaches make an overly simplified assumption, modeling it as a deterministic one-to-one mapping. As a result, they fail to generate diverse outputs from a given source domain image. To address this limitation, we propose a Multimodal Unsupervised Image-to-image Translation (MUNIT) framework. We assume that the image representation can be decomposed into a content code that is domain-invariant, and a style code that captures domain-specific properties. To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain. We analyze the proposed framework and establish several theoretical results. Extensive experiments with comparisons to state-of-the-art approaches further demonstrate the advantage of the proposed framework. Moreover, our framework allows users to control the style of translation outputs by providing an example style image. Code and pretrained models are available at this https URL.
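The translation step the abstract describes (encode domain-invariant content, sample a random target-domain style, decode the combination) can be sketched in a few lines of PyTorch. This is a minimal illustrative sketch, not the authors' implementation: the module names, layer choices, and the concatenation-based style injection below are assumptions made for brevity (MUNIT itself injects style through AdaIN parameters produced from the style code); refer to the code release linked above for the real architecture.

```python
# Minimal sketch of MUNIT-style translation: content from domain A,
# random style from domain B's Gaussian style prior. Illustrative only.
import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    """Maps an image to a domain-invariant content code (a feature map)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=1, padding=3), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)

class StyleEncoder(nn.Module):
    """Maps an image to a low-dimensional, domain-specific style vector."""
    def __init__(self, style_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=1, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, style_dim),
        )
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Reconstructs an image from a content code plus a style vector.
    Style is broadcast and concatenated here for simplicity; the paper
    instead feeds the style code through an MLP into AdaIN layers."""
    def __init__(self, style_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128 + style_dim, 64, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 3, 7, stride=1, padding=3), nn.Tanh(),
        )
    def forward(self, content, style):
        b, _, h, w = content.shape
        s = style.view(b, -1, 1, 1).expand(b, style.size(1), h, w)
        return self.net(torch.cat([content, s], dim=1))

# Translate x_a from domain A to domain B: keep A's content,
# sample a random style code from B's prior N(0, I).
enc_content_a = ContentEncoder()
dec_b = Decoder(style_dim=8)
x_a = torch.randn(1, 3, 256, 256)   # stand-in for a source-domain image
c_a = enc_content_a(x_a)            # domain-invariant content code
s_b = torch.randn(1, 8)             # random style code ~ N(0, I)
x_ab = dec_b(c_a, s_b)              # one of many possible translations
```

Sampling a fresh `s_b` for the same `c_a` yields a different plausible translation, which is the multimodality the abstract emphasizes. For the example-guided control mentioned at the end of the abstract, `s_b` would instead come from `StyleEncoder` applied to a user-provided reference image in the target domain.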

Keywords

Computer science, Image translation, Image (mathematics), Code (set theory), Domain (mathematical analysis), Artificial intelligence, Translation (biology), Source code, Pattern recognition (psychology), Computer vision, Theoretical computer science, Mathematics, Programming language

Publication Info

Year: 2018
Type: book-chapter
Pages: 179-196
Citations: 2457
Access: Closed

Citation Metrics

OpenAlex citations: 2457
Influential citations: 488

Cite This

Xun Huang, Ming-Yu Liu, Serge Belongie et al. (2018). Multimodal Unsupervised Image-to-Image Translation. Lecture Notes in Computer Science, 179-196. https://doi.org/10.1007/978-3-030-01219-9_11

Identifiers

DOI: 10.1007/978-3-030-01219-9_11
arXiv: 1804.04732
