Abstract

Neural-based multi-task learning has been successfully used in many real-world large-scale applications such as recommendation systems. For example, in movie recommendations, beyond providing users movies which they tend to purchase and watch, the system might also optimize for users liking the movies afterwards. With multi-task learning, we aim to build a single model that learns these multiple goals and tasks simultaneously. However, the prediction quality of commonly used multi-task models is often sensitive to the relationships between tasks. It is therefore important to study the modeling tradeoffs between task-specific objectives and inter-task relationships. In this work, we propose a novel multi-task learning approach, Multi-gate Mixture-of-Experts (MMoE), which explicitly learns to model task relationships from data. We adapt the Mixture-of-Experts (MoE) structure to multi-task learning by sharing the expert submodels across all tasks, while also having a gating network trained to optimize each task. To validate our approach on data with different levels of task relatedness, we first apply it to a synthetic dataset where we control the task relatedness. We show that the proposed approach performs better than baseline methods when the tasks are less related. We also show that the MMoE structure results in an additional trainability benefit, depending on different levels of randomness in the training data and model initialization. Furthermore, we demonstrate the performance improvements by MMoE on real tasks including a binary classification benchmark, and a large-scale content recommendation system at Google.
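The abstract describes the core mechanism: expert subnetworks shared across tasks, with a separate gating network per task that mixes expert outputs. Below is a minimal NumPy sketch of that forward pass, not the paper's implementation; the single-layer ReLU experts, linear gates, and layer sizes are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class MMoE:
    """Sketch of a Multi-gate Mixture-of-Experts forward pass (inference only).

    All tasks share the same pool of experts; each task has its own
    softmax gating network and its own output tower.
    """

    def __init__(self, d_in, d_expert, n_experts, n_tasks, seed=0):
        rng = np.random.default_rng(seed)
        # Shared experts: each is a single ReLU layer (an assumption here).
        self.W_experts = rng.normal(0.0, 0.1, (n_experts, d_in, d_expert))
        # One gating network per task: linear map + softmax over experts.
        self.W_gates = rng.normal(0.0, 0.1, (n_tasks, d_in, n_experts))
        # One small tower per task producing a scalar score.
        self.W_towers = rng.normal(0.0, 0.1, (n_tasks, d_expert))

    def forward(self, x):
        # x: (batch, d_in) -> expert_out: (batch, n_experts, d_expert)
        expert_out = np.maximum(0.0, np.einsum('bi,eij->bej', x, self.W_experts))
        outputs = []
        for k in range(self.W_gates.shape[0]):
            gate = softmax(x @ self.W_gates[k])                # (batch, n_experts)
            mixed = np.einsum('be,bej->bj', gate, expert_out)  # task-specific mix
            outputs.append(mixed @ self.W_towers[k])           # (batch,)
        return outputs
```

Because the gates are trained per task, each task can weight the shared experts differently, which is how the model adapts to varying task relatedness without duplicating all parameters per task.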

Keywords

Computer science, Task (project management), Initialization, Machine learning, Artificial intelligence, Benchmark (surveying), Multi-task learning, Task analysis, Baseline (sea), Artificial neural network


Publication Info

Year: 2018
Type: article
Pages: 1930-1939
Citations: 987
Access: Closed


Citation Metrics

987 (OpenAlex)

Cite This

Jiaqi Ma, Zhe Zhao, Xinyang Yi et al. (2018). Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts. 1930-1939. https://doi.org/10.1145/3219819.3220007

Identifiers

DOI
10.1145/3219819.3220007