AITopics | order matter

Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

Neural Information Processing Systems, Dec-26-2025, 21:03:40 GMT

In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method of pre-training on high-resource tasks, followed by fine-tuning on a mixture of high/low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze under what data regimes this method is applicable and show its improvements empirically in neural machine translation (NMT) and multi-lingual language modeling.
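The two-phase schedule described in the abstract (high-resource tasks only, then a fixed high/low-resource mixture) can be sketched as a step-dependent task sampler. This is a minimal illustration, not the authors' implementation: the function names, the task labels, the step threshold, and the mixture weights are all hypothetical, since the abstract does not give them.

```python
import random

def task_weights(step, pretrain_steps, high_resource, low_resource, mix_weights):
    """Return sampling probabilities over all tasks at a given training step.

    All parameter names and values are illustrative assumptions.
    """
    all_tasks = high_resource + low_resource
    if step < pretrain_steps:
        # Phase 1 (pre-training): sample uniformly from high-resource tasks only.
        share = 1.0 / len(high_resource)
        return {t: (share if t in high_resource else 0.0) for t in all_tasks}
    # Phase 2 (fine-tuning): a fixed mixture over high- and low-resource tasks.
    total = sum(mix_weights[t] for t in all_tasks)
    return {t: mix_weights[t] / total for t in all_tasks}

def sample_task(step, rng, **schedule):
    """Draw the task to train on at this step according to the schedule."""
    weights = task_weights(step, **schedule)
    tasks = list(weights)
    return rng.choices(tasks, weights=[weights[t] for t in tasks], k=1)[0]

# Hypothetical setup: one high-resource and one low-resource language pair.
schedule = dict(
    pretrain_steps=100,
    high_resource=["en-fr"],
    low_resource=["en-ro"],
    mix_weights={"en-fr": 3.0, "en-ro": 1.0},
)
rng = random.Random(0)
first_phase = [sample_task(s, rng, **schedule) for s in range(100)]
second_phase = [sample_task(s, rng, **schedule) for s in range(100, 200)]
```

In this sketch the low-resource task is never drawn during the first phase and appears only once the mixture phase begins; the baseline the abstract compares against, static weighting, would correspond to using the phase-2 weights from step 0.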

dataset imbalance, name change, order matter

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation

Chen, Xiaohui, Han, Xu, Hu, Jiajing, Ruiz, Francisco J. R., Liu, Liping

arXiv.org Machine Learning, Jun-14-2021

A graph generative model defines a distribution over graphs. One type of generative model is constructed by autoregressive neural networks, which sequentially add nodes and edges to generate a graph. However, the likelihood of a graph under the autoregressive model is intractable, as there are numerous sequences leading to the given graph; this makes maximum likelihood estimation challenging. Instead, in this work we derive the exact joint probability over the graph and the node ordering of the sequential process. From the joint, we approximately marginalize out the node orderings and compute a lower bound on the log-likelihood using variational inference. We train graph generative models by maximizing this bound, without using the ad-hoc node orderings of previous methods. Our experiments show that the log-likelihood bound is significantly tighter than the bound of previous schemes. Moreover, the models fitted with the proposed algorithm can generate high-quality graphs that match the structures of target graphs not seen during training. We have made our code publicly available at https://github.com/tufts-ml/graph-generation-vi.
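The lower bound mentioned in the abstract has the generic shape of a variational bound obtained by marginalizing over orderings. As a sketch, writing $G$ for the graph, $\pi$ for a node ordering, and $q(\pi \mid G)$ for a variational distribution over orderings (this notation is assumed here and may differ from the paper's):

```latex
\log p(G)
  = \log \sum_{\pi} p(G, \pi)
  = \log \mathbb{E}_{q(\pi \mid G)}\!\left[ \frac{p(G, \pi)}{q(\pi \mid G)} \right]
  \;\ge\; \mathbb{E}_{q(\pi \mid G)}\!\left[ \log p(G, \pi) - \log q(\pi \mid G) \right]
```

The inequality is Jensen's inequality applied to the concave logarithm. The sum over all orderings $\pi$ is what makes the exact likelihood intractable, while the right-hand side can be estimated by sampling orderings from $q$.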

adjacency matrix, graph, node

2106.06189

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Massachusetts > Middlesex County > Medford (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Order Matters: Alibaba's Transformer-based Recommender System

#artificialintelligence, Aug-25-2019, 06:36:35 GMT

Alibaba, the largest e-commerce platform in China, is a powerhouse not only when it comes to e-commerce, but also when it comes to recommender systems research. Their latest paper, Behaviour Sequence Transformer for E-commerce Recommendation in Alibaba, is yet another publication that pushes the state of the art in recommender systems. In this work, they make use of the popular Transformer model to capture sequential signals in user behaviour in online shopping, in order to perform next click prediction. Recommender systems often make use of a 2-stage paradigm of retrieval and ranking, and Alibaba's approach is no different. The retrieval step used at Alibaba consists of selecting, with high recall, a subset of a million relevant candidate items from the entire item set (which is of course much larger than a million possible items), and the ranking step consists of ranking these candidates with high precision.
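The 2-stage retrieval-and-ranking paradigm described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Alibaba's system: the dot-product retrieval score, the item vectors, and the `ranker_scores` table are hypothetical stand-ins (in the paper, the ranking stage is a Transformer over the user's behaviour sequence).

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def retrieve(user_vec, item_vecs, k):
    """Stage 1: score every item cheaply and keep the top-k candidates.

    The goal is high recall: the candidate set should contain most
    relevant items, even if their order is still rough.
    """
    scores = {item: dot(user_vec, vec) for item, vec in item_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def rank(candidates, score_fn):
    """Stage 2: re-order the small candidate set with a more precise scorer."""
    return sorted(candidates, key=score_fn, reverse=True)

# Hypothetical catalogue of four items with 2-d embeddings.
item_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1], "d": [-1.0, 0.0]}
user_vec = [1.0, 0.0]

candidates = retrieve(user_vec, item_vecs, k=2)

# Stand-in for an expensive ranking model (e.g. predicted click probability).
ranker_scores = {"a": 0.2, "c": 0.7}
ranked = rank(candidates, lambda item: ranker_scores[item])
```

The design point is cost: the cheap scorer runs over the whole catalogue, while the expensive scorer only ever sees the `k` retrieved candidates.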

artificial intelligence, deep learning, machine learning

Country: Asia > China (0.25)

Industry: Information Technology > Services > e-Commerce Services (0.98)

Technology: