AITopics | Mehryar Mohri

Efficient Gradient Computation for Structured Output Learning with Rational and Tropical Losses

Many structured prediction problems admit a natural loss function for evaluation such as the edit-distance or n-gram loss. However, existing learning algorithms are typically designed to optimize alternative objectives such as the cross-entropy. This is because a naïve implementation of the natural loss functions often results in intractable gradient computations. In this paper, we design efficient gradient computation algorithms for two broad families of structured prediction loss functions: rational and tropical losses. These families include as special cases the n-gram loss, the edit-distance loss, and many other loss functions commonly used in natural language processing and computational biology tasks that are based on sequence similarity measures. Our algorithms make use of weighted automata and graph operations over appropriate semirings to design efficient solutions. They facilitate efficient gradient computation and hence enable one to train learning models such as neural networks with complex structured losses.
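
As a rough illustration of one loss named in the abstract, edit distance is commonly computed as a shortest path under the operations (min, +), which are those of the tropical semiring. The sketch below is not the paper's algorithm, which relies on weighted automata and targets gradient computation; it only shows the (min, +) recurrence. The function name `edit_distance` and the unit edit costs are assumptions made for illustration.

```python
def edit_distance(x, y):
    """Levenshtein distance via a (min, +) dynamic program.

    Assumption: unit cost for insertion, deletion, and substitution.
    """
    # prev[j] holds the distance between the current prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, a in enumerate(x, start=1):
        curr = [i]  # deleting i symbols turns x[:i] into the empty string
        for j, b in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]
```

Each table cell takes a minimum over path costs that are themselves sums of edit costs, which is the semiring pattern the abstract alludes to.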

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.28)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Bandits with Feedback Graphs and Switching Costs

Raman Arora, Teodor Vanislavov Marinov, Mehryar Mohri

Neural Information Processing Systems | Mar-26-2025, 23:13:30 GMT

We study the adversarial multi-armed bandit problem where the learner is supplied with partial observations modeled by a feedback graph and where shifting to a new action incurs a fixed switching cost.

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America (0.46)

Industry: Education > Educational Setting > Online (0.30)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.71)

Learning GANs and Ensembles Using Discrepancy

Ben Adlam, Corinna Cortes, Mehryar Mohri, Ningshan Zhang

Neural Information Processing Systems | Mar-26-2025, 15:26:29 GMT

Generative adversarial networks (GANs) generate data based on minimizing a divergence between two distributions. The choice of that divergence is therefore critical. We argue that the divergence must take into account the hypothesis set and the loss function used in a subsequent learning task, where the data generated by a GAN serves for training. Taking that structural information into account is also important to derive generalization guarantees. Thus, we propose to use the discrepancy measure, which was originally introduced for the closely related problem of domain adaptation and which precisely takes into account the hypothesis set and the loss function. We show that discrepancy admits favorable properties for training GANs and prove explicit generalization guarantees. We present efficient algorithms using discrepancy for two tasks: training a GAN directly, namely DGAN, and mixing previously trained generative models, namely EDGAN. Our experiments on toy examples and several benchmark datasets show that DGAN is competitive with other GANs and that EDGAN outperforms existing GAN ensembles, such as AdaGAN.

artificial intelligence, discrepancy, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Policy Regret in Repeated Games

Raman Arora, Michael Dinitz, Teodor Vanislavov Marinov, Mehryar Mohri

Neural Information Processing Systems | Mar-26-2025, 11:40:49 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, policy regret, (20 more...)

Neural Information Processing Systems

Country: North America (0.46)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Algorithms and Theory for Multiple-Source Adaptation

Judy Hoffman, Mehryar Mohri, Ningshan Zhang

Neural Information Processing Systems | Mar-23-2025, 22:30:25 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, predictor, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

Regularized Gradient Boosting

Corinna Cortes, Mehryar Mohri, Dmitry Storcheus

Neural Information Processing Systems | Jan-23-2025, 08:52:39 GMT

Gradient Boosting (GB) is a popular and very successful ensemble method for binary trees. While various types of regularization of the base predictors are used with this algorithm, the theory that connects such regularizations with generalization guarantees is poorly understood. We fill this gap by deriving data-dependent learning guarantees for GB used with regularization, expressed in terms of the Rademacher complexities of the constrained families of base predictors. We introduce a new algorithm, called RGB, that directly benefits from these generalization bounds and that, at every boosting round, applies the Structural Risk Minimization principle to search for a base predictor with the best empirical fit versus complexity trade-off. Inspired by Randomized Coordinate Descent, we provide a scalable implementation of our algorithm that can search over large families of base predictors. Finally, we provide experimental results demonstrating that our algorithm achieves significantly better out-of-sample performance on multiple datasets than the standard GB algorithm used with its regularization.

algorithm, artificial intelligence, machine learning, (13 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.88)

Optimistic Bandit Convex Optimization

Scott Yang, Mehryar Mohri

Neural Information Processing Systems | Jan-20-2025, 18:05:46 GMT

We introduce the general and powerful scheme of predicting information re-use in optimization algorithms. This allows us to devise a computationally efficient algorithm for bandit convex optimization with new state-of-the-art guarantees for both Lipschitz loss functions and loss functions with Lipschitz gradients.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Efficient Gradient Computation for Structured Output Learning with Rational and Tropical Losses
