Maya Gupta
On Making Stochastic Classifiers Deterministic
Andrew Cotter, Maya Gupta, Harikrishna Narasimhan
Stochastic classifiers arise in a number of machine learning problems, and have become especially prominent of late, as they often result from constrained optimization problems, e.g. for fairness, churn, or custom losses. Despite their utility, the inherent randomness of stochastic classifiers can make them problematic to use in practice for a variety of reasons. In this paper, we attempt to answer the theoretical question of how well a stochastic classifier can be approximated by a deterministic one, and compare several different approaches, proving lower and upper bounds. We also experimentally investigate the pros and cons of these methods, not only in regard to how successfully each deterministic classifier approximates the original stochastic classifier, but also in terms of how well each addresses the other issues that can make stochastic classifiers undesirable.
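To make the notion of a stochastic classifier concrete, the following is a minimal numpy sketch, not the paper's construction: a classifier that predicts positive with probability equal to its score, alongside the obvious deterministic approximation that thresholds the score at 0.5 (the score values and the 0.5 threshold are illustrative assumptions).

    import numpy as np

    def stochastic_predict(scores, rng):
        # Stochastic classifier: predict 1 with probability equal to the score.
        # Scores are assumed to lie in [0, 1], e.g. calibrated model outputs.
        return (rng.uniform(size=scores.shape) < scores).astype(int)

    def deterministic_predict(scores, threshold=0.5):
        # One simple derandomization: threshold the expected prediction.
        return (scores >= threshold).astype(int)

    rng = np.random.default_rng(0)
    scores = np.array([0.1, 0.4, 0.6, 0.9])
    print(stochastic_predict(scores, rng))   # varies with the random draw
    print(deterministic_predict(scores))     # [0 0 1 1], fixed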
Optimizing Generalized Rate Metrics with Three Players
Harikrishna Narasimhan, Andrew Cotter, Maya Gupta
We present a general framework for solving a large class of learning problems with non-linear functions of classification rates. This includes problems where one wishes to optimize a non-decomposable performance metric such as the F-measure or G-mean, and constrained training problems where the classifier needs to satisfy non-linear rate constraints such as predictive parity fairness, distribution divergences or churn ratios. We extend previous two-player game approaches for constrained optimization to an approach with three players to decouple the classifier rates from the non-linear objective, and seek to find an equilibrium of the game. Our approach generalizes many existing algorithms, and makes possible new algorithms with more flexibility and tighter handling of non-linear rate constraints. We provide convergence guarantees for convex functions of rates, and show how our methodology can be extended to handle sums-of-ratios of rates. Experiments on different fairness tasks confirm the efficacy of our approach.
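For reference, the sketch below (illustrative only, not the three-player algorithm) shows the kind of non-decomposable metrics referred to above, written as non-linear functions of classification rates computed from a confusion matrix; the small label arrays are made up for the example.

    import numpy as np

    def rates(y_true, y_pred):
        # Classification rates from binary labels and predictions.
        tp = np.sum((y_true == 1) & (y_pred == 1))
        fp = np.sum((y_true == 0) & (y_pred == 1))
        fn = np.sum((y_true == 1) & (y_pred == 0))
        tn = np.sum((y_true == 0) & (y_pred == 0))
        tpr = tp / max(tp + fn, 1)       # true-positive rate (recall)
        tnr = tn / max(tn + fp, 1)       # true-negative rate
        prec = tp / max(tp + fp, 1)      # precision
        return tpr, tnr, prec

    def f_measure(y_true, y_pred):
        # F-measure: a ratio (hence non-linear) function of rates.
        tpr, _, prec = rates(y_true, y_pred)
        return 2 * prec * tpr / max(prec + tpr, 1e-12)

    def g_mean(y_true, y_pred):
        # G-mean: geometric mean of TPR and TNR, also non-decomposable.
        tpr, tnr, _ = rates(y_true, y_pred)
        return np.sqrt(tpr * tnr)

    y_true = np.array([1, 1, 0, 0, 1, 0])
    y_pred = np.array([1, 0, 0, 1, 1, 0])
    print(f_measure(y_true, y_pred), g_mean(y_true, y_pred))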
Launch and Iterate: Reducing Prediction Churn
Mahdi Milani Fard, Quentin Cormier, Kevin Canini, Maya Gupta
Practical applications of machine learning often involve successive training iterations with changes to features and training examples. Ideally, changes in the output of any new model should only be improvements (wins) over the previous iteration, but in practice the predictions may change neutrally for many examples, resulting in extra net-zero wins and losses, referred to as unnecessary churn. These prediction changes are problematic for usability in some applications, and make it harder and more expensive to measure whether a change is a statistically significant improvement. In this paper, we formulate the problem and present a stabilization operator to regularize a classifier towards a previous classifier. We use a Markov chain Monte Carlo stabilization operator to produce a model with more consistent predictions without adversely affecting accuracy. We investigate the properties of the proposal with theoretical analysis. Experiments on benchmark datasets for different classification algorithms demonstrate the method and the resulting reduction in churn.
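As a rough sketch of the quantity being controlled, churn between an old and a new classifier can be measured as the fraction of examples whose predicted label changes; the blending baseline below is a generic stabilization heuristic, not the paper's Markov chain Monte Carlo operator, and the weight alpha is an assumed hyperparameter.

    import numpy as np

    def churn(old_preds, new_preds):
        # Fraction of examples whose predicted label changed between models.
        return np.mean(old_preds != new_preds)

    def stabilized_targets(labels, old_preds, alpha=0.3):
        # Generic baseline: blend true labels with the old model's predictions,
        # pulling the retrained model toward agreement with its predecessor.
        return (1.0 - alpha) * labels + alpha * old_preds

    old_preds = np.array([1, 0, 1, 1, 0])
    new_preds = np.array([1, 1, 1, 0, 0])
    print(churn(old_preds, new_preds))   # 0.4: two of five labels flipped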
Fast and Flexible Monotonic Functions with Ensembles of Lattices
Mahdi Milani Fard, Kevin Canini, Andrew Cotter, Jan Pfeifer, Maya Gupta
For many machine learning problems, there are some inputs that are known to be positively (or negatively) related to the output, and in such cases training the model to respect that monotonic relationship can provide regularization, and makes the model more interpretable. However, flexible monotonic functions are computationally challenging to learn beyond a few features. We break through this barrier by learning ensembles of monotonic calibrated interpolated look-up tables (lattices). A key contribution is an automated algorithm for selecting feature subsets for the ensemble base models. We demonstrate that compared to random forests, these ensembles produce similar or better accuracy, while providing guaranteed monotonicity consistent with prior knowledge, smaller model size and faster evaluation.
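To make "interpolated look-up table (lattice)" concrete, here is a minimal numpy sketch of a 2-D lattice evaluated by bilinear interpolation; when the table values are non-decreasing along an axis, the interpolated function is monotonic in that input. The parameter values are illustrative, not a trained model, and the calibration and ensembling steps are omitted.

    import numpy as np

    # A 2x2 lattice (look-up table) on the unit square. Values increase along
    # both axes, so the interpolated function is monotonic in both inputs.
    lattice = np.array([[0.0, 0.4],
                        [0.5, 1.0]])  # lattice[i, j] = value at corner x0=i, x1=j

    def lattice_eval(x0, x1):
        # Bilinear interpolation of the 2x2 lattice at (x0, x1) in [0, 1]^2.
        v00, v01 = lattice[0, 0], lattice[0, 1]
        v10, v11 = lattice[1, 0], lattice[1, 1]
        return ((1 - x0) * (1 - x1) * v00 + (1 - x0) * x1 * v01
                + x0 * (1 - x1) * v10 + x0 * x1 * v11)

    print(lattice_eval(0.25, 0.75))  # increases if either input increases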
Diminishing Returns Shape Constraints for Interpretability and Regularization
Maya Gupta, Dara Bahri, Andrew Cotter, Kevin Canini
We investigate machine learning models that can provide diminishing returns and accelerating returns guarantees to capture prior knowledge or policies about how outputs should depend on inputs. We show that one can build flexible, nonlinear, multi-dimensional models using lattice functions with any combination of concavity/convexity and monotonicity constraints on any subsets of features, and compare to new shape-constrained neural networks. We demonstrate on real-world examples that these shape constrained models can provide tuning-free regularization and improve model understandability.
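As an illustration of the constraint itself (a generic sketch assuming a 1-D piecewise-linear function on fixed keypoints, not the paper's lattice or neural-network construction), diminishing returns corresponds to concavity, i.e. segment slopes that are non-increasing from left to right:

    import numpy as np

    def has_diminishing_returns(keypoints_x, keypoints_y):
        # A piecewise-linear function is concave (diminishing returns) iff the
        # slopes of its segments are non-increasing from left to right.
        slopes = np.diff(keypoints_y) / np.diff(keypoints_x)
        return bool(np.all(np.diff(slopes) <= 1e-12))

    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([0.0, 0.8, 1.3, 1.5])    # slopes 0.8, 0.5, 0.2: non-increasing
    print(has_diminishing_returns(x, y))  # True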