AITopics | David Tse

A Minimax Approach to Supervised Learning

Neural Information Processing SystemsJun-2-2025, 02:46:21 GMT

Given a task of predicting Y from X, a loss function L, and a set of probability distributions Γ on (X, Y), what is the optimal decision rule minimizing the worstcase expected loss over Γ? In this paper, we address this question by introducing a generalization of the maximum entropy principle. Applying this principle to sets of distributions with marginal on X constrained to be the empirical marginal, we provide a minimax interpretation of the maximum likelihood problem over generalized linear models as well as some popular regularization schemes. For quadratic and logarithmic loss functions we revisit well-known linear and logistic regression models. Moreover, for the 0-1 loss we derive a classifier which we call the minimax SVM. The minimax SVM minimizes the worst-case expected 0-1 loss over the proposed Γ by solving a tractable optimization problem. We perform several numerical experiments to show the power of the minimax SVM in outperforming the SVM.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County (0.14)
Europe > Spain (0.14)

Genre: Research Report > New Finding (0.35)

Add feedback

NeuralFDR: Learning Discovery Thresholds from Hypothesis Features

Fei Xia, Martin J. Zhang, James Y. Zou, David Tse

Neural Information Processing SystemsMay-28-2025, 05:58:42 GMT

As datasets grow richer, an important challenge is to leverage the full features in the data to maximize the number of useful discoveries while controlling for false positives. We address this problem in the context of multiple hypotheses testing, where for each hypothesis, we observe a p-value along with a set of features specific to that hypothesis. For example, in genetic association studies, each hypothesis tests the correlation between a variant and the trait. We have a rich set of features for each variant (e.g. its location, conservation, epigenetics etc.) which could inform how likely the variant is to have a true association. However popular empirically-validated testing approaches, such as Benjamini-Hochberg's procedure (BH) and independent hypothesis weighting (IHW), either ignore these features or assume that the features are categorical or uni-variate. We propose a new algorithm, NeuralFDR, which automatically learns a discovery threshold as a function of all the hypothesis features. We parametrize the discovery threshold as a neural network, which enables flexible handling of multi-dimensional discrete and continuous features as well as efficient end-to-end optimization. We prove that NeuralFDR has strong false discovery rate (FDR) guarantees, and show that it makes substantially more discoveries in synthetic and real datasets. Moreover, we demonstrate that the learned discovery threshold is directly interpretable.

artificial intelligence, hypothesis, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Genre: Research Report > Experimental Study (0.89)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Tensor Biclustering

Soheil Feizi, Hamid Javadi, David Tse

Neural Information Processing SystemsMay-28-2025, 05:32:03 GMT

Consider a dataset where data is collected on multiple features of multiple individuals over multiple times. This type of data can be represented as a three dimensional individual/feature/time tensor and has become increasingly prominent in various areas of science. The tensor biclustering problem computes a subset of individuals and a subset of features whose signal trajectories over time lie in a low-dimensional subspace, modeling similarity among the signal trajectories while allowing different scalings across different individuals or different features. We study the information-theoretic limit of this problem under a generative model. Moreover, we propose an efficient spectral algorithm to solve the tensor biclustering problem and analyze its achievability bound in an asymptotic regime. Finally, we show the efficiency of our proposed method in several synthetic and real datasets.

artificial intelligence, machine learning, tensor, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Porcupine Neural Networks: Approximating Neural Network Landscapes

Soheil Feizi, Hamid Javadi, Jesse Zhang, David Tse

Neural Information Processing SystemsMay-26-2025, 13:21:41 GMT

Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex which makes analyzing their performance challenging. In this paper, we take another approach to this problem by constraining the network such that the corresponding optimization landscape has good theoretical properties without significantly compromising performance. In particular, for two-layer neural networks we introduce Porcupine Neural Networks (PNNs) whose weight vectors are constrained to lie over a finite set of lines. We show that most local optima of PNN optimizations are global while we have a characterization of regions where bad local optimizers may exist. Moreover, our theoretical and empirical results suggest that an unconstrained neural network can be approximated using a polynomially-large PNN.

artificial intelligence, machine learning, neural network, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Maryland (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

A Convex Duality Framework for GANs

Farzan Farnia, David Tse

Neural Information Processing SystemsMay-26-2025, 07:54:22 GMT

Generative adversarial network (GAN) is a minimax game between a generator mimicking the true model and a discriminator distinguishing the samples produced by the generator from the real training samples. Given an unconstrained discriminator able to approximate any function, this game reduces to finding the generative model minimizing a divergence score, e.g. the Jensen-Shannon (JS) divergence, to the data distribution. However, in practice the discriminator is constrained to be in a smaller class F such as convolutional neural nets. Then, a natural question is how the divergence minimization interpretation will change as we constrain F. In this work, we address this question by developing a convex duality framework for analyzing GAN minimax problems. For a convex set F, this duality framework interprets the original vanilla GAN problem as finding the generative model with the minimum JS-divergence to the distributions penalized to match the moments of the data distribution, with the moments specified by the discriminators in F. We show that this interpretation more generally holds for f-GAN and Wasserstein GAN. We further apply the convex duality framework to explain why regularizing the discriminator's Lipschitz constant, e.g.

artificial intelligence, discriminator, machine learning, (12 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Porcupine Neural Networks: Approximating Neural Network Landscapes

Soheil Feizi, Hamid Javadi, Jesse Zhang, David Tse

Neural Information Processing SystemsMar-26-2025, 21:55:08 GMT

Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex which makes analyzing their performance challenging. In this paper, we take another approach to this problem by constraining the network such that the corresponding optimization landscape has good theoretical properties without significantly compromising performance. In particular, for two-layer neural networks we introduce Porcupine Neural Networks (PNNs) whose weight vectors are constrained to lie over a finite set of lines. We show that most local optima of PNN optimizations are global while we have a characterization of regions where bad local optimizers may exist. Moreover, our theoretical and empirical results suggest that an unconstrained neural network can be approximated using a polynomially-large PNN.

artificial intelligence, machine learning, neural network, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Ultra Fast Medoid Identification via Correlated Sequential Halving

Tavor Baharav, David Tse

Neural Information Processing SystemsMar-26-2025, 20:46:53 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Industry: Health & Medicine (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)

Add feedback

A Convex Duality Framework for GANs

Farzan Farnia, David Tse

Neural Information Processing SystemsMar-26-2025, 11:00:41 GMT

Generative adversarial network (GAN) is a minimax game between a generator mimicking the true model and a discriminator distinguishing the samples produced by the generator from the real training samples. Given an unconstrained discriminator able to approximate any function, this game reduces to finding the generative model minimizing a divergence score, e.g. the Jensen-Shannon (JS) divergence, to the data distribution. However, in practice the discriminator is constrained to be in a smaller class F such as convolutional neural nets. Then, a natural question is how the divergence minimization interpretation will change as we constrain F. In this work, we address this question by developing a convex duality framework for analyzing GAN minimax problems. For a convex set F, this duality framework interprets the original vanilla GAN problem as finding the generative model with the minimum JS-divergence to the distributions penalized to match the moments of the data distribution, with the moments specified by the discriminators in F. We show that this interpretation more generally holds for f-GAN and Wasserstein GAN. We further apply the convex duality framework to explain why regularizing the discriminator's Lipschitz constant, e.g.

artificial intelligence, discriminator, machine learning, (11 more...)

Neural Information Processing Systems

Country: North America (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Ultra Fast Medoid Identification via Correlated Sequential Halving

Tavor Baharav, David Tse

Neural Information Processing SystemsJan-26-2025, 23:39:44 GMT

The medoid of a set of n points is the point in the set that minimizes the sum of distances to other points.

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Clara County (0.14)

Industry: Health & Medicine (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

NeuralFDR: Learning Discovery Thresholds from Hypothesis Features

Fei Xia, Martin J. Zhang, James Y. Zou, David Tse

Neural Information Processing SystemsOct-4-2024, 11:31:26 GMT

As datasets grow richer, an important challenge is to leverage the full features in the data to maximize the number of useful discoveries while controlling for false positives. We address this problem in the context of multiple hypotheses testing, where for each hypothesis, we observe a p-value along with a set of features specific to that hypothesis. For example, in genetic association studies, each hypothesis tests the correlation between a variant and the trait. We have a rich set of features for each variant (e.g. its location, conservation, epigenetics etc.) which could inform how likely the variant is to have a true association. However popular empirically-validated testing approaches, such as Benjamini-Hochberg's procedure (BH) and independent hypothesis weighting (IHW), either ignore these features or assume that the features are categorical or uni-variate. We propose a new algorithm, NeuralFDR, which automatically learns a discovery threshold as a function of all the hypothesis features. We parametrize the discovery threshold as a neural network, which enables flexible handling of multi-dimensional discrete and continuous features as well as efficient end-to-end optimization. We prove that NeuralFDR has strong false discovery rate (FDR) guarantees, and show that it makes substantially more discoveries in synthetic and real datasets. Moreover, we demonstrate that the learned discovery threshold is directly interpretable.

artificial intelligence, hypothesis, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.89)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.47)

Technology: