AITopics | Statistical Learning

Collaborating Authors

Statistical Learning

News Overviews Instructional Materials AI-Alerts Classics

Coresets for Scalable Bayesian Logistic Regression

Jonathan Huggins, Trevor Campbell, Tamara Broderick

Neural Information Processing SystemsMar-23-2026, 04:37:36 GMT

The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical models, uncertainty quantification, and prior specification they provide. Standard Bayesian inference algorithms are computationally expensive, however, making their direct application to large datasets difficult or infeasible. Recent work on scaling Bayesian inference has focused on modifying the underlying algorithms to, for example, use only a random data subsample at each iteration. We leverage the insight that data is often redundant to instead obtain a weighted subset of the data (called a coreset) that is much smaller than the original dataset. We can then use this small coreset in any number of existing posterior inference algorithms without modification.

artificial intelligence, bayesian inference, machine learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.85)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

The Sound of APALM Clapping: Faster Nonsmooth Nonconvex Optimization with Stochastic Asynchronous PALM

Damek Davis, Brent Edmunds, Madeleine Udell

Neural Information Processing SystemsMar-23-2026, 04:37:18 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, processor, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.28)
Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Add feedback

Stochastic Optimization for Large-scale Optimal Transport

Aude Genevay, Marco Cuturi, Gabriel Peyré, Francis Bach

Neural Information Processing SystemsMar-23-2026, 04:36:59 GMT

Optimal transport (OT) defines a powerful framework to compare probability distributions in a geometrically faithful way. However, the practical impact of OT is still limited because of its computational burden. We propose a new class of stochastic optimization algorithms to cope with large-scale OT problems. These methods can handle arbitrary distributions (either discrete or continuous) as long as one is able to draw samples from them, which is the typical setup in highdimensional learning problems.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Spectral Learning of Dynamic Systems from Nonequilibrium Data

Hao Wu, Frank Noe

Neural Information Processing SystemsMar-23-2026, 04:35:45 GMT

Observable operator models (OOMs) and related models are one of the most important and powerful tools for modeling and analyzing stochastic systems. They exactly describe dynamics of finite-rank systems and can be efficiently and consistently estimated through spectral learning under the assumption of identically distributed data. In this paper, we investigate the properties of spectral learning without this assumption due to the requirements of analyzing large-time scale systems, and show that the equilibrium dynamics of a system can be extracted from nonequilibrium observation data by imposing an equilibrium constraint. In addition, we propose a binless extension of spectral learning for continuous data. In comparison with the other continuous-valued spectral algorithms, the binless algorithm can achieve consistent estimation of equilibrium dynamics with only linear complexity.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization

Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alexander J. Smola

Neural Information Processing SystemsMar-23-2026, 04:35:26 GMT

We analyze stochastic algorithms for optimizing nonconvex, nonsmooth finite-sum problems, where the nonsmooth part is convex. Surprisingly, unlike the smooth case, our knowledge of this fundamental problem is very limited. For example, it is not known whether the proximal stochastic gradient method with constant minibatch converges to a stationary point. To tackle this issue, we develop fast stochastic algorithms that provably converge to a stationary point for constant minibatches. Furthermore, using a variant of these algorithms, we obtain provably faster convergence than batch proximal gradient descent. Our results are based on the recent variance reduction techniques for convex optimization but with a novel analysis for handling nonconvex and nonsmooth functions. We also prove global linear convergence rate for an interesting subclass of nonsmooth nonconvex functions, which subsumes several recent works.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.57)

Add feedback

Cyclades: Conflict-free Asynchronous Machine Learning

Xinghao Pan, Maximilian Lam, Stephen Tu, Dimitris Papailiopoulos, Ce Zhang, Michael I. Jordan, Kannan Ramchandran, Christopher Ré

Neural Information Processing SystemsMar-23-2026, 04:35:09 GMT

We present CYCLADES, a general framework for parallelizing stochastic optimization algorithms in a shared memory setting. CYCLADES is asynchronous during model updates, and requires no memory locking mechanisms, similar to HOGWILD!-type algorithms. Unlike HOGWILD!, CYCLADES introduces no conflicts during parallel execution, and offers a black-box analysis for provable speedups across a large family of algorithms. Due to its inherent cache locality and conflictfree nature, our multi-core implementation of CYCLADES consistently outperforms HOGWILD!-type algorithms on sufficiently sparse datasets, leading to up to 40% speedup gains compared to HOGWILD!, and up to 5 gains over asynchronous implementations of variance reduction algorithms.

artificial intelligence, hogwild, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Feature selection in functional data classification with recursive maxima hunting

José L. Torrecilla, Alberto Suárez

Neural Information Processing SystemsMar-23-2026, 04:08:52 GMT

Dimensionality reduction is one of the key issues in the design of effective machine learning methods for automatic induction. In this work, we introduce recursive maxima hunting (RMH) for variable selection in classification problems with functional data. In this context, variable selection techniques are especially attractive because they reduce the dimensionality, facilitate the interpretation and can improve the accuracy of the predictive models. The method, which is a recursive extension of maxima hunting (MH), performs variable selection by identifying the maxima of a relevance function, which measures the strength of the correlation of the predictor functional variable with the class label. At each stage, the information associated with the selected variable is removed by subtracting the conditional expectation of the process. The results of an extensive empirical evaluation are used to illustrate that, in the problems investigated, RMH has comparable or higher predictive accuracy than standard dimensionality reduction techniques, such as PCA and PLS, and state-of-the-art feature selection methods for functional data, such as maxima hunting.

artificial intelligence, classification, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > Spain (0.29)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Graph Clustering: Block-models and model free results

Yali Wan, Marina Meila

Neural Information Processing SystemsMar-23-2026, 04:08:15 GMT

Clustering graphs under the Stochastic Block Model (SBM) and extensions are well studied. Guarantees of correctness exist under the assumption that the data is sampled from a model. In this paper, we propose a framework, in which we obtain "correctness" guarantees without assuming the data comes from a model. The guarantees we obtain depend instead on the statistics of the data that can be checked. We also show that this framework ties in with the existing model-based framework, and that we can exploit results in model-based recovery, as well as strengthen the results existing in that area of research.

artificial intelligence, machine learning, sbm, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Washington > King County > Seattle (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.83)

Add feedback

Dynamic Network Surgery for Efficient DNNs

Yiwen Guo, Anbang Yao, Yurong Chen

Neural Information Processing SystemsMar-23-2026, 04:08:01 GMT

Deep learning has become a ubiquitous technology to improve machine intelligence. However, most of the existing deep models are structurally very complex, making them difficult to be deployed on the mobile platforms with limited computational power. In this paper, we propose a novel network compression method called dynamic network surgery, which can remarkably reduce the network complexity by making on-the-fly connection pruning. Unlike the previous methods which accomplish this task in a greedy way, we properly incorporate connection splicing into the whole process to avoid incorrect pruning and make it as a continual network maintenance. The effectiveness of our method is proved with experiments. Without any accuracy loss, our method can efficiently compress the number of parameters in LeNet-5 and AlexNet by a factor of 108 and 17.7 respectively, proving that it outperforms the recent pruning method by considerable margins.

artificial intelligence, machine learning, neural network, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Greedy Feature Construction

Neural Information Processing SystemsMar-23-2026, 04:07:42 GMT

We present an effective method for supervised feature construction. The main goal of the approach is to construct a feature representation for which a set of linear hypotheses is of sufficient capacity - large enough to contain a satisfactory solution to the considered problem and small enough to allow good generalization from a small number of training examples. We achieve this goal with a greedy procedure that constructs features by empirically fitting squared error residuals. The proposed constructive procedure is consistent and can output a rich set of features. The effectiveness of the approach is evaluated empirically by fitting a linear ridge regression model in the constructed feature space and our empirical results indicate a superior performance of our approach over competing methods.

artificial intelligence, machine learning, sequence, (18 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Add feedback