Bregman Divergence for Stochastic Variance Reduction: Saddle-Point and Adversarial Prediction

Neural Information Processing Systems

Adversarial machines, in which a learner competes against an adversary, have recently regained much interest in machine learning. They naturally take the form of saddle-point optimization problems, often with separable structure but sometimes also with unmanageably large dimension. In this work we show that adversarial prediction under multivariate losses can be solved much faster than previously possible. We first reduce the problem size exponentially by using appropriate sufficient statistics, and then adapt the new stochastic variance-reduced algorithm of Balamurugan & Bach (2016) to allow any Bregman divergence. We prove that the same linear rate of convergence is retained, and we show that for adversarial prediction using the KL divergence we can further achieve a speedup by a factor of the number of examples compared with the Euclidean alternative. We verify the theoretical findings through extensive experiments on two example applications: adversarial prediction and LPboosting.
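The KL-divergence variant amounts to replacing the Euclidean proximal step with an entropic (multiplicative) update on the probability simplex. Below is a minimal, illustrative sketch of an SVRG-style loop with KL proximal steps for a toy bilinear saddle-point problem; the function names, step size, and the choice to variance-reduce only the y-gradient are my own simplifications, not the authors' implementation.

```python
import numpy as np

def kl_prox_step(p, grad, eta):
    # Bregman proximal step with KL divergence on the simplex:
    # argmin_q <grad, q> + (1/eta) * KL(q || p)  =>  multiplicative update
    q = p * np.exp(-eta * grad)
    return q / q.sum()

def svrg_saddle_kl(A, eta=0.05, epochs=10, inner=None):
    """Toy SVRG-flavored solver for min_x max_y x^T A y over two
    probability simplices, using entropic (KL) prox steps."""
    m, n = A.shape
    rng = np.random.default_rng(0)
    x, y = np.full(m, 1.0 / m), np.full(n, 1.0 / n)
    inner = inner or m
    for _ in range(epochs):
        # snapshot and full gradient: the variance-reduction anchor
        xs = x.copy()
        gy_full = A.T @ xs
        for _ in range(inner):
            i = rng.integers(m)           # sample one "example" (row of A)
            gx = A @ y                    # x-gradient kept exact for simplicity
            # variance-reduced stochastic y-gradient (unbiased for A^T x)
            gy = m * (A[i] * x[i] - A[i] * xs[i]) + gy_full
            x = kl_prox_step(x, gx, eta)  # descent step for the min player
            y = kl_prox_step(y, -gy, eta) # ascent step for the max player
    return x, y
```

On the 2x2 matching-pennies matrix the iterates stay near the uniform equilibrium, and each iterate remains a valid distribution by construction of the multiplicative update.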


An Empirical Bayes Approach to Optimizing Machine Learning Algorithms

Neural Information Processing Systems

There is rapidly growing interest in using Bayesian optimization to tune model and inference hyperparameters for machine learning algorithms that take a long time to run. For example, Spearmint is a popular software package for selecting the optimal number of layers and learning rate in neural networks. But given that there is uncertainty about which hyperparameters give the best predictive performance, and given that fitting a model for each choice of hyperparameters is costly, it is arguably wasteful to throw away all but the best result, as Bayesian optimization does. A related issue is the danger of overfitting the validation data when optimizing many hyperparameters. In this paper, we consider an alternative approach that uses more samples from the hyperparameter selection procedure to average over the uncertainty in model hyperparameters. The resulting approach, empirical Bayes for hyperparameter averaging (EB-Hyp), predicts held-out data better than Bayesian optimization in two experiments on latent Dirichlet allocation and deep latent Gaussian models. EB-Hyp suggests a simpler approach to evaluating and deploying machine learning algorithms that does not require a separate validation data set and hyperparameter selection procedure.
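The core averaging step can be sketched in a few lines: rather than keeping only the single best hyperparameter setting, weight each setting's predictive distribution by its (normalized) marginal likelihood. This is a toy illustration of the idea under assumed inputs, not the authors' code; the function name and argument shapes are hypothetical.

```python
import numpy as np

def eb_hyp_average(predictives, log_evidences):
    """Average per-hyperparameter predictive densities, weighting each
    hyperparameter setting by its normalized marginal likelihood.

    predictives   : (H, D) array, predictive density of each of H
                    hyperparameter settings at D held-out points
    log_evidences : (H,) array, log marginal likelihood per setting
    """
    log_w = log_evidences - np.max(log_evidences)  # stabilize the exponentials
    w = np.exp(log_w)
    w /= w.sum()                                   # normalized evidence weights
    return w @ np.asarray(predictives)
```

With equal evidences this reduces to a plain average; as one setting's evidence dominates, it approaches the Bayesian-optimization behavior of keeping only the best setting.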


From which world is your graph

Neural Information Processing Systems

Discovering statistical structure from links is a fundamental problem in the analysis of social networks. Choosing a misspecified model, or equivalently an incorrect inference algorithm, will result in an invalid analysis, or even falsely uncover patterns that are in fact artifacts of the model. This work focuses on unifying two of the most widely used link-formation models: the stochastic block model (SBM) and the small-world (or latent-space) model (SWM). Integrating techniques from kernel learning, spectral graph theory, and nonlinear dimensionality reduction, we develop the first statistically sound polynomial-time algorithm to discover latent patterns in sparse graphs for both models. When the network comes from an SBM, the algorithm outputs a block structure. When it is from an SWM, the algorithm outputs estimates of each node's latent position.
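The spectral flavor of such algorithms can be illustrated with a toy numpy sketch: embed nodes using the leading eigenvectors of the adjacency matrix, then read block membership off the embedding. This is only a cartoon of the dense-SBM case under made-up parameters, not the paper's full algorithm (which also handles sparse graphs and the latent-space model).

```python
import numpy as np

def spectral_embed(adj, dim=2):
    """Embed nodes via the top (largest-magnitude) eigenvectors of the
    adjacency matrix, scaled by their eigenvalues. For an SBM the
    embedding separates the blocks; for a latent-space model it tracks
    the nodes' latent positions."""
    vals, vecs = np.linalg.eigh(adj)
    idx = np.argsort(np.abs(vals))[::-1][:dim]
    return vecs[:, idx] * np.abs(vals[idx])

# Toy 2-block SBM: dense within blocks, sparse across.
rng = np.random.default_rng(1)
n = 40
blocks = np.array([0] * 20 + [1] * 20)
P = np.where(blocks[:, None] == blocks[None, :], 0.6, 0.05)
A = rng.random((n, n)) < P
A = np.triu(A, 1)
A = (A + A.T).astype(float)          # symmetric, zero diagonal

emb = spectral_embed(A)
# the sign of the second coordinate splits the blocks (up to label flip)
labels = (emb[:, 1] > 0).astype(int)
```

The first eigenvector roughly encodes degree (all entries one sign), while the second separates the two communities, which is why thresholding its sign recovers the partition.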


A meteor exploded over Ohio and Pennsylvania

Popular Science

A very loud bang accompanied the disintegrating space rock, though little of the meteor is expected to have survived atmospheric entry. Residents across northeastern Ohio received a rude--or at least extremely unexpected--wake-up call this morning. According to the National Weather Service (NWS), the loud boom experienced across the region around 9 a.m. EDT on March 17 was most likely the result of a meteor disintegrating as it sped through Earth's atmosphere.


Watch: Iranians show daily life under air strikes and regime crackdown

BBC News

The BBC has obtained footage and interviews from the Iranian capital Tehran which evoke a city of strained nerves, of constant waiting for the next air strike and relentless fear of the state security apparatus. The identities of the people in this report have been protected. While independent journalists still try to gather testimony that offers a credible alternative view, they run the risk of arrest, torture and possibly worse. Video filmed by a witness and verified by the BBC shows a drone crashing close to the airport.


GPT-5.4 mini brings some of the smarts of OpenAI's latest model to ChatGPT Free and Go users

Engadget

The new model offers performance improvements in reasoning, multimodal understanding and more. When OpenAI released GPT-5.4 at the start of March, the company said the new model was designed primarily for professional work like programming and data analysis. Now OpenAI is launching GPT-5.4 mini and nano, and while it is once again highlighting the usefulness of these new systems for tasks like coding, one of the new models is available to Free and Go users. What's more, that model, GPT-5.4 mini, even offers performance that approaches GPT-5.4 in a handful of areas.


Dyson's New PencilWash Is Here

WIRED

Dyson's newest wet floor cleaner is available as of today. The debut follows the release of Dyson's newest robot vacuum and larger wet cleaner last week. After announcing several new models last year at IFA Berlin, Dyson has begun rolling out its latest suite of vacuums and wet floor cleaners to the public. Last week, Dyson's newest robot vacuum, the Spot+Scrub Ai ($1,200), became available for purchase online, along with the Clean+Wash Hygiene ($500), one of the brand's new wet floor cleaners. The recently announced Dyson PencilWash ($350) is available as of today.


Multi-Objective Non-parametric Sequential Prediction

Neural Information Processing Systems

Research on online learning has mainly focused on minimizing a single objective function. In many real-world applications, however, several objective functions must be considered simultaneously.


Gaussian process based nonlinear latent structure discovery in multivariate spike train data

Neural Information Processing Systems

A large body of recent work focuses on methods for extracting low-dimensional latent structure from multi-neuron spike train data. Most such methods employ either linear latent dynamics or linear mappings from latent space to log spike rates. Here we propose a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data. We introduce the Poisson Gaussian-Process Latent Variable Model (P-GPLVM), which consists of Poisson spiking observations and two underlying Gaussian processes--one governing a temporal latent variable and another governing a set of nonlinear tuning curves. The use of nonlinear tuning curves enables discovery of low-dimensional latent structure even when spike responses exhibit high linear dimensionality (e.g., as found in hippocampal place cell codes). To learn the model from data, we introduce the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves. We show that this method outperforms previous Laplace-approximation-based inference methods in both the speed of convergence and accuracy. We apply the model to spike trains recorded from hippocampal place cells and show that it compares favorably to a variety of previous methods for latent structure discovery, including variational auto-encoder (VAE) based methods that parametrize the nonlinear mapping from latent space to spike rates with a deep neural network.
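The generative model described above is easy to sketch: draw a latent path from a temporal GP, draw each neuron's log firing rate from a second GP evaluated at the latent locations, exponentiate to get rates, and emit Poisson counts. The kernel choices, dimensions, and the 1-D latent below are illustrative assumptions for a toy simulation, not the paper's settings.

```python
import numpy as np

def rbf_kernel(z, lengthscale=1.0, var=1.0, jitter=1e-6):
    """Squared-exponential kernel matrix over the points in z,
    with a small jitter for numerical stability."""
    d = z[:, None] - z[None, :]
    return var * np.exp(-0.5 * (d / lengthscale) ** 2) + jitter * np.eye(len(z))

rng = np.random.default_rng(0)
T, N = 100, 12                        # time bins, neurons
t = np.linspace(0.0, 10.0, T)

# 1) temporal GP: one draw gives the latent path x(t)
x = rng.multivariate_normal(np.zeros(T), rbf_kernel(t, lengthscale=2.0))

# 2) tuning-curve GP: each neuron's log rate is a GP over latent space,
#    evaluated at the latent path locations (this is the second nonlinearity)
K_x = rbf_kernel(x, lengthscale=0.5)
log_rates = np.stack(
    [rng.multivariate_normal(np.zeros(T), K_x) for _ in range(N)]
)

# 3) Poisson spiking observations through an exponential link
spikes = rng.poisson(np.exp(log_rates))
```

Inference then reverses this process: given only `spikes`, recover `x` while marginalizing over the tuning curves, which is what the decoupled Laplace approximation makes tractable.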


Tomography of the London Underground: a Scalable Model for Origin-Destination Data

Neural Information Processing Systems

The paper addresses the classical network tomography problem of inferring local traffic given origin-destination observations. Focussing on large, complex public transportation systems, we build a scalable model that exploits input-output information to estimate the unobserved link/station loads and the users' path preferences. Based on the reconstruction of the users' travel time distribution, the model is flexible enough to capture different path-choice strategies and correlations between users travelling on similar paths at similar times. The corresponding likelihood function is intractable for medium or large-scale networks, and we propose two distinct strategies: exact maximum-likelihood inference of an approximate but tractable model, and variational inference of the original intractable model. As an application of our approach, we consider the emblematic case of the London Underground network, where a tap-in/tap-out system tracks the start/exit time and location of all journeys in a day. A set of synthetic simulations and real data provided by Transport for London are used to validate and test the model on predictions of observable and unobservable quantities.
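The linear structure underlying the problem can be seen in a toy example: a route-incidence matrix maps origin-destination flows to link loads, so the forward direction is a matrix product, while the inverse direction is underdetermined, which is one reason a probabilistic model over path choices and travel times is needed. The network and counts below are made up for illustration and assume a single path per OD pair.

```python
import numpy as np

# Toy line network A-B-C-D with links AB, BC, CD and one path per OD pair.
# R[p, l] = 1 if OD pair p's path traverses link l.
R = np.array([
    [1, 0, 0],   # A->B
    [1, 1, 0],   # A->C
    [1, 1, 1],   # A->D
    [0, 1, 0],   # B->C
    [0, 1, 1],   # B->D
    [0, 0, 1],   # C->D
], dtype=float)
od_counts = np.array([30.0, 20.0, 10.0, 15.0, 5.0, 25.0])

# Forward map: each OD flow loads every link on its path.
link_loads = R.T @ od_counts          # loads on AB, BC, CD

# Inverse (tomography) direction: recovering OD flows from link loads alone
# is underdetermined (6 unknowns, 3 equations); least squares returns the
# minimum-norm solution, which reproduces the loads but not the true counts.
od_hat, *_ = np.linalg.lstsq(R.T, link_loads, rcond=None)
```

In the paper's setting the observed and latent quantities are reversed (tap-in/tap-out gives OD data while link and station loads are latent, and several paths may serve one OD pair), but the same incidence structure and the same ill-posedness motivate the likelihood-based model.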