

Infinite-Horizon Gaussian Processes

Neural Information Processing Systems

Gaussian processes provide a flexible framework for forecasting, removing noise, and interpreting long temporal datasets. State space modelling (Kalman filtering) enables these non-parametric models to be deployed on long datasets by reducing the complexity to linear in the number of data points. The complexity is still cubic in the state dimension m, which is an impediment to practical application. In certain special cases (Gaussian likelihood, regular spacing) the GP posterior reaches a steady state when the data set is long. We leverage this to formulate an inference scheme for GPs with general likelihoods, where inference is based on single-sweep EP (assumed density filtering). The infinite-horizon model tackles the cubic cost in the state dimension m, reducing it to O(m^2) per data point. The model is extended to online learning of hyperparameters. We show examples on large finite-length modelling problems, and demonstrate the method running in real time on a smartphone on a continuous data stream updated at 100 Hz.
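The steady-state idea can be sketched in a few lines: once the Riccati recursion for the error covariance converges, the Kalman gain is fixed, and each filtering step needs only matrix-vector products, i.e. O(m^2) in the state dimension. The model matrices below (a crude discretisation of a Matern-3/2-like state space) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def steady_state_gain(A, H, Q, R, iters=200):
    """Iterate the Riccati recursion to its fixed point and return the gain."""
    d = A.shape[0]
    P = np.eye(d)
    for _ in range(iters):
        P = A @ P @ A.T + Q          # predict covariance
        S = H @ P @ H.T + R          # innovation variance (scalar measurement)
        K = P @ H.T / S              # candidate steady-state gain
        P = P - K @ H @ P            # update covariance
    return K

def ihgp_filter(y, A, H, K):
    """Filter a stream with a fixed precomputed gain: O(m^2) per data point."""
    m = np.zeros((A.shape[0], 1))
    means = []
    for yt in y:
        m = A @ m                          # predict mean
        m = m + K * (yt - (H @ m).item())  # correct with the fixed gain
        means.append((H @ m).item())
    return np.array(means)

# Toy 2-dimensional state-space model (illustrative, Euler-discretised).
dt, lam = 0.1, 1.0
F = np.array([[0.0, 1.0], [-lam**2, -2 * lam]])
A = np.eye(2) + dt * F
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.1]])

K = steady_state_gain(A, H, Q, R)
y = np.sin(np.linspace(0, 6, 200)) + 0.3 * np.random.default_rng(0).standard_normal(200)
f = ihgp_filter(y, A, H, K)
```

The full method additionally handles non-Gaussian likelihoods via single-sweep EP; this sketch covers only the Gaussian, regularly-spaced case where the steady state applies exactly.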


Here are all the moments you didn't see on TV

BBC News

Oscars 2026: Here are all the moments you didn't see on TV

The 98th Academy Awards featured emotional speeches, comic relief and a bevy of backstage fun. While movie magic plays a role in the show itself (the ceremony, after all, is actually hosted at the Dolby Theatre in a shopping centre), there is a lot you don't see on TV. The Frankenstein production designer addressed the media with his Oscar statuette in one hand and what appeared to be a beer in the other, and Mr Nobody Against Putin filmmaker Pasha Talankin re-lived his Oscars win by re-reading the envelope that announced that his movie had won the award for documentary feature film. We saw some of the tightest security in recent years and witnessed the frenzied panic after one Oscar award became two when the best short action film category was announced as a historic tie. Here's what it's like on the scene during Hollywood's biggest night and everything you did not see on TV.


Trump accuses Iran of using AI to spread disinformation

The Japan Times

U.S. President Donald Trump speaks to reporters aboard Air Force One on a flight to Washington on Sunday. SAN FRANCISCO - U.S. President Donald Trump on Sunday accused Iran of using artificial intelligence as a "disinformation weapon" to misrepresent its wartime successes and support. "AI can be very dangerous, we have to be very careful with it," Trump said to reporters on Air Force One shortly after a post on his Truth Social platform in which he accused Western media outlets, without evidence, of "close coordination" with Iran to spread AI-generated fake news. The comments come amid renewed tensions between the Federal Communications Commission and broadcasters after Trump took aim at media coverage of the U.S. and Israel's war with Iran. FCC Chairman Brendan Carr on Saturday threatened to pull the licenses of broadcasters who did not "correct course" on their coverage.


Bounded-Loss Private Prediction Markets

Neural Information Processing Systems

Prior work has investigated variations of prediction markets that preserve participants' (differential) privacy, which formed the basis of useful mechanisms for purchasing data for machine learning objectives. Such markets required potentially unlimited financial subsidy, however, making them impractical. In this work, we design an adaptively-growing prediction market with a bounded financial subsidy, while achieving privacy, incentives to produce accurate predictions, and precision in the sense that market prices are not heavily impacted by the added privacy-preserving noise. We briefly discuss how our mechanism can extend to the data-purchasing setting, and its relationship to traditional learning algorithms.
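As background for the bounded-subsidy idea, the standard cost-function market maker in this literature is the logarithmic market scoring rule (LMSR), whose worst-case subsidy is bounded by b * log(number of outcomes); a privacy-preserving variant perturbs published prices with noise. The sketch below shows LMSR with an illustrative Laplace price perturbation, as a simplification; it is not the paper's adaptively-growing mechanism.

```python
import math
import random

def lmsr_cost(q, b):
    """Cost function C(q) = b * log(sum_i exp(q_i / b)), computed stably."""
    mx = max(qi / b for qi in q)
    return b * (mx + math.log(sum(math.exp(qi / b - mx) for qi in q)))

def lmsr_price(q, b, i):
    """Instantaneous price of outcome i: exp(q_i/b) / sum_j exp(q_j/b)."""
    mx = max(qj / b for qj in q)
    z = sum(math.exp(qj / b - mx) for qj in q)
    return math.exp(q[i] / b - mx) / z

def trade_cost(q, delta, b):
    """A trader pays C(q + delta) - C(q) to move the market by delta."""
    q_new = [qi + di for qi, di in zip(q, delta)]
    return lmsr_cost(q_new, b) - lmsr_cost(q, b), q_new

def noisy_price(q, b, i, eps, rng):
    """Publish a Laplace-perturbed price (illustrative privacy mechanism)."""
    u = rng.random() - 0.5
    noise = -(1.0 / eps) * math.copysign(math.log(1 - 2 * abs(u)), u)
    return min(1.0, max(0.0, lmsr_price(q, b, i) + noise))

b, q = 10.0, [0.0, 0.0]
pay, q = trade_cost(q, [5.0, 0.0], b)      # buy 5 shares of outcome 0
worst_case_subsidy = b * math.log(len(q))  # bounded loss of the market maker
```

The paper's contribution is to keep this kind of loss bound while the market grows adaptively and noise is added for differential privacy, without the noise swamping prices.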


Relating Leverage Scores and Density using Regularized Christoffel Functions

Neural Information Processing Systems

Statistical leverage scores emerged as a fundamental tool for matrix sketching and column sampling, with applications to low rank approximation, regression, random feature learning and quadrature. Yet, the very nature of this quantity is barely understood. Borrowing ideas from the orthogonal polynomial literature, we introduce the regularized Christoffel function associated with a positive definite kernel. This uncovers a variational formulation for leverage scores for kernel methods and allows us to elucidate their relationship with the chosen kernel as well as the population density. Our main result quantitatively describes a decreasing relation between leverage score and population density for a broad class of kernels on Euclidean spaces. Numerical simulations support our findings.
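Concretely, for a kernel matrix K with regularization lam, the i-th regularized (ridge) leverage score is the i-th diagonal entry of K (K + n*lam*I)^{-1}. The sketch below, with illustrative data and hyperparameters, shows the decreasing relation the paper studies: points in a sparsely populated region carry higher leverage than points in a dense cluster.

```python
import numpy as np

def rbf_kernel(X, bandwidth=0.5):
    """Gaussian (RBF) kernel matrix for a point set X of shape (n, d)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * bandwidth**2))

def ridge_leverage_scores(K, lam):
    """Diagonal of K (K + n*lam*I)^{-1}: regularized leverage scores."""
    n = K.shape[0]
    return np.diag(K @ np.linalg.solve(K + n * lam * np.eye(n), np.eye(n)))

rng = np.random.default_rng(0)
# 200 points in a dense cluster plus 20 points in a sparse, distant region.
X = np.concatenate([rng.normal(0, 0.2, (200, 1)), rng.normal(4, 0.2, (20, 1))])
scores = ridge_leverage_scores(rbf_kernel(X), lam=1e-3)

# Average leverage in each region: sparse points dominate.
dense_mean, sparse_mean = scores[:200].mean(), scores[200:].mean()
```

The regularized Christoffel function of the paper is, roughly, the reciprocal quantity in the population limit; this finite-sample computation is the standard estimator it is connected to.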


Constant Regret, Generalized Mixability, and Mirror Descent

Neural Information Processing Systems

We consider the setting of prediction with expert advice; a learner makes predictions by aggregating those of a group of experts. Under this setting, and for the right choice of loss function and ``mixing'' algorithm, it is possible for the learner to achieve a constant regret regardless of the number of prediction rounds. For example, a constant regret can be achieved for \emph{mixable} losses using the \emph{aggregating algorithm}. The \emph{Generalized Aggregating Algorithm} (GAA) is a name for a family of algorithms parameterized by convex functions on simplices (entropies), which reduce to the aggregating algorithm when using the \emph{Shannon entropy} $\operatorname{S}$. For a given entropy $\Phi$, losses for which a constant regret is possible using the \textsc{GAA} are called $\Phi$-mixable. Which losses are $\Phi$-mixable was previously left as an open question. We fully characterize $\Phi$-mixability and answer other open questions posed by \cite{Reid2015}. We show that the Shannon entropy $\operatorname{S}$ is fundamental in nature when it comes to mixability; any $\Phi$-mixable loss is necessarily $\operatorname{S}$-mixable, and the lowest worst-case regret of the \textsc{GAA} is achieved using the Shannon entropy. Finally, by leveraging the connection between the \emph{mirror descent algorithm} and the update step of the GAA, we suggest a new \emph{adaptive} generalized aggregating algorithm and analyze its performance in terms of the regret bound.
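For the Shannon-entropy case the GAA reduces to the classical aggregating algorithm, which for the 1-mixable log loss is just exponential weights with a mixture prediction: the learner's cumulative loss never exceeds the best expert's by more than log(N), a constant independent of the number of rounds. The experts and outcome sequence below are illustrative.

```python
import math

def aggregate_predict(weights, expert_probs):
    """Mixture prediction: normalized weighted average of expert probabilities."""
    z = sum(weights)
    return sum(w * p for w, p in zip(weights, expert_probs)) / z

def log_loss(p, y):
    return -math.log(p if y == 1 else 1.0 - p)

def run_aa(expert_probs_per_round, outcomes, eta=1.0):
    """Aggregating algorithm for log loss (eta = 1: the mixable regime)."""
    n_experts = len(expert_probs_per_round[0])
    weights = [1.0] * n_experts
    total, expert_totals = 0.0, [0.0] * n_experts
    for probs, y in zip(expert_probs_per_round, outcomes):
        total += log_loss(aggregate_predict(weights, probs), y)
        for i, pi in enumerate(probs):
            li = log_loss(pi, y)
            expert_totals[i] += li
            weights[i] *= math.exp(-eta * li)   # exponential-weights update
    return total, expert_totals

# Two experts (one well-calibrated, one poor) on an alternating outcome stream.
rounds = [([0.9, 0.4], 1), ([0.2, 0.6], 0)] * 50
learner_loss, expert_losses = run_aa([r[0] for r in rounds], [r[1] for r in rounds])
regret = learner_loss - min(expert_losses)   # at most log(2), regardless of rounds
```

The paper's result is that this Shannon-entropy instance is canonical: any loss that is Phi-mixable for some entropy Phi is already S-mixable, and the Shannon entropy attains the lowest worst-case regret among GAA instances.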


Fire erupts at Dubai airport following drone attack

Al Jazeera

Footage shows a fire burning near Dubai International Airport after a drone ignited a fuel tank, according to authorities in the UAE. Civil defence crews say the blaze is under control.


Memory Replay GANs: Learning to Generate New Categories without Forgetting

Neural Information Processing Systems

In this paper we consider the case of generative models. In particular, we investigate generative adversarial networks (GANs) in the task of learning new categories in a sequential fashion. We first show that sequential fine tuning renders the network unable to properly generate images from previous categories (i.e., catastrophic forgetting).
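The memory-replay remedy can be illustrated without a GAN: before fitting a new category, sample the previous categories from a frozen copy of the current model and mix those replayed samples into training, so earlier categories are not overwritten. The sketch below uses a toy per-category Gaussian as a stand-in generator; everything here is an illustrative assumption, with the paper's method replacing the Gaussian by a generator/discriminator pair.

```python
import random
import statistics

class ToyGenerator:
    """Stand-in generative model: one Gaussian per category."""

    def __init__(self):
        self.params = {}  # category -> (mean, std)

    def sample(self, cat, n, rng):
        mu, sd = self.params[cat]
        return [rng.gauss(mu, sd) for _ in range(n)]

    def fit(self, data_by_cat):
        for cat, xs in data_by_cat.items():
            self.params[cat] = (statistics.fmean(xs), statistics.stdev(xs))

def train_sequentially_with_replay(task_stream, n_replay=500, seed=0):
    """Learn categories one at a time, replaying old ones from the model."""
    rng = random.Random(seed)
    model = ToyGenerator()
    for cat, real_data in task_stream:
        # Memory replay: regenerate samples for every category seen so far.
        replay = {c: model.sample(c, n_replay, rng) for c in model.params}
        replay[cat] = real_data          # the new category uses real data
        model.fit(replay)                # joint fit on real + replayed data
    return model

rng = random.Random(1)
tasks = [(0, [rng.gauss(0, 1) for _ in range(500)]),
         (1, [rng.gauss(5, 1) for _ in range(500)])]
model = train_sequentially_with_replay(tasks)
```

Without the replay dictionary, fitting only the new category would discard the parameters of category 0, which is the sequential-fine-tuning failure mode the abstract describes.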


GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking

Neural Information Processing Systems

Model compression is essential for serving large deep neural nets on devices with limited resources or in applications that require real-time responses. For advanced NLP problems, a neural language model usually consists of recurrent layers (e.g., using LSTM cells), an embedding matrix for representing input tokens, and a softmax layer for generating output tokens. For problems with a very large vocabulary size, the embedding and the softmax matrices can account for more than half of the model size. For instance, the bigLSTM model achieves state-of-the-art performance on the One-Billion-Word (OBW) dataset with around 800k vocabulary, and its word embedding and softmax matrices use more than 6 GB of space and are responsible for over 90\% of the model parameters. In this paper, we propose GroupReduce, a novel compression method for neural language models, based on vocabulary-partition (block) based low-rank matrix approximation and the inherent frequency distribution of tokens (the power-law distribution of words). We start by grouping words into $c$ blocks based on their frequency, and then refine the clustering iteratively by constructing a weighted low-rank approximation for each block, where the weights are based on the frequencies of the words in the block. The experimental results show our method can significantly outperform traditional compression methods such as low-rank approximation and pruning. On the OBW dataset, our method achieves a 6.6x compression rate for the embedding and softmax matrices, and when combined with quantization, it achieves a 26x compression rate without losing prediction accuracy.


Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance

Neural Information Processing Systems

Large amounts of labeled data are typically required to train deep learning models. For many real-world problems, however, acquiring additional data can be expensive or even impossible. We present semi-supervised deep kernel learning (SSDKL), a semi-supervised regression model based on minimizing predictive variance in the posterior regularization framework. SSDKL combines the hierarchical representation learning of neural networks with the probabilistic modeling capabilities of Gaussian processes. By leveraging unlabeled data, we show improvements on a diverse set of real-world regression tasks over supervised deep kernel learning and semi-supervised methods such as VAT and mean teacher adapted for regression.
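The variance-minimization objective can be written down for a plain GP (SSDKL would put the kernel on top of a neural feature extractor): the loss is the negative log marginal likelihood on labeled points plus alpha times the mean posterior predictive variance on unlabeled points. All hyperparameters and data below are illustrative assumptions.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """RBF kernel between point sets A (n, d) and B (m, d)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * ls**2))

def ssdkl_objective(Xl, yl, Xu, ls=1.0, noise=0.1, alpha=1.0):
    """NLL on labeled data + alpha * mean predictive variance on unlabeled data."""
    n = len(Xl)
    K = rbf(Xl, Xl, ls) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    a = np.linalg.solve(L.T, np.linalg.solve(L, yl))
    nll = 0.5 * yl @ a + np.log(np.diag(L)).sum() + 0.5 * n * np.log(2 * np.pi)
    Ks = rbf(Xl, Xu, ls)                               # cross-covariance (n, u)
    v = np.linalg.solve(L, Ks)
    var = rbf(Xu, Xu, ls).diagonal() - (v**2).sum(0)   # posterior variance
    return nll + alpha * var.mean(), var

rng = np.random.default_rng(0)
Xl = rng.uniform(-2, 2, (20, 1))
yl = np.sin(Xl[:, 0]) + 0.1 * rng.standard_normal(20)
Xu = rng.uniform(-2, 2, (100, 1))
loss, var = ssdkl_objective(Xl, yl, Xu)
```

Minimizing this loss over kernel (and, in SSDKL, network) parameters pushes the model toward representations under which it is confident on the unlabeled inputs, which is how the unlabeled data enters training.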