
Computationally Efficient Informative Nonignorable Matrix Completion: A Row- and Column-Wise Matrix U-Statistic Pseudo-Likelihood Approach

arXiv.org Machine Learning

In this study, we establish a unified framework for the high-dimensional matrix completion problem under flexible nonignorable missing mechanisms. Although matrix completion has attracted much attention over the years, few existing works consider nonignorable missingness. To address this problem, we derive a row- and column-wise matrix U-statistic type loss function, with the nuclear norm for regularization. A singular value proximal gradient algorithm is developed to solve the proposed optimization problem. We prove a non-asymptotic upper bound on the Frobenius norm of the estimation error and demonstrate the performance of our method through numerical simulations and real data analysis.
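The core of any singular value proximal gradient method is the proximal operator of the nuclear norm, which soft-thresholds singular values. A minimal sketch, assuming a generic observed-entry squared loss rather than the authors' U-statistic pseudo-likelihood (function names and parameters are illustrative):

```python
import numpy as np

def svt(M, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def prox_gradient_complete(X_obs, mask, tau=0.01, step=1.0, n_iter=300):
    """Minimise 0.5*||mask*(M - X_obs)||_F^2 + tau*||M||_* by proximal gradient.
    X_obs holds observed entries (zeros elsewhere); mask is the 0/1 pattern."""
    M = np.zeros_like(X_obs)
    for _ in range(n_iter):
        grad = mask * (M - X_obs)          # gradient of the smooth data-fit term
        M = svt(M - step * grad, step * tau)
    return M
```

In the paper's setting the smooth term would be replaced by the row- and column-wise pseudo-likelihood, but the singular value thresholding step is unchanged.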


Adaptive sparse variational approximations for Gaussian process regression

arXiv.org Machine Learning

Department of Decision Sciences, Bocconi Institute for Data Science and Analytics, Bocconi University, Milan

Abstract: Accurate tuning of hyperparameters is crucial to ensure that models can generalise effectively across different settings. We construct a variational approximation to a hierarchical Bayes procedure, and derive upper bounds for the contraction rate of the variational posterior in an abstract setting. The theory is applied to various Gaussian process priors and variational classes, resulting in minimax optimal rates. Our theoretical results are accompanied by numerical analysis on both synthetic and real-world data sets.

Keywords: variational inference, Bayesian model selection, Gaussian processes, nonparametric regression, adaptation, posterior contraction rates

1 Introduction

A core challenge in Bayesian statistics is scalability, i.e. the computation of the posterior for large sample sizes. Variational Bayes approximation is a standard approach to speed up inference. Variational posteriors are random probability measures that minimise the Kullback-Leibler divergence between a suitable class of distributions and the otherwise hard-to-compute posterior. Typically, the variational class of distributions over which the optimisation takes place does not contain the original posterior, hence the variational procedure can be viewed as a projection onto this class. The projected variational distribution then approximates the posterior. During the approximation procedure one inevitably loses information, and hence it is important to characterise the accuracy of the approach. Despite the wide use of variational approximations, their theoretical underpinning started to emerge only recently; see for instance Alquier and Ridgway (2020); Yang et al. (2020); Zhang and Gao (2020a); Ray and Szabó (2022). In a Bayesian procedure, the choice of prior reflects the presumed properties of the unknown parameter.
In contrast to regular parametric models, where by the Bernstein-von Mises theorem the posterior is asymptotically normal and largely insensitive to the prior, in nonparametric settings the prior plays a crucial role in the asymptotic behaviour of the posterior. In fact, the large-sample behaviour of the posterior typically depends intricately on the choice of prior hyperparameters, so it is vital that these are tuned correctly. The two classical approaches are hierarchical and empirical Bayes methods.
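The KL-projection view of variational inference described above can be illustrated with a toy example: when the target "posterior" is itself a Gaussian N(mu, sigma^2) and the variational family is Gaussian N(m, s^2), the KL divergence has a closed form and gradient descent on it recovers the target exactly. All names below are illustrative, not from the paper:

```python
import numpy as np

def kl_gauss(m, s, mu, sigma):
    """KL( N(m, s^2) || N(mu, sigma^2) ) in closed form."""
    return np.log(sigma / s) + (s**2 + (m - mu)**2) / (2 * sigma**2) - 0.5

def fit_variational(mu, sigma, lr=0.05, n_iter=2000):
    """Minimise the KL over a Gaussian variational family by gradient descent,
    parameterising the scale as log s to keep it positive."""
    m, log_s = 0.0, 0.0                       # initialise q = N(0, 1)
    for _ in range(n_iter):
        s = np.exp(log_s)
        grad_m = (m - mu) / sigma**2          # dKL/dm
        grad_log_s = s**2 / sigma**2 - 1.0    # dKL/d(log s)
        m -= lr * grad_m
        log_s -= lr * grad_log_s
    return m, np.exp(log_s)
```

In realistic settings the posterior lies outside the variational class, so the minimiser is a genuine projection and some information is lost; the paper's contribution is to quantify that loss via contraction rates for Gaussian process priors.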


A New Approach to Controlling Linear Dynamical Systems

arXiv.org Machine Learning

We propose a new method for controlling linear dynamical systems under adversarial disturbances and cost functions. Our algorithm achieves a running time that scales polylogarithmically with the inverse of the stability margin, improving upon prior methods with polynomial dependence while maintaining the same regret guarantees. The technique, which may be of independent interest, is based on a novel convex relaxation that approximates linear control policies using spectral filters constructed from the eigenvectors of a specific Hankel matrix.
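Spectral filters of this kind can be computed directly as the top eigenvectors of a fixed Hankel matrix. The sketch below uses the entry formula common in the spectral-filtering literature (Z[i,j] = 2/((i+j)^3 - (i+j)), 1-indexed); the paper's exact matrix may differ:

```python
import numpy as np

def spectral_filters(T, k):
    """Return the top-k eigenvalues and eigenvectors of a T x T Hankel matrix
    whose eigenvectors serve as filters applied to the input history.
    Entry formula is the standard one from the spectral-filtering literature;
    treat it as an assumption, not this paper's exact construction."""
    idx = np.arange(1, T + 1)
    S = idx[:, None] + idx[None, :]
    Z = 2.0 / (S**3 - S)
    vals, vecs = np.linalg.eigh(Z)            # eigenvalues in ascending order
    return vals[::-1][:k], vecs[:, ::-1][:, :k]
```

The eigenvalues of such Hankel matrices decay very quickly, which is why a small, history-length-independent number of filters suffices and why running time can avoid polynomial dependence on the inverse stability margin.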


Operator Learning: A Statistical Perspective

arXiv.org Machine Learning

Operator learning has emerged as a powerful tool in scientific computing for approximating mappings between infinite-dimensional function spaces. A primary application of operator learning is the development of surrogate models for the solution operators of partial differential equations (PDEs). These methods can also be used to develop black-box simulators to model system behavior from experimental data, even without a known mathematical model. In this article, we begin by formalizing operator learning as a function-to-function regression problem and review some recent developments in the field. We also discuss PDE-specific operator learning, outlining strategies for incorporating physical and mathematical constraints into architecture design and training processes. Finally, we end by highlighting key future directions such as active data collection and the development of rigorous uncertainty quantification frameworks.
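The framing of operator learning as function-to-function regression can be made concrete with the simplest possible instance: learning a discretised linear operator (here a stand-in antiderivative map, not any operator from the article) from input-output function pairs by ridge regression. All data and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_grid, n_samples = 50, 200
x = np.linspace(0, 1, n_grid)

def true_operator(u):
    """Target solution operator: the antiderivative map u -> v,
    v(t) = integral_0^t u(s) ds, discretised by a cumulative sum."""
    return np.cumsum(u) / n_grid

# Training data: random sinusoidal input functions and their images.
U = np.array([np.sin(2 * np.pi * rng.uniform(0.5, 3) * x + rng.uniform(0, 2 * np.pi))
              for _ in range(n_samples)])
V = np.array([true_operator(u) for u in U])

# Function-to-function regression: fit a linear map between discretisations
# by ridge regression -- the crudest "surrogate model" for the operator.
lam = 1e-6
A = np.linalg.solve(U.T @ U + lam * np.eye(n_grid), U.T @ V)

u_test = np.cos(2 * np.pi * x)
v_pred = u_test @ A
v_true = true_operator(u_test)
```

Neural operator architectures replace the linear map `A` with a nonlinear, discretisation-consistent model, but the statistical problem (regression between function spaces from finitely many pairs) is exactly the one set up here.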


ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models

arXiv.org Artificial Intelligence

Recent studies have shown that Large Language Models (LLMs) augmented with chain-of-thought (CoT) reasoning demonstrate impressive problem-solving abilities. However, in this work, we identify a recurring issue where these models occasionally generate overly short reasoning, leading to degraded performance on even simple mathematical problems. Specifically, we investigate how reasoning length is embedded in the hidden representations of reasoning models and its impact on accuracy. Our analysis reveals that reasoning length is governed by a linear direction in the representation space, allowing us to induce overly short reasoning by steering the model along this direction. Building on this insight, we introduce ThinkEdit, a simple yet effective weight-editing approach to mitigate the issue of overly short reasoning. We first identify a small subset of attention heads (approximately 2%) that predominantly drive short reasoning behavior. We then edit the output projection weights of these heads to suppress the short reasoning direction. With changes to only 0.1% of the model's parameters, ThinkEdit effectively reduces overly short reasoning and yields notable accuracy gains for short reasoning outputs (+5.44%), along with an overall improvement across multiple math benchmarks (+2.43%). Our findings provide new mechanistic insights into how reasoning length is controlled within LLMs and highlight the potential of fine-grained model interventions to improve reasoning quality. Our code is available at https://github.com/Trustworthy-ML-Lab/ThinkEdit
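The weight edit described, suppressing a linear "short reasoning" direction in the output projection of selected attention heads, is a direction-ablation in weight space. A generic linear-algebra sketch (the direction, shapes, and head selection here are hypothetical, not the paper's fitted values):

```python
import numpy as np

def remove_direction(W_o, d):
    """Project a direction out of an attention head's output projection.
    W_o maps head outputs into the residual stream (rows indexed by model
    dims), so left-multiplying by (I - d d^T) means the head can no longer
    write along d. A generic sketch of direction-ablation weight editing,
    not the paper's exact procedure."""
    d = d / np.linalg.norm(d)
    return W_o - np.outer(d, d) @ W_o       # (I - d d^T) @ W_o

rng = np.random.default_rng(0)
d_model, d_head = 64, 8
W_o = rng.normal(size=(d_model, d_head))    # per-head output projection (toy)
short_dir = rng.normal(size=d_model)        # hypothetical "short reasoning" dir

W_edit = remove_direction(W_o, short_dir)
```

Because the edit touches only the output projections of the ~2% of heads identified as driving short reasoning, the total fraction of parameters changed stays tiny, consistent with the 0.1% figure reported.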


Random Normed k-Means: A Paradigm-Shift in Clustering within Probabilistic Metric Spaces

arXiv.org Machine Learning

Existing clustering approaches remain largely constrained by traditional distance metrics, limiting their effectiveness in handling random data. In this work, we introduce the first k-means variant in the literature that operates within a probabilistic metric space, replacing conventional distance measures with a well-defined distance distribution function. This approach enables more flexible and robust clustering in both deterministic and random datasets, establishing a new foundation for clustering in stochastic environments. By adopting a probabilistic perspective, our method not only introduces a fresh paradigm but also establishes a rigorous theoretical framework that is expected to serve as a key reference for future clustering research involving random data. Extensive experiments on diverse real and synthetic datasets assess our model's effectiveness using widely recognized evaluation metrics, including Silhouette, Davies-Bouldin, Calinski-Harabasz, the adjusted Rand index, and distortion. Comparative analyses against established methods such as k-means++, fuzzy c-means, and kernel probabilistic k-means demonstrate the superior performance of our proposed random normed k-means (RNKM) algorithm. Notably, RNKM exhibits a remarkable ability to identify nonlinearly separable structures, making it highly effective in complex clustering scenarios. These findings position RNKM as a significant advance in clustering research, offering a powerful alternative to traditional techniques while addressing a long-standing gap in the literature. By bridging probabilistic metrics with clustering, this study provides a foundational reference for future developments and opens new avenues for advanced data analysis in dynamic, data-driven applications.
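The idea of clustering random data points rather than fixed vectors can be sketched with a much simpler surrogate than RNKM's distance distribution function: treat each point as a Gaussian X_i ~ N(mu_i, var_i * I) and assign by expected squared distance, E||X_i - c||^2 = ||mu_i - c||^2 + d * var_i. This toy (not the paper's probabilistic-metric construction) shows where randomness enters a k-means loop:

```python
import numpy as np

def random_kmeans(mus, vars_, k, n_iter=50, seed=0):
    """k-means over random points X_i ~ N(mu_i, var_i * I), assigning each
    point by its expected squared distance to each center. A toy stand-in:
    RNKM works with a full distance distribution function, which is richer
    than this single expectation. Note that for isotropic noise the variance
    term shifts all distances equally, so assignments here reduce to
    standard k-means on the means, while the expected distortion does not."""
    rng = np.random.default_rng(seed)
    d = mus.shape[1]
    centers = mus[rng.choice(len(mus), k, replace=False)].copy()
    labels = np.zeros(len(mus), dtype=int)
    for _ in range(n_iter):
        # expected squared distance of every random point to every center
        d2 = ((mus[:, None, :] - centers[None, :, :])**2).sum(-1) \
             + d * vars_[:, None]
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters untouched
                centers[j] = mus[labels == j].mean(axis=0)
    return centers, labels
```

A genuinely probabilistic metric, as in RNKM, compares the full distribution of distances rather than a single moment, which is what allows the algorithm to separate clusters that overlap in expectation.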


'MythBusters' star Adam Savage explores longevity and life hacks: 'There's no magic secret'

FOX News

Tested's Adam Savage paired up with Medtronic to offer his commentary on what can contribute to a longer lifespan, including possible differences between men's and women's health. Former "MythBusters" star Adam Savage is exploring the science of longevity, asking how lifestyle choices, stress and even sleep affect how long we live. Savage, now a YouTube creator and head of the channel Tested, has partnered with health technology company Medtronic to engage in discussions about longevity. While not a researcher himself, he has taken a deep dive into scientific insights from experts and reflected on his own experiences. "Longevity has always been a fascination for me," Savage told Fox News Digital in an exclusive interview.


PICTURED: New images show the gruesome effect microplastics have on your body

Daily Mail - Science & tech

Gruesome pictures have revealed the shocking impact microplastics could be having on your appearance -- and making you look decrepit and older. Microplastics are now in almost everything we touch, from food and clothing to water, kitchenware and household items - and every American is now thought to have microplastics in their bodies. Now, a UK recycling company has tried to capture the impact these toxins could be having on the skin. In a release, they used AI to estimate how long-term exposure to microplastics at low, medium and high levels could impact a man and a woman's appearance. Mark Hall, a plastic waste expert at the business behind the report, said: 'It's clear to see there are many worrying signs of how this pollution might affect us.


M3GAN 2.0 trailer: Now there's two of them!

Mashable

In 2023, M3GAN wowed audiences with campy scares, viral dance moves, and an unforgettable diva doll. Now, in the upcoming sequel M3GAN 2.0, we don't just get one killer AI, we get two! M3GAN 2.0's trailer introduces Amelia (Ivanna Sakhno), a military-grade weapon created using the technology that made -- nay, gifted -- us M3GAN (Amie Donald, voiced by Jenna Davis). As her self-awareness increases, her desire to kill multiplies until she's a threat to all of humanity. Now, she has M3GAN's creator Gemma (Allison Williams) and niece (Violet McGraw) in her sights.


How Bill Gates, the Altair 8800 and BASIC propelled me into the PC revolution

ZDNet

Have I told you the story about Bill Gates and me in those early days of personal computing? To be clear: Bill Gates is older than I am. In 1975, as Bill was leaving Harvard to start Microsoft, I had just skipped my last year of high school and started college. I was the youngest student in engineering school that first year -- the same year Bill and I were using the same computer technology: the Altair 8800 and the Digital Equipment PDP-10. My high school computing experience -- like Bill's -- was formative.