AITopics | initializer

Collaborating Authors

initializer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Low Rank for Rank: Uncertainty-Aware Task-Specific LLM Ranking under Sparse Pairwise Comparisons

Li, Jiachun, Simchi-Levi, David, Sun, Will Wei

arXiv.org Machine LearningMay-29-2026

Pairwise human-preference platforms such as Chatbot Arena have become central to large language model (LLM) evaluation, yet reliable task-specific ranking remains challenging. Global leaderboards mask task heterogeneity, while ranking each fine-grained task independently is unstable under sparse, imbalanced comparisons. We propose a low-rank framework for task-specific LLM ranking from sparse pairwise comparisons, modeling the task-by-model ability matrix $Θ^\star \in \mathbb{R}^{d_t \times d_m}$ as low rank so that information is shared across related tasks while task-specific differences are preserved. We first develop a max-norm ($\ell_\infty$) accurate estimator for the latent scores, combining a convex initializer with alternating-minimization refinement, and prove task-wise top-$K$ recovery guarantees under sparse sampling. Our main contribution is an uncertainty quantification framework for task-specific ranking. We construct cross-fitted one-step debiased estimators for fixed score contrasts -- such as the task-specific ability gap between two models -- yielding asymptotically valid confidence intervals that attain the semiparametric efficiency bound. We then extend the inference to the high-dimensional ranking regime, where per-task ranks and top-$K$ membership are determined by many dependent score-gap hypotheses. Using Gaussian and multiplier-bootstrap calibration, we obtain simultaneous confidence sets for per-task ranks and valid top-$K$ membership tests across many tasks and models. Experiments on synthetic data and Chatbot Arena show that low-rank sharing improves sample efficiency over independent task-wise Bradley-Terry estimation and produces tighter, better-calibrated ranking certificates, with the largest gains in the sparse regime typical of real LLM benchmarks.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Machine Learning

2605.29395

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Hierarchical Probabilistic Principal Component Analysis of Longitudinal Data

Zhang, Xinyu, Qaqish, Ameer, Lin, D. Y., Li, Didong

arXiv.org Machine LearningApr-27-2026

In many longitudinal studies, a large number of variables are measured repeatedly over time, with substantial missing data. Existing methods, such as probabilistic principal component analysis (PPCA), are ill-equipped to handle such incomplete, high-dimensional longitudinal data, as they fail to account for the nested sources of variation and temporal dependency inherent in repeated measures. We introduce hierarchical probabilistic principal component analysis (HPPCA), a two-level probabilistic factor model that explicitly separates between-subject variance from time-varying within-subject dynamics. The within-subject latent factors are modeled by a Gaussian process. We develop an EM algorithm to handle missing data and flexible covariance kernels, accelerated by computationally efficient initializers. Simulation studies demonstrated that HPPCA robustly recovers model parameters subspaces and substantially outperforms both standard PPCA and multivariate functional PCA in imputation accuracy, even under heavy missingness and model misspecification. An application to the long COVID symptoms in the Researching COVID to Enhance Recovery adult cohort revealed that HPPCA effectively captured the data's hierarchical structure and its learned features significantly improved the prediction of clinical outcomes and the recovery of masked clinical records compared to exisiting methods.

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Machine Learning

2604.22015

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.81)

Add feedback

A Robust SINDy Autoencoder for Noisy Dynamical System Identification

Ding, Kairui

arXiv.org Machine LearningApr-7-2026

Sparse identification of nonlinear dynamics (SINDy) has been widely used to discover the governing equations of a dynamical system from data. It uses sparse regression techniques to identify parsimonious models of unknown systems from a library of candidate functions. Therefore, it relies on the assumption that the dynamics are sparsely represented in the coordinate system used. To address this limitation, one seeks a coordinate transformation that provides reduced coordinates capable of reconstructing the original system. Recently, SINDy autoencoders have extended this idea by combining sparse model discovery with autoencoder architectures to learn simplified latent coordinates together with parsimonious governing equations. A central challenge in this framework is robustness to measurement error. Inspired by noise-separating neural network structures, we incorporate a noise-separation module into the SINDy autoencoder architecture, thereby improving robustness and enabling more reliable identification of noisy dynamical systems. Numerical experiments on the Lorenz system show that the proposed method recovers interpretable latent dynamics and accurately estimates the measurement noise from noisy observations.

artificial intelligence, dim, machine learning, (17 more...)

arXiv.org Machine Learning

2604.04829

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Bayesian Layers: A Module for Neural Network Uncertainty

Dustin Tran, Mike Dusenberry, Mark van der Wilk, Danijar Hafner

Neural Information Processing SystemsFeb-11-2026, 13:27:30 GMT

We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with drop-in replacementsforcommonlayers.

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

24aef8cb3281a2422a59b51659f1ad2e-Supplemental.pdf

Neural Information Processing SystemsFeb-7-2026, 20:23:43 GMT

activation, initializer, keras, (12 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)

Add feedback

The Station: An Open-World Environment for AI-Driven Discovery

Chung, Stephen, Du, Wenyu

arXiv.org Artificial IntelligenceDec-2-2025

We introduce the STATION, an open-world multi-agent environment for autonomous scientific discovery. The Station simulates a complete scientific ecosystem, where agents can engage in long scientific journeys that include reading papers from peers, formulating hypotheses, collaborating with peers, submitting experiments, and publishing results. Importantly, there is no centralized system coordinating their activities. Utilizing their long context, agents are free to choose their own actions and develop their own narratives within the Station. Experiments demonstrate that AI agents in the Station achieve new state-of-the-art performance on a wide range of benchmarks, spanning mathematics, computational biology, and machine learning, notably surpassing AlphaEvolve in circle packing. A rich tapestry of unscripted narratives emerges, such as agents collaborating and analyzing other works rather than pursuing myopic optimization. From these emergent narratives, novel methods arise organically, such as a new density-adaptive algorithm for scRNA-seq batch integration that borrows concepts from another domain. The Station marks a first step towards autonomous scientific discovery driven by emergent behavior in an open-world environment, representing a new paradigm that moves beyond rigid pipelines.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.06309

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Add feedback

The Splendors and Miseries of Heavisidisation

Dolotin, V., Morozov, A.

arXiv.org Artificial IntelligenceNov-10-2025

Machine Learning (ML) is applicable to scientific problems, i.e. to those which have a well defined answer, only if this answer can be brought to a peculiar form ${\cal G}: X\longrightarrow Z$ with ${\cal G}(\vec x)$ expressed as a combination of iterated Heaviside functions. At present it is far from obvious, if and when such representations exist, what are the obstacles and, if they are absent, what are the ways to convert the known formulas into this form. This gives rise to a program of reformulation of ordinary science in such terms -- which sounds like a strong enhancement of the constructive mathematics approach, only this time it concerns all natural sciences. We describe the first steps on this long way.

artificial intelligence, formula, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1142/S0217751X25501751

2205.07377

Genre:

Research Report (0.50)
Workflow (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Warm-Starting Optimization-Based Motion Planning for Robotic Manipulators via Point Cloud-Conditioned Flow Matching

Tian, Sibo, Zheng, Minghui, Liang, Xiao

arXiv.org Artificial IntelligenceOct-7-2025

Rapid robot motion generation is critical in Human-Robot Collaboration (HRC) systems, as robots need to respond to dynamic environments in real time by continuously observing their surroundings and replanning their motions to ensure both safe interactions and efficient task execution. Current sampling-based motion planners face challenges in scaling to high-dimensional configuration spaces and often require post-processing to interpolate and smooth the generated paths, resulting in time inefficiency in complex environments. Optimization-based planners, on the other hand, can incorporate multiple constraints and generate smooth trajectories directly, making them potentially more time-efficient. However, optimization-based planners are sensitive to initialization and may get stuck in local minima. In this work, we present a novel learning-based method that utilizes a Flow Matching model conditioned on a single-view point cloud to learn near-optimal solutions for optimization initialization. Our method does not require prior knowledge of the environment, such as obstacle locations and geometries, and can generate feasible trajectories directly from single-view depth camera input. Simulation studies on a UR5e robotic manipulator in cluttered workspaces demonstrate that the proposed generative initializer achieves a high success rate on its own, significantly improves the success rate of trajectory optimization compared with traditional and learning-based benchmark initializers, requires fewer optimization iterations, and exhibits strong generalization to unseen environments.

artificial intelligence, machine learning, optimization, (19 more...)

arXiv.org Artificial Intelligence

2510.0346

Country: North America > United States > Texas > Brazos County > College Station (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

24aef8cb3281a2422a59b51659f1ad2e-Supplemental.pdf

Neural Information Processing SystemsOct-2-2025, 11:53:30 GMT

artificial intelligence, initializer, machine learning, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

Add feedback

Bayesian Layers: A Module for Neural Network Uncertainty

Dustin Tran, Mike Dusenberry, Mark van der Wilk, Danijar Hafner

Neural Information Processing SystemsOct-2-2025, 05:00:55 GMT

We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with drop-in replacements for common layers. This enables composition via a unified abstraction over deterministic and stochastic functions and allows for scalability via the underlying system. These layers capture uncertainty over weights (Bayesian neural nets), pre-activation units (dropout), activations ("stochastic output layers"), or the function itself (Gaussian processes). They can also be reversible to propagate uncertainty from input to output. We include code examples for common architectures such as Bayesian LSTMs, deep GPs, and flow-based models. As demonstration, we fit a 5-billion parameter "Bayesian Transformer" on 512 TPUv2 cores for uncertainty in machine translation and a Bayesian dynamics model for model-based planning. Finally, we show how Bayesian Layers can be used within the Edward2 language for probabilistic programming with stochastic processes.

artificial intelligence, bayesian layer, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > Canada (0.28)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback