Theory of learning of high-dimensional controlled non-linear dynamical systems (I): models and methods

arXiv.org Machine Learning

Neural ordinary differential equations (neural ODEs) have rapidly gained prominence as a powerful and unifying framework for conceptualizing artificial neural networks, elegantly connecting the continuous-time modeling of dynamical systems with the discrete, data-driven paradigm of modern deep learning. Beyond their practical advantages, they offer fresh theoretical insights into the training and generalization properties of neural networks. The distinctive feature of this framework is its dual dynamical nature: inference dynamics, which govern the ODE evolution during forward computation, and training dynamics, which control the optimization of model parameters. This makes neural ODEs a particularly well-suited theoretical framework for studying a large variety of settings such as multi-layer neural networks (ResNets for example), autoregressive models (with next-token generation dynamics), generative models, and recurrent neural networks in theoretical neuroscience. In this work, we introduce a theoretically grounded class of models for studying neural ODEs trained via online stochastic gradient descent. We solve the training dynamics of these models via dynamical mean field theory and derive learning curves in the high-dimensional limit.


Diffusion-Network Alignment: An Efficient Algorithm and Explicit Probability Bounds

arXiv.org Machine Learning

This paper studies a variation of the classic network alignment problem, named diffusion-network alignment. The goal is to align the vertices of a rooted diffusion tree to the vertices of a network, where the diffusion tree could be from a communication trace or contact tracing, and the network could be an online or offline social network. Different from the classic network alignment where both networks are fully observed, this model captures the information asymmetry of two networks. To solve this problem, this paper presents an efficient algorithm based on tree correlation tests to extract alignment information from local neighborhoods. We analyze the performance of the algorithm in the sparse graph regime and show that with high probability, all matched pairs are correct. Furthermore, for each vertex on the diffusion tree, this paper establishes an explicit lower bound on the probability that the vertex is correctly matched. These lower bounds are depth-dependent and increase as vertices get closer to the root.


Counterfactual Explanations for Deep Two-Sample Testing

arXiv.org Machine Learning

Two-sample testing is a fundamental tool for detecting distributional differences across scientific domains, but classical tests (including kernel-based tests) can be ineffective on high-dimensional structured data such as images. Recent deep two-sample tests improve sensitivity in these settings by learning informative representations, yet they provide limited insight into which data features drive rejection of the null hypothesis $H_0$. To address this issue, we propose a counterfactual explanation framework for deep two-sample testing that generates sample-level edits moving observations from a source group toward a target group while explicitly reducing the discrepancy measured by the test. Our method combines a diffusion autoencoder with a pretrained deep two-sample test model and optimizes a maximum mean discrepancy (MMD) objective in the test model's representation space to produce plausible counterfactuals. We quantify distribution-level effects through changes in the test statistic and the resulting two-sample p-values. We evaluate the method on synthetic 2D shape datasets and two MRI cohorts. Across both settings, the counterfactual transformations consistently increase p-values relative to the original samples, indicating that the edited source set becomes statistically closer to the target distribution under the test. We measure minimality using LPIPS to ensure the counterfactuals remain close to the original samples. The resulting edits provide interpretable evidence of the features associated with the detected group differences. On MRI, the localized changes are consistent with known anatomical differences between cohorts.
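
The abstract's central quantity, a maximum mean discrepancy (MMD) between two groups and the p-value that results from it, can be illustrated with a minimal sketch. This is not the authors' method: it uses a plain Gaussian kernel on raw 2D points rather than a learned representation, the "edit" is a hypothetical shift that simply undoes a known offset, and the function names, bandwidth, and sample sizes are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

def permutation_p_value(x, y, n_permutations=200, bandwidth=1.0, seed=0):
    """Permutation p-value: how often a random relabelling matches the observed MMD."""
    rng = np.random.default_rng(seed)
    observed = mmd2_biased(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        stat = mmd2_biased(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
        count += int(stat >= observed)
    return (count + 1) / (n_permutations + 1)

# Toy data (assumed): the source group is the target distribution shifted by 2.
rng = np.random.default_rng(0)
target = rng.normal(size=(100, 2))
source = rng.normal(size=(100, 2)) + 2.0
edited = source - 2.0  # hypothetical "counterfactual" edit that removes the shift

mmd_before = mmd2_biased(source, target)
mmd_after = mmd2_biased(edited, target)
p_before = permutation_p_value(source, target)
p_after = permutation_p_value(edited, target)
print(f"MMD^2 before: {mmd_before:.4f}, after: {mmd_after:.4f}")
print(f"p-value before: {p_before:.4f}, after: {p_after:.4f}")
```

In this toy setting the edit lowers the MMD estimate and the p-value does not decrease, which mirrors the direction of the effect the abstract reports, though on far simpler data.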


Physics-Informed Neural Networks for Chemotherapy Pharmacokinetics: Benchmarking the Clinical Estimator and Exposing Parameter Identifiability

arXiv.org Machine Learning

Physics-Informed Neural Networks (PINNs) are an attractive tool for partial-observation problems in biology, where the governing dynamics are known but some compartments cannot be measured. Chemotherapy pharmacokinetics (PK) is a clean instance: drug concentration in plasma is routinely measured, but concentration in tissue -- which determines tumour kill and off-target toxicity -- is not. We benchmark a PINN against the standard clinical baseline (nonlinear least-squares on the analytical biexponential plasma solution, hereafter NLS) and a physics-agnostic neural baseline (a data-only MLP) on two PK problems. On the linear two-compartment problem, NLS is near-optimal; the PINN matches it to within a small constant factor while also producing the tissue curve in a single training pass, whereas the data-only MLP fails on tissue by roughly 10x. On a Michaelis-Menten extension (saturable elimination), the biexponential closed form no longer exists, so NLS is mis-specified and silently returns meaningless rate constants. The PINN instead exposes a deeper fact: the Michaelis-Menten two-compartment model is non-identifiable from plasma alone, and the PINN reports this honestly by converging to a basin with k12 -> 0. Adding two sparse tissue observations largely resolves identifiability: across five seeds the PINN recovers k21 to within 1% of truth and Vmax, Km to within one standard-deviation bar, while k12 moves in the correct direction (0.02 -> 0.82) but remains ~2 sigma below truth -- a recovery the closed-form NLS estimator cannot attempt at all, because its biexponential ansatz describes only plasma. Our claim is not that PINNs beat NLS. It is that PINNs offer a uniform recipe that ties the textbook estimator on the textbook problem, exposes structural identifiability that the textbook estimator hides, and absorbs heterogeneous measurements within a single loss.
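
The claim that the biexponential closed form holds for the linear two-compartment problem but not for the Michaelis-Menten extension can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's benchmark: it contains no PINN and no NLS fit, it assumes an intravenous bolus into the plasma compartment with an empty tissue compartment at time zero, and the rate constants, dose, and the `k10` elimination-rate name are hypothetical.

```python
import numpy as np

def linear_rhs(state, k10, k12, k21):
    """Linear two-compartment model: first-order elimination from plasma."""
    plasma, tissue = state
    return np.array([
        -(k10 + k12) * plasma + k21 * tissue,
        k12 * plasma - k21 * tissue,
    ])

def michaelis_menten_rhs(state, vmax, km, k12, k21):
    """Same exchange terms, but saturable (Michaelis-Menten) elimination."""
    plasma, tissue = state
    elimination = vmax * plasma / (km + plasma)
    return np.array([
        -elimination - k12 * plasma + k21 * tissue,
        k12 * plasma - k21 * tissue,
    ])

def rk4(rhs, state0, times, **params):
    """Classical fourth-order Runge-Kutta integration on a fixed time grid."""
    states = np.empty((len(times), len(state0)))
    states[0] = state0
    for i in range(len(times) - 1):
        h = times[i + 1] - times[i]
        s = states[i]
        k1 = rhs(s, **params)
        k2 = rhs(s + 0.5 * h * k1, **params)
        k3 = rhs(s + 0.5 * h * k2, **params)
        k4 = rhs(s + h * k3, **params)
        states[i + 1] = s + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return states

def biexponential_plasma(times, dose, k10, k12, k21):
    """Closed-form plasma curve of the linear model after a bolus dose."""
    total = k10 + k12 + k21
    disc = np.sqrt(total**2 - 4.0 * k10 * k21)
    alpha = 0.5 * (total + disc)  # fast rate
    beta = 0.5 * (total - disc)   # slow rate
    a = dose * (alpha - k21) / (alpha - beta)
    b = dose * (k21 - beta) / (alpha - beta)
    return a * np.exp(-alpha * times) + b * np.exp(-beta * times)

# Hypothetical parameter values, chosen only for illustration.
times = np.linspace(0.0, 10.0, 1001)
dose, k10, k12, k21 = 1.0, 0.3, 0.5, 0.2
start = np.array([dose, 0.0])

closed_form = biexponential_plasma(times, dose, k10, k12, k21)
linear = rk4(linear_rhs, start, times, k10=k10, k12=k12, k21=k21)
saturable = rk4(michaelis_menten_rhs, start, times, vmax=0.3, km=1.0, k12=k12, k21=k21)

linear_error = np.max(np.abs(linear[:, 0] - closed_form))
saturable_gap = np.max(np.abs(saturable[:, 0] - closed_form))
print(f"linear model vs closed form: {linear_error:.2e}")
print(f"Michaelis-Menten model vs closed form: {saturable_gap:.2e}")
```

The linear trajectory agrees with the biexponential formula up to integration error, while the saturable trajectory departs from it, which is the sense in which a biexponential fit is mis-specified on the second problem.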


Towards More General Control of Diffusion Models Using Jeffrey Guidance

arXiv.org Machine Learning

A key strength of diffusion models lies in their flexibility, since their outputs can be controlled at sampling time through guidance. However, beyond simple cases such as conditional sampling, the target distribution is often left implicit, defined only through a sampling rule or a heuristic energy function. To address this, we propose Jeffrey guidance, a principled framework that extends diffusion-model control to applications beyond what standard guidance can express. It leverages Jeffrey's rule of conditioning to update marginal distributions towards a prescribed target, preserving the conditional structure and minimally perturbing the joint distribution. We first demonstrate Jeffrey guidance by targeting a prescribed embedding distribution. With Inception embeddings as the target, this leads to substantial reductions in FID on both CIFAR-10 and FFHQ. We further apply Jeffrey guidance to fairness on CelebA-HQ, updating an unconditional diffusion model to enforce independence between attributes.
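
Jeffrey's rule of conditioning, on which the abstract builds, is easy to state on a small discrete joint distribution: keep the conditional P(x | y) fixed and replace the marginal of y with a prescribed target Q(y). The sketch below shows only that rule on a toy table; it says nothing about how the paper turns it into guidance for a diffusion model, and the function name and numbers are assumptions for illustration.

```python
import numpy as np

def jeffrey_update(joint, target_marginal):
    """Apply Jeffrey's rule to a discrete joint distribution.

    joint[i, j] holds P(x=i, y=j); target_marginal[j] holds Q(y=j).
    Returns P'(x, y) = P(x | y) * Q(y). Assumes every P(y=j) is positive.
    """
    marginal_y = joint.sum(axis=0)
    conditional_x_given_y = joint / marginal_y  # each column sums to 1
    return conditional_x_given_y * target_marginal

# Toy joint distribution (assumed) with marginal P(y) = [0.5, 0.5].
joint = np.array([
    [0.30, 0.10],
    [0.20, 0.40],
])
target = np.array([0.8, 0.2])

updated = jeffrey_update(joint, target)
print(updated)
print("new marginal over y:", updated.sum(axis=0))
```

After the update, the marginal over y equals the target while each column's conditional distribution over x is unchanged, which is the "preserving the conditional structure" property the abstract refers to.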


Why You Might Already Own SpaceX Shares, Siri's AI Makeover, and Knicks Owner's Surveillance Machine

WIRED

Today on, we take an early look at the SpaceX IPO and why you might find yourself among the investors without even realizing it. This week on, our hosts discuss SpaceX officially going public and who will benefit the most from it, as well as Apple's WWDC and the brand-new release of Siri AI. They also get into how Meta removed a face-recognition feature after a WIRED report exposed it--and later in the show: an investigation into how New York Knicks owner James Dolan created an extensive surveillance system inside all of his Madison Square Garden properties. Write to us at [email protected]. You can always listen to this week's podcast through the audio player on this page, but if you want to subscribe for free to get every episode, here's how: If you're on an iPhone or iPad, open the app called Podcasts, or just tap this link. Before we start, two quick things. If you've been enjoying listening to the show, we would appreciate it if you took a second to rate it in your app of choice. It really helps us reach more people. Second, if you have any questions related to tech, privacy, or politics that you would like me, Zoë, and Leah to take on, now is the time to submit them to [email protected]. It doesn't matter how big or how small, we want to hear from you and get you answers. I'm a little tired, but it's because I got to see Lionel Messi play soccer last night and score a goal on a penalty kick. It was a friendly of Argentina versus Iceland. You'll never guess who won. Is that an obvious thing? It's far from their first attempt, but it's going to stick this time. We're also taking an early look at the SpaceX IPO this week, which is slated to become the world's largest IPO of all time. We'll get into who is slated to benefit the most. Elon Musk, who is already the world's richest man, but on track to become even richer and why you might find yourself among the investors without even realizing it.
And in case you missed it, WIRED reporters recently uncovered that Meta had silently embedded code that would power a face-recognition system for its smart glasses in the Meta AI app on millions of people's phones.


Musk's $1.8 trillion SpaceX IPO could be 'highly undesirable' for some

Al Jazeera

SpaceX is expected to debut on the United States' public markets on Friday in what will be the largest initial public offering (IPO). Artificial intelligence (AI) giants OpenAI and Anthropic are also widely expected to go public soon, and thanks to a new rule change by tech stock exchange Nasdaq, individual investors could own stock in these companies when they go public, as soon as 15 business days after the first trading day. SpaceX's IPO is generating buzz among retail investors. The Elon Musk-led company is expected to allocate 20 percent of shares to retail investors and has drawn roughly $70bn in orders, according to the Reuters news agency. Historically, there is a waiting period between when a company goes public and when it is listed on the Nasdaq-100 index and/or S&P 500.


PSI: A Benchmark for Human Interpretation and Response in Traffic Interactions

Neural Information Processing Systems

Accurately modeling pedestrian intention and understanding driver decision-making processes are critical for the development of safe and socially aware autonomous driving systems. However, existing datasets primarily emphasize observable behavior, offering limited insight into the underlying causal reasoning that informs human interpretation and response during traffic interactions. To address this gap, we introduce PSI, a benchmark dataset that captures the dynamic evolution of pedestrian crossing intentions from the driver's perspective, enriched with human-annotated textual explanations that reflect the reasoning behind intention estimation and driving decision making. These annotations offer a unique foundation for developing and benchmarking models that combine predictive performance with interpretable and human-aligned reasoning. PSI supports standardized tasks and evaluation protocols across multiple dimensions, including pedestrian intention prediction, driver decision modeling, reasoning generation, trajectory forecasting, and more. By enabling causal and interpretable evaluation, PSI advances research toward autonomous systems that can reason, act, and explain in alignment with human cognitive processes.


SensorLM: Learning the Language of Wearable Sensors

Neural Information Processing Systems

We present SensorLM, a family of sensor-language foundation models that enable wearable sensor data understanding with natural language. Despite its pervasive nature, aligning and interpreting sensor data with language remains challenging due to the lack of paired, richly annotated sensor-text descriptions in uncurated, real-world wearable data. We introduce a hierarchical caption generation pipeline designed to capture statistical, structural, and semantic information from sensor data. This approach enabled the curation of the largest sensor-language dataset to date, comprising over 59.7 million hours of data from more than 103,000 people. Furthermore, SensorLM extends prominent multimodal pretraining architectures (e.g., CLIP, CoCa) and recovers them as specific variants within a generic architecture. Extensive experiments on real-world tasks in human activity analysis and healthcare verify the superior performance of SensorLM over state-of-the-art in zero-shot recognition, few-shot learning, and cross-modal retrieval. SensorLM also demonstrates intriguing capabilities including scaling behaviors, label efficiency, sensor captioning, and zero-shot generalization to unseen tasks.


'Hands Off Our NHS': Anti-Palantir Protests Break Out in UK Over Deal With National Health Service

WIRED

Crowding the gates of a major health care conference, protesters called for Palantir to be booted out of the UK's National Health Service over privacy concerns and political grievances. Protesters wearing hospital gowns and wielding signs gathered outside a UK health care conference on Thursday to object to a deal between the country's National Health Service and American software company Palantir. At 8 am local time, the group, around 80 people in total, crowded the entryway to the NHS ConfedExpo in Manchester. They wanted to appeal to NHS leadership to terminate a contract worth up to $440 million over concerns around national security, data privacy, and the company's political affiliations. The contract, which includes access to Palantir's data analytics and artificial intelligence services, is intended to run until 2031 but includes a break clause that permits the government to withdraw the agreement next February.