 application


I've spent 30 years in recruitment - this is how to get a job

BBC News

If you've sent off dozens of job applications and heard nothing back, the silence can be as infuriating as a rejection. Part of the problem is the shrinking number of entry-level jobs. Reed, the recruitment firm, says that graduate vacancies on its website have fallen from around 180,000 three or four years ago to 50,000. James Reed, chair and chief executive of Reed, has spent 30 years watching how employers make decisions and, like many, is frustrated at how difficult the process has become. Here, the recruitment veteran gives some pointers on how to get noticed in a tough jobs market.


Differentiable Generalized Sliced Wasserstein Plans

Neural Information Processing Systems

Optimal Transport (OT) has attracted significant interest in the machine learning community, not only for its ability to define meaningful distances between probability distributions - such as the Wasserstein distance - but also for its formulation of OT plans. Its computational complexity remains a bottleneck, though, and slicing techniques have been developed to scale OT to large datasets. Recently, a novel slicing scheme, dubbed min-SWGG, lifts a single one-dimensional plan back to the original multidimensional space, finally selecting the slice that yields the lowest Wasserstein distance as an approximation of the full OT plan. Despite its computational and theoretical advantages, min-SWGG inherits typical limitations of slicing methods: (i) the number of required slices grows exponentially with the data dimension, and (ii) it is constrained to linear projections. Here, we reformulate min-SWGG as a bilevel optimization problem and propose a differentiable approximation scheme to efficiently identify the optimal slice, even in high-dimensional settings. We furthermore define its generalized extension to accommodate data living on manifolds. Finally, we demonstrate the practical value of our approach in various applications, including gradient flows on manifolds and high-dimensional spaces, as well as a novel sliced OT-based conditional flow matching for image generation - where fast computation of transport plans is essential.
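To make the "lift a one-dimensional plan" idea concrete, here is a minimal sketch for two equal-size point clouds with uniform weights. It matches points by the rank of their projections onto a direction, evaluates that matching with the squared Euclidean cost in the original space, and keeps the best of several random directions. The random search is a stand-in for illustration only (the abstract proposes a differentiable scheme instead), and all function names are hypothetical.

```python
import math
import random


def lifted_plan_cost(xs, ys, theta):
    """Match points by rank of their projection on theta, then score the
    matching with the squared Euclidean cost in the original space."""
    def project(point):
        return sum(a * b for a, b in zip(point, theta))

    order_x = sorted(range(len(xs)), key=lambda i: project(xs[i]))
    order_y = sorted(range(len(ys)), key=lambda j: project(ys[j]))
    pairs = list(zip(order_x, order_y))  # i-th smallest x goes to i-th smallest y
    cost = sum(
        sum((a - b) ** 2 for a, b in zip(xs[i], ys[j])) for i, j in pairs
    ) / len(xs)
    return cost, pairs


def random_direction(dim, rng):
    """Draw a direction uniformly on the unit sphere (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def min_swgg_random_search(xs, ys, n_directions, seed=0):
    """Assumed baseline: keep the direction whose lifted plan is cheapest."""
    rng = random.Random(seed)
    dim = len(xs[0])
    best = None
    for _ in range(n_directions):
        theta = random_direction(dim, rng)
        cost, pairs = lifted_plan_cost(xs, ys, theta)
        if best is None or cost < best[0]:
            best = (cost, pairs, theta)
    return best
```

Because every lifted matching is a valid transport plan, the returned cost is an upper bound on the squared 2-Wasserstein distance between the two empirical measures; the search over directions only tightens that bound.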


Small Resamples, Sharp Guarantees: Convergence Rates for Resampled Studentized Quantile Estimators

Neural Information Processing Systems

The m-out-of-n bootstrap, proposed by Bickel et al. [1992], approximates the distribution of a statistic by repeatedly drawing subsamples of size m (m ≪ n) without replacement from an original sample of size n; it is now routinely used for robust inference with heavy-tailed data, bandwidth selection, and other large-sample applications. Despite this broad applicability across econometrics, biostatistics, and machine-learning workflows, rigorous parameter-free guarantees for the soundness of the m-out-of-n bootstrap when estimating sample quantiles have remained elusive. This paper establishes such guarantees by analysing the estimator of sample quantiles obtained from m-out-of-n resampling of a dataset of length n. We first prove a central limit theorem for a fully data-driven version of the estimator that holds under a mild moment condition and involves no unknown nuisance parameters. We then show that the moment assumption is essentially tight by constructing a counter-example in which the CLT fails. Strengthening the assumptions slightly, we derive an Edgeworth expansion that delivers exact convergence rates and, as a corollary, a Berry-Esseen bound on the bootstrap approximation error. Finally, we illustrate the scope of our results by obtaining parameter-free asymptotic distributions for practical statistics, including the quantiles for random walk MH and rewards of ergodic MDPs, thereby demonstrating the usefulness of our theory in modern estimation and learning tasks.
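The resampling step described above can be sketched in a few lines. This is a plain, unstudentized illustration: it draws size-m subsamples without replacement and records the sample quantile of each. The quantile convention (order statistic at rank ceil(p·m)) and the function names are assumptions, and the studentization analysed in the paper is not implemented here.

```python
import math
import random


def sample_quantile(values, p):
    """Empirical p-quantile: the order statistic at rank ceil(p * len(values))."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


def m_out_of_n_quantiles(data, m, p, n_resamples, seed=0):
    """Draw n_resamples subsamples of size m without replacement and
    return the p-quantile of each one."""
    rng = random.Random(seed)
    return [sample_quantile(rng.sample(data, m), p) for _ in range(n_resamples)]


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    medians = m_out_of_n_quantiles(data, m=50, p=0.5, n_resamples=200)
    print(min(medians), max(medians))
```

The spread of the returned values gives a rough picture of the resampling distribution of the quantile; any confidence statement built on it would need the centring and scaling that the paper studies.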


Acceleration via silver stepsize on Riemannian manifolds with applications to Wasserstein space

Neural Information Processing Systems

There is extensive literature on accelerating first-order optimization methods in a Euclidean setting. Under which conditions such acceleration is feasible in Riemannian optimization problems is an active area of research. Motivated by the recent success of silver stepsize methods in the Euclidean setting, we undertake a study of such algorithms in the Riemannian setting. We provide a new class of algorithms determined by the choice of vector transport that allows the silver stepsize acceleration on Riemannian manifolds for the function classes associated with the corresponding vector transport. As a core application, we show that our algorithm recovers the standard Wasserstein gradient descent on the 2-Wasserstein space and, as a result, provides the first provable accelerated gradient method for potential functional optimization problems in the Wasserstein space.
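The abstract does not spell out the silver stepsize schedule. As a rough illustration in the Euclidean setting only, the sketch below uses the form commonly stated in the silver-stepsize literature for smooth convex problems, h_t = 1 + ρ^(ν(t) - 1) for t = 1, 2, ..., with ρ = 1 + √2 and ν(t) the 2-adic valuation of t. That formula, the function names, and the toy quadratic are assumptions for illustration; this is not the Riemannian algorithm of the paper.

```python
import math

RHO = 1.0 + math.sqrt(2.0)  # the silver ratio


def two_adic_valuation(i):
    """Largest v such that 2**v divides the positive integer i."""
    v = 0
    while i % 2 == 0:
        i //= 2
        v += 1
    return v


def silver_stepsizes(n_steps):
    """Assumed schedule h_t = 1 + RHO**(nu(t) - 1) for t = 1..n_steps."""
    return [1.0 + RHO ** (two_adic_valuation(t) - 1) for t in range(1, n_steps + 1)]


def gradient_descent(grad, x0, smoothness, n_steps):
    """Plain gradient descent with steps h_t / smoothness."""
    x = list(x0)
    for h in silver_stepsizes(n_steps):
        g = grad(x)
        x = [xi - (h / smoothness) * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    # Toy quadratic f(x) = 0.5 * (x0**2 + 0.1 * x1**2), smoothness constant 1.
    grad = lambda x: [x[0], 0.1 * x[1]]
    print(gradient_descent(grad, [1.0, 1.0], smoothness=1.0, n_steps=7))
```

Individual steps in this schedule can exceed 2/L, so single iterations may increase the objective; the schedule is usually analysed over horizons of length 2^k - 1, which is why the example runs 7 steps.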


The Temporal Graph of Bitcoin Transactions

Neural Information Processing Systems

Since its 2009 genesis block, the Bitcoin network has processed >1.08 billion (B) transactions representing >8.72B BTC, offering rich potential for machine learning (ML); yet its pseudonymity and the obscured flow of funds inherent in its UTxO-based design have rendered this data largely inaccessible for ML research. Addressing this gap, we present an ML-compatible graph modeling Bitcoin's economic topology by reconstructing the flow of funds. This temporal, heterogeneous graph encompasses the complete transaction history up to block 863000, consisting of >2.4B nodes and >39.72B edges. Additionally, we provide custom sampling methods yielding node and edge feature vectors of sampled communities, tools to load and analyze the Bitcoin graph data within specialized graph databases, and ready-to-use database snapshots. This comprehensive dataset and toolkit empower the ML community to tackle Bitcoin's intricate ecosystem at scale, driving progress in applications such as anomaly detection, address classification, market analysis, and large-scale graph ML benchmarking.


Vertical Federated Feature Screening

Neural Information Processing Systems

With the rapid development of the big data era, Vertical Federated Learning (VFL) has been widely applied to enable data collaboration while ensuring privacy protection. However, the ultrahigh dimensionality of features and the sparse data structures inherent in large-scale datasets introduce significant computational complexity. In this paper, we propose the Vertical Federated Feature Screening (VFS) algorithm, which effectively reduces computational, communication, and encryption costs. VFS is a two-stage feature screening procedure that proceeds from coarse to fine: the first stage quickly filters out irrelevant feature groups, followed by a more refined screening of individual features. It significantly reduces the resource demands of downstream tasks such as secure joint modeling or federated feature selection. This efficiency is particularly beneficial in scenarios with ultrahigh feature dimensionality or severe class imbalance in the response variable. The statistical and computational properties of VFS are rigorously established. Numerical simulations and real-world applications demonstrate its superior performance.


PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation

Neural Information Processing Systems

Physiological signals are often corrupted by motion artifacts, baseline drift, and other low-SNR disturbances, which pose significant challenges for analysis. Additionally, these signals exhibit strong non-stationarity, with sharp peaks and abrupt changes that evolve continuously, making them difficult to represent using traditional time-domain or filtering methods. To address these issues, a novel wavelet-based approach for physiological signal analysis is presented, aiming to capture multi-scale time-frequency features in various physiological signals. Leveraging this technique, two large-scale pretrained models specific to EMG and ECG are introduced for the first time, achieving superior performance and setting new baselines in downstream tasks. Additionally, a unified multi-modal framework is constructed by integrating a pretrained EEG model, where each modality is guided through its dedicated branch and fused via learnable weighted fusion. This design effectively addresses challenges such as low signal-to-noise ratio, high inter-subject variability, and device mismatch, outperforming existing methods on multi-modal tasks. The proposed wavelet-based architecture lays a solid foundation for analysis of diverse physiological signals, while the multi-modal design points to next-generation physiological signal processing with potential impact on wearable health monitoring, clinical diagnostics, and broader biomedical applications.


Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward

Neural Information Processing Systems

We study how to fine-tune LLMs using user-edit deployment data consisting of a context, an agent's response, and user edits. This deployment data is naturally generated by users in applications such as LLM-based writing assistants and coding agents. The natural origin of user edits makes them a desirable source for adapting and personalizing LLMs. In this setup, there emerges a unification of various feedback types, namely preferences, supervised labels, and cost, that are typically studied separately in the literature. In this paper, we initiate the theoretical investigation of learning from user edits.


Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms

Neural Information Processing Systems

Discrete diffusion models have emerged as a powerful generative modeling framework for discrete data with successful applications spanning from text generation to image synthesis. However, their deployment faces challenges due to the high dimensionality of the state space, necessitating the development of efficient inference algorithms. Current inference approaches mainly fall into two categories: exact simulation and approximate methods such as τ-leaping. While exact methods suffer from unpredictable inference time and redundant function evaluations, τ-leaping is limited by its first-order accuracy. In this work, we advance the latter category by tailoring the first extension of high-order numerical inference schemes to discrete diffusion models, enabling larger step sizes while reducing error. We rigorously analyze the proposed schemes and establish the second-order accuracy of the θ-Trapezoidal method in KL divergence. Empirical evaluations on GSM8K-level math-reasoning, GPT-2-level text, and ImageNet-level image generation tasks demonstrate that our method achieves superior sample quality compared to existing approaches under equivalent computational constraints, with consistent performance gains across models ranging from 200M to 8B.


Functional data analysis for multivariate distributions through Wasserstein slicing

Neural Information Processing Systems

The modeling of samples of distributions is a major challenge since distributions do not form a vector space. While various approaches exist for univariate distributions, including transformations to a Hilbert space, far less is known about the multivariate case. We utilize a transformation approach to map multivariate distributions to a Hilbert space via a Wasserstein slicing method that is invertible. This approach combines functional data analysis tools, such as functional principal component analysis and modes of variation, with the facility to map back to interpretable distributions. We also provide convergence guarantees for the Hilbert space representations under a broad class of such transforms. The method is illustrated using joint systolic and diastolic blood pressure data.