
Latent Process Generator Matching

arXiv.org Machine Learning

A related situation arises when an auxiliary process is introduced to aid training but modelling its dynamics at generation time is unnecessary or difficult, as in Billera et al. [2025b] and Kim et al. [2025]. In each of these works, the projection result and its associated loss are derived on a case-by-case basis, and all theorems are restricted to marginalisation over a discrete component of the extended state space. We introduce a general framework that removes these restrictions: given a time-inhomogeneous Feller process (Y_t)_{0 ≤ t ≤ 1} on an arbitrary state space Y and a map Φ: Y → X, one may learn a linear parametrisation of the generator of a Feller process on X whose one-time marginals coincide with those of (Φ(Y_t))_{0 ≤ t ≤ 1}. For Y = X × Z and Φ the projection onto the first coordinate, this subsumes these prior works as special cases, allowing for a general class of latent processes (Z_t)_{0 ≤ t ≤ 1} in a nearly arbitrary state space Z, using the formalism of generator matching to allow for continuous, discrete, or manifold-valued processes. In particular, the learnt process at t = 1 samples from the distribution of Φ(Y_1), which is the desired data distribution. We give sufficient conditions for a loss function to be valid in this general setting, recovering the results of the works cited above as corollaries. This result has broad applicability, enabling the construction of a wide array of new flow matching schemes by allowing for a more general class of latent spaces. As a concrete new application, we outline a non-projection map Φ: Y → X with manifold-valued latents for protein structure generation that separates chain-level rigid-body motion from internal flexibility (§4): the split into chain-level and residue-level (internal) state is latent, and the model only sees the world state. We plan to implement this in future work.

2 EARLIER WORK

Several recent generative models train with the aid of a latent stochastic process that is marginalised out at generation time.


Disease Is a Spectral Perturbation

arXiv.org Machine Learning

We propose a novel method of understanding disease transformation from a healthy baseline with biomarker-level explainability. By modeling the biomarker covariance matrices of healthy controls and disease states, the perturbation can be individually characterized to accomplish mechanistic explanations of disease trajectories, both at a molecular level and for individual patients. Given a cohort of n patients each measured on p biomarkers, we define the biomarker "Hamiltonian" H = X^T X / n \in R^{p \times p}, where X \in R^{n \times p} is the covariant biomarker matrix. The eigenvectors of H define a set of normal modes of biomarker coordination, and the eigenvalues quantify the energy carried by each mode. In the healthy state, the reference Hamiltonian H_0 governs this structure; disease perturbs H_0 by an additive operator ΔH, shifting eigenvalues and rotating eigenvectors in proportion to the severity of pathological disruption. We formalize this framework, derive the spectral change given a disease perturbation, and demonstrate that the projection of a newly diagnosed patient's cumulative biomarker covariance structure onto disease-discriminant eigenmodes constitutes an optimal prognostic statistic for greater precision in disease prognosis. This work serves as a veritable white paper with application across a panoply of disease frameworks from cancer to neurodegenerative disorders.
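The construction described in this abstract can be sketched numerically. The following is a minimal illustration on synthetic data, not the authors' implementation: the cohorts, the column-centring step, the use of first-order perturbation theory for the eigenvalue shifts, and the choice to project a single patient profile onto the healthy eigenmodes are all assumptions made here for the sake of a runnable example.

```python
import numpy as np

def hamiltonian(X):
    """Return H = X^T X / n for an (n, p) biomarker matrix X."""
    n = X.shape[0]
    return X.T @ X / n

rng = np.random.default_rng(0)
n, p = 500, 4

# ASSUMPTION: synthetic, column-centred cohorts stand in for real data.
healthy = rng.standard_normal((n, p))
healthy -= healthy.mean(axis=0)
disease = rng.standard_normal((n, p))
disease[:, 0] += 0.8 * disease[:, 1]   # couple two biomarkers in "disease"
disease -= disease.mean(axis=0)

H0 = hamiltonian(healthy)              # reference Hamiltonian H_0
H1 = hamiltonian(disease)              # perturbed Hamiltonian H_0 + dH
dH = H1 - H0                           # additive perturbation

# eigh returns ascending eigenvalues and orthonormal eigenvectors (columns).
evals0, modes0 = np.linalg.eigh(H0)
evals1, _ = np.linalg.eigh(H1)

# First-order perturbation theory: the shift of mode k is roughly v_k^T dH v_k.
first_order_shift = np.einsum("ik,ij,jk->k", modes0, dH, modes0)

# ASSUMPTION: a patient "score" is the projection of one biomarker profile
# onto the healthy modes; the abstract's statistic may be defined differently.
patient = disease[0]
scores = modes0.T @ patient

print("healthy eigenvalues:", np.round(evals0, 3))
print("disease eigenvalues:", np.round(evals1, 3))
print("first-order shifts: ", np.round(first_order_shift, 3))
```

Because the healthy modes form an orthonormal basis, the first-order shifts sum to the trace of ΔH, which offers a quick sanity check on the decomposition.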



Improved Guarantees for Offline Stochastic Matching via New Ordered Contention Resolution Schemes

Neural Information Processing Systems

Matching is one of the most fundamental and broadly applicable problems across many domains. In these diverse real-world applications, there is often a degree of uncertainty in the input which has led to the study of stochastic matching models. Here, each edge in the graph has a known, independent probability of existing derived from some prediction. Algorithms must probe edges to determine existence and match them irrevocably if they exist. Further, each vertex may have a patience constraint denoting how many of its neighboring edges can be probed. We present new ordered contention resolution schemes yielding improved approximation guarantees for some of the foundational problems studied in this area. For stochastic matching with patience constraints in general graphs, we provide a 0.382-approximate algorithm, significantly improving over the previous best 0.31-approximation of Baveja et al. (2018). When the vertices do not have patience constraints, we describe a 0.432-approximate random order probing algorithm with several corollaries such as an improved guarantee for the Prophet Secretary problem under Edge Arrivals. Finally, for the special case of bipartite graphs with unit patience constraints on one of the partitions, we show a 0.632-approximate algorithm that improves on the recent 1/3-guarantee of Hikima et al. (2021).
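To make the probing model concrete, here is a minimal sketch of probe-and-commit with patience constraints. It is not the ordered contention resolution scheme of the paper: the probing order is supplied by the caller, and the example uses a naive "highest probability first" order. The function name `probe_and_commit` and the toy graph are hypothetical.

```python
import random

def probe_and_commit(edges, patience, order, rng):
    """Probe edges in the given order; match an edge irrevocably if it exists.

    edges:    dict mapping (u, v) -> known existence probability
    patience: dict mapping vertex -> max number of incident edges probed
    order:    sequence of edges giving the probing order
    rng:      random.Random instance used to realise edge existence
    """
    remaining = dict(patience)
    matched = set()
    matching = []
    for u, v in order:
        if u in matched or v in matched:
            continue                      # an endpoint is already matched
        if remaining[u] == 0 or remaining[v] == 0:
            continue                      # an endpoint has run out of patience
        remaining[u] -= 1
        remaining[v] -= 1
        if rng.random() < edges[(u, v)]:  # the probe reveals the edge exists
            matching.append((u, v))       # commit irrevocably
            matched.update((u, v))
    return matching

# Toy instance (illustrative values only).
edges = {("a", "x"): 0.5, ("a", "y"): 0.9, ("b", "x"): 0.7}
patience = {"a": 1, "b": 1, "x": 2, "y": 1}
greedy_order = sorted(edges, key=edges.get, reverse=True)  # naive heuristic
print(probe_and_commit(edges, patience, greedy_order, random.Random(0)))
```

With patience 1 at vertex `a`, a failed probe of one of its edges rules out probing the other, which is exactly the tension the approximation guarantees have to account for.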


Learnability of Linear Thresholds from Label Proportions

Neural Information Processing Systems

We study the problem of properly learning linear threshold functions (LTFs) in the learning from label proportions (LLP) framework. In this, the learning is on a collection of bags of feature-vectors with only the proportion of labels available for each bag. First, we provide an algorithm that, given a collection of such bags each of size at most two whose label proportions are consistent with (i.e., the bags are satisfied by) an unknown LTF, efficiently produces an LTF that satisfies at least (2/5)-fraction of the bags. If all the bags are non-monochromatic (i.e., bags of size two with differently labeled feature-vectors) the algorithm satisfies at least (1/2)-fraction of them. For the special case of OR over the d-dimensional boolean vectors, we give an algorithm which computes an LTF achieving an additional Ω(1/d) in accuracy for the two cases.
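The notion of an LTF "satisfying" a bag can be illustrated with a short sketch. This only evaluates a given LTF against bags and their label proportions; it is not the learning algorithm from the paper. The helper names, the strict threshold convention (label 1 when the score is positive), and the toy bags are assumptions.

```python
def ltf(w, b, x):
    """Label of x under the LTF: 1 if <w, x> + b > 0, else 0.

    ASSUMPTION: a strict threshold at zero; the paper's convention may differ.
    """
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def satisfies(w, b, bag, proportion):
    """True if the LTF's fraction of 1-labels in the bag equals the given one."""
    predicted = sum(ltf(w, b, x) for x in bag) / len(bag)
    return abs(predicted - proportion) < 1e-9

def satisfied_fraction(w, b, bags):
    """Fraction of (bag, proportion) pairs satisfied by the LTF."""
    return sum(satisfies(w, b, bag, prop) for bag, prop in bags) / len(bags)

# Toy bags of size at most two, each paired with its proportion of 1-labels.
bags = [
    ([(1, 0), (0, 1)], 0.5),   # non-monochromatic bag
    ([(1, 1)], 1.0),           # singleton bag labelled 1
    ([(0, 0), (1, 0)], 0.5),   # non-monochromatic bag
]

print(satisfied_fraction((1, -1), 0.0, bags))      # satisfies two of three bags
print(satisfied_fraction((1, 0.5), -0.75, bags))   # satisfies all three bags
```

Only the proportion is checked, so a bag can be satisfied even when the learner cannot tell which of its two feature-vectors carries the positive label.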



Efficient Equivariant Network Supplementary Materials: A. MNIST-rot Model Architecture

Neural Information Processing Systems

Please refer to Table 5. Table 5: Architecture of E4-Net on MNIST-rot classification; p denotes the dropout rate. The hyperparameters we use in this architecture are kernel size k = 5, reduction ratio r = 1, and number of slices s = 2. In the large model, we increase the channel dimension to 24, the number of slices to 12, and the reduction ratio to 2, and keep the other hyperparameters the same. We take ResNet-18 [2], which is composed of an initial convolution layer, followed by four stages of Res-Blocks and one final classification layer.



Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits

Neural Information Processing Systems

This study aims to develop bandit algorithms that automatically exploit tendencies of certain environments to improve performance, without any prior knowledge regarding the environments. We first propose an algorithm for combinatorial semi-bandits with a hybrid regret bound that includes two main features: a best-of-three-worlds guarantee and multiple data-dependent regret bounds. The former means that the algorithm will work nearly optimally in all environments in an adversarial setting, a stochastic setting, or a stochastic setting with adversarial corruptions. The latter implies that, even if the environment is far from exhibiting stochastic behavior, the algorithm will perform better as long as the environment is "easy" in terms of certain metrics. The metrics w.r.t. the easiness referred to in this paper include cumulative loss for optimal actions, total quadratic variation of losses, and path-length of a loss sequence. We also show hybrid data-dependent regret bounds for adversarial linear bandits, which include a first path-length regret bound that is tight up to logarithmic factors.
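The three "easiness" metrics named in this abstract can be computed for a given loss sequence as in the sketch below. The definitions are common textbook forms and are assumptions here, since the abstract does not state them: cumulative loss of the best fixed action, quadratic variation as squared Euclidean deviation from the mean loss vector, and path-length as the summed L1 distance between consecutive loss vectors. The paper's norms and normalisations may differ.

```python
def cumulative_loss(losses, action):
    """Total loss sum_t <l_t, a> of a fixed action a (a 0/1 vector)."""
    return sum(sum(li * ai for li, ai in zip(l, action)) for l in losses)

def best_action_loss(losses, actions):
    """Cumulative loss of the best fixed action in hindsight."""
    return min(cumulative_loss(losses, a) for a in actions)

def quadratic_variation(losses):
    """ASSUMED form: sum_t ||l_t - mean||_2^2 with mean the average loss."""
    T, d = len(losses), len(losses[0])
    mean = [sum(l[i] for l in losses) / T for i in range(d)]
    return sum((l[i] - mean[i]) ** 2 for l in losses for i in range(d))

def path_length(losses):
    """ASSUMED form: sum_t ||l_{t+1} - l_t||_1 over consecutive rounds."""
    return sum(
        abs(bi - ai)
        for a, b in zip(losses, losses[1:])
        for ai, bi in zip(a, b)
    )

# Toy loss sequence over d = 2 base arms and two single-arm actions.
losses = [(0.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
actions = [(1, 0), (0, 1)]

print(best_action_loss(losses, actions))
print(quadratic_variation(losses))
print(path_length(losses))
```

A sequence that rarely changes has a small path-length even when it is far from stochastic, which is the kind of structure a data-dependent bound can exploit.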