coupling
Score-basedGenerativeNeuralNetworksfor Large-ScaleOptimalTransport
Comparison of statistical distances can also enable distribution testing, quantification of distribution shifts, and provide methods to correct for distribution shift through domainadaptation[12]. Optimal transport theory provides a rich set of tools for comparing distributions inWasserstein Distance.
Boltzmann Machine Learning with a Parallel, Persistent Markov chain Monte Carlo method for Estimating Evolutionary Fields and Couplings from a Protein Multiple Sequence Alignment
The inverse Potts problem for estimating evolutionary single-site fields and pairwise couplings in homologous protein sequences from their single-site and pairwise amino acid frequencies observed in their multiple sequence alignment would be still one of useful methods in the studies of protein structure and evolution. Since the reproducibility of fields and couplings are the most important, the Boltzmann machine method is employed here, although it is computationally intensive. In order to reduce computational time required for the Boltzmann machine, parallel, persistent Markov chain Monte Carlo method is employed to estimate the single-site and pairwise marginal distributions in each learning step. Also, stochastic gradient descent methods are used to reduce computational time for each learning. Another problem is how to adjust the values of hyperparameters; there are two regularization parameters for evolutionary fields and couplings. The precision of contact residue pair prediction is often used to adjust the hyperparameters. However, it is not sensitive to these regularization parameters. Here, they are adjusted for the fields and couplings to satisfy a specific condition that is appropriate for protein conformations. This method has been applied to eight protein families.
- North America > United States (0.05)
- Asia > Singapore (0.04)
- Asia > Japan (0.04)
Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer
Montreuil, Yannis, Carlier, Axel, Ng, Lai Xing, Ooi, Wei Tsang
Existing multi-expert learning-to-defer surrogates are statistically consistent, yet they can underfit, suppress useful experts, or degrade as the expert pool grows. We trace these failures to a shared architectural choice: casting classes and experts as actions inside one augmented prediction geometry. Consistency governs the population target; it says nothing about how the surrogate distributes gradient mass during training. We analyze five surrogates along both axes and show that each trades a fix on one for a failure on the other. We then introduce a decoupled surrogate that estimates the class posterior with a softmax and each expert utility with an independent sigmoid. It admits an $\mathcal{H}$-consistency bound whose constant is $J$-independent for fixed per-expert weight $β{=}λ/J$, and its gradients are free of the amplification, starvation, and coupling pathologies of the augmented family. Experiments on synthetic benchmarks, CIFAR-10, CIFAR-10H, and Covertype confirm that the decoupled surrogate is the only method that avoids amplification under redundancy, preserves rare specialists, and consistently improves over a standalone classifier across all settings.
- Asia > Singapore (0.04)
- North America > United States (0.04)
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Cross-Spectral Witness for Hidden Nonequilibrium Beyond the Scalar Ceiling
Partial observation is a pervasive obstacle in nonequilibrium physics: coarse graining may absorb hidden forcing into an apparently equilibrium-like reduced description, so a driven system can look reversible through the only variables one can measure. For scalar Gaussian observables of linear stochastic systems, no time-irreversibility statistic can detect the underlying drive. The Lucente--Crisanti ceiling constrains what one channel carries; what two channels carry is a different question, with a sharp closed-form answer. Two simultaneously observed channels retain an off-diagonal cross-spectral sector inaccessible to any scalar reduction; under channel-separable multiplicative structure the observed-channel response factors cancel identically, leaving a closed-form cross-spectral witness controlled only by the hidden spectrum, the loadings, and the innovation scales, strictly positive at every nonzero cross-coupling including at exact timescale coalescence where every scalar reduction is blind. Within general CSM this certifies shared hidden-sector drive; under the additional one-way coupling assumption the witness identifies the total entropy production rate at leading order with a square-root scaling.
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Generating DDPM-based Samples from Tilted Distributions
Mandal, Himadri, Gupta, Dhruman, Gupta, Rushil, Iyer, Sarvesh Ravichandran, Bandyopadhyay, Agniv, Bassamboo, Achal, Gupta, Varun, Juneja, Sandeep
Given $n$ independent samples from a $d$-dimensional probability distribution, our aim is to generate diffusion-based samples from a distribution obtained by tilting the original, where the degree of tilt is parametrized by $θ\in \mathbb{R}^d$. We define a plug-in estimator and show that it is minimax-optimal. We develop Wasserstein bounds between the distribution of the plug-in estimator and the true distribution as a function of $n$ and $θ$, illustrating regimes where the output and the desired true distribution are close. Further, under some assumptions, we prove the TV-accuracy of running Diffusion on these tilted samples. Our theoretical results are supported by extensive simulations. Applications of our work include finance, weather and climate modelling, and many other domains, where the aim may be to generate samples from a tilted distribution that satisfies practically motivated moment constraints.
- Africa > Rwanda > Kigali > Kigali (0.04)
- North America > United States > Utah (0.04)
- North America > United States > New York (0.04)
- (3 more...)
Understanding Behavior Cloning with Action Quantization
Behavior cloning is a fundamental paradigm in machine learning, enabling policy learning from expert demonstrations across robotics, autonomous driving, and generative models. Autoregressive models like transformer have proven remarkably effective, from large language models (LLMs) to vision-language-action systems (VLAs). However, applying autoregressive models to continuous control requires discretizing actions through quantization, a practice widely adopted yet poorly understood theoretically. This paper provides theoretical foundations for this practice. We analyze how quantization error propagates along the horizon and interacts with statistical sample complexity. We show that behavior cloning with quantized actions and log-loss achieves optimal sample complexity, matching existing lower bounds, and incurs only polynomial horizon dependence on quantization error, provided the dynamics are stable and the policy satisfies a probabilistic smoothness condition. We further characterize when different quantization schemes satisfy or violate these requirements, and propose a model-based augmentation that provably improves the error bound without requiring policy smoothness. Finally, we establish fundamental limits that jointly capture the effects of quantization error and statistical complexity.
- North America > United States > Wisconsin > Dane County > Madison (0.40)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Objective and efficient inference for couplings in neuronal networks
Inferring directional couplings from the spike data of networks is desired in various scientific fields such as neuroscience. Here, we apply a recently proposed objective procedure to the spike data obtained from the Hodgkin-Huxley type models and in vitro neuronal networks cultured in a circular structure. As a result, we succeed in reconstructing synaptic connections accurately from the evoked activity as well as the spontaneous one. To obtain the results, we invent an analytic formula approximately implementing a method of screening relevant couplings. This significantly reduces the computational cost of the screening method employed in the proposed objective procedure, making it possible to treat large-size systems as in this study.
- North America > United States > New York (0.04)
- Europe > United Kingdom (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- (5 more...)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- (2 more...)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.92)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Mathematics of Computing (0.68)
- Information Technology > Data Science (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)