Building the Bridge of Schrödinger: A Continuous Entropic Optimal Transport Benchmark

Neural Information Processing Systems

Over the last several years, there has been significant progress in developing neural solvers for the Schrödinger Bridge (SB) problem and applying them to generative modelling. This new research field is justifiably fruitful, as it is interconnected with the practically well-performing diffusion models and the theoretically grounded entropic optimal transport (EOT). Still, the area lacks non-trivial tests that would allow a researcher to understand how well the methods solve SB or its equivalent continuous EOT problem. We fill this gap and propose a novel way to create pairs of probability distributions for which the ground-truth OT solution is known by construction. Our methodology is generic and works for a wide range of OT formulations; in particular, it covers EOT, which is equivalent to SB (the main interest of our study). This development allows us to create continuous benchmark distributions with known EOT and SB solutions on high-dimensional spaces such as spaces of images. As an illustration, we use these benchmark pairs to test how well existing neural EOT/SB solvers actually compute the EOT solution.


Schrödinger's FP: Dynamic Adaptation of Floating-Point Containers for Deep Learning Training

Nikolić, Miloš, Sanchez, Enrique Torres, Wang, Jiahui, Zadeh, Ali Hadi, Mahmoud, Mostafa, Abdelhadi, Ameer, Ibrahim, Kareem, Moshovos, Andreas

arXiv.org Artificial Intelligence

The transfer of tensors from/to memory during neural network training dominates time and energy. To improve energy efficiency and performance, research has been exploring ways to use narrower data representations. So far, these attempts relied on user-directed trial-and-error to achieve convergence. We present methods that relieve users of this responsibility. Our methods dynamically adjust the size and format of the floating-point containers used for activations and weights during training, achieving adaptivity across three dimensions: i) which datatype to use, ii) on which tensor, and iii) how it changes over time. The different meanings and distributions of exponents and mantissas lead us to tailored approaches for each. We present two lossy pairs of methods to eliminate as many mantissa and exponent bits as possible without affecting accuracy. Quantum Mantissa and Quantum Exponent are machine learning compression methods that tap into the gradient descent algorithm to learn the minimal mantissa and exponent bitlengths at a per-layer granularity. They automatically learn that many tensors can use just 1 or 2 mantissa bits and 3 or 4 exponent bits. Overall, the two machine learning methods reduce the footprint by 4.74×. Alternatively, BitWave observes changes in the loss function during training to adjust mantissa and exponent bitlengths network-wide, yielding a 3.19× reduction in footprint. Finally, we present an optional method, Gecko, to exploit the naturally emerging, lopsided exponent distribution to losslessly compress the exponents produced by Quantum Exponent or BitWave and, on average, improve the compression rates to 5.64× and 4.56×.
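
The core container-shrinking operation can be illustrated without the paper's learning machinery: a minimal sketch (my own illustration, not the authors' implementation) of rounding float32 values down to k explicit mantissa bits, the kind of per-tensor truncation whose bitlength the described methods learn during training:

```python
import numpy as np

def truncate_mantissa(x: np.ndarray, bits: int) -> np.ndarray:
    """Round float32 values to keep only `bits` explicit mantissa bits
    (float32 normally carries 23). Round-to-nearest via integer arithmetic
    on the raw bit pattern; assumes finite inputs."""
    assert 0 <= bits <= 23
    u = x.astype(np.float32).view(np.uint32)
    drop = 23 - bits
    if drop:
        # add half an ulp of the kept precision, then zero the dropped bits;
        # a carry out of the mantissa correctly bumps the exponent
        u = ((u + (1 << (drop - 1))) >> drop) << drop
    return u.view(np.float32)

w = np.array([0.1234567, -3.1415927, 42.0], dtype=np.float32)
coarse = truncate_mantissa(w, 2)  # coarse values: 0.125, -3.0, 40.0
```

With only 2 mantissa bits each value snaps to the nearest representable coarse float, which is why aggressive truncation is lossy but cheap: it is pure bit masking, with no per-value tables.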


Generative Diffusion From An Action Principle

Premkumar, Akhil

arXiv.org Artificial Intelligence

The field of Generative Artificial Intelligence has witnessed remarkable progress in recent years, fueled by the advent of novel deep learning techniques. Among these advancements, diffusion-based models have emerged as a promising paradigm for generating high-quality, high-dimensional, diverse, and coherent data samples. These models leverage principles from non-equilibrium statistical mechanics to effectively reconstruct the underlying probability distribution from which a training data set was sampled. The central idea behind diffusion models is reverse diffusion. These models gradually add noise to a given data set and observe how the data vectors evolve over time.
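
The "gradually add noise" process the abstract refers to can be sketched in a few lines. This is a generic forward-diffusion illustration (my own toy example, not the paper's action-principle formulation): each step mixes the data with Gaussian noise, x_t = sqrt(1-β_t) x_{t-1} + sqrt(β_t) ε, and after many steps the signal is essentially destroyed:

```python
import numpy as np

def forward_diffuse(x0: np.ndarray, betas: np.ndarray, rng=None) -> np.ndarray:
    """Run the forward noising chain x_t = sqrt(1-b_t) x_{t-1} + sqrt(b_t) eps."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = x0.copy()
    for beta in betas:
        eps = rng.standard_normal(x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
    return x

x0 = np.ones(1000)             # toy "data": every sample equal to 1
betas = np.full(200, 0.02)     # small, constant noise schedule
xT = forward_diffuse(x0, betas)
# after 200 steps the distribution is close to a standard Gaussian:
print(xT.mean(), xT.std())
```

Generative modelling then amounts to learning the reverse of this chain, i.e. recovering samples of x_0 from the near-Gaussian x_T.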


Data Assimilation for Sign-indefinite Priors: A generalization of Sinkhorn's algorithm

Dong, Anqi, Georgiou, Tryphon T., Tannenbaum, Allen

arXiv.org Machine Learning

The purpose of this work is to develop a framework to calibrate signed datasets so as to be consistent with specified marginals by suitably extending the Schrödinger-Fortet-Sinkhorn paradigm. Specifically, we seek to revise sign-indefinite multi-dimensional arrays so that the updated values agree with specified marginals. Our approach follows the rationale in Schrödinger's problem, aimed at updating a "prior" probability measure to agree with marginal distributions. The celebrated Sinkhorn's algorithm (established earlier by R. Fortet) that solves Schrödinger's problem found early applications in calibrating contingency tables in statistics and, more recently, multi-marginal problems in machine learning and optimal transport. Herein, we postulate a sign-indefinite prior in the form of a multi-dimensional array, and propose an optimization problem to suitably update this prior to ensure consistency with given marginals. The resulting algorithm generalizes the Sinkhorn algorithm in that it amounts to iterative scaling of the entries of the array along different coordinate directions. The scaling is multiplicative but also, in contrast to Sinkhorn, inverse-multiplicative depending on the sign of the entries. Our algorithm reduces to the classical Sinkhorn algorithm when the entries of the prior are positive.
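
For reference, the classical positive case that the proposed algorithm reduces to is short enough to sketch. This is the standard Sinkhorn iteration (not the paper's sign-indefinite generalization): alternately rescale rows and columns of a strictly positive prior array until both marginals match:

```python
import numpy as np

def sinkhorn(K: np.ndarray, r: np.ndarray, c: np.ndarray, iters: int = 500):
    """Classical Sinkhorn scaling: find positive vectors u, v such that
    diag(u) @ K @ diag(v) has row sums r and column sums c."""
    u = np.ones_like(r)
    v = np.ones_like(c)
    for _ in range(iters):
        u = r / (K @ v)      # match row marginals
        v = c / (K.T @ u)    # match column marginals
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(1)
K = rng.random((3, 4)) + 0.1           # strictly positive prior array
r = np.array([0.2, 0.3, 0.5])          # target row marginal
c = np.array([0.25, 0.25, 0.25, 0.25]) # target column marginal
P = sinkhorn(K, r, c)
print(P.sum(axis=1), P.sum(axis=0))    # approximately r and c
```

The paper's contribution is what replaces the purely multiplicative updates `u` and `v` when entries of `K` may be negative: scaling becomes inverse-multiplicative on negative entries.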


Machine learning for discovering laws of nature

Xin, Lizhi, Xin, Kevin, Xin, Houwen

arXiv.org Artificial Intelligence

Based on Darwin's natural selection, we developed "machine scientists" to discover the laws of nature by learning from raw data. "Machine scientists" construct physical theories by applying a logic tree (state Decision Tree) and a value tree (observation Function Tree); the logic tree determines the state of the entity, and the value tree determines the absolute value between two observations of the entity. A logic tree and a value tree together can reconstruct an entity's trajectory and make predictions about its future outcomes. Our proposed algorithmic model has an emphasis on machine learning: the "machine scientists" build up their experience by being rewarded or punished for each decision they make, eventually leading to the rediscovery of Newton's equation (classical physics) and Born's rule (quantum mechanics).


Hybrid Ground-State Quantum Algorithms based on Neural Schrödinger Forging

de Schoulepnikoff, Paulin, Kiss, Oriel, Vallecorsa, Sofia, Carleo, Giuseppe, Grossi, Michele

arXiv.org Artificial Intelligence

Entanglement-forging-based variational algorithms leverage the bipartition of quantum systems for addressing ground-state problems. The primary limitation of these approaches lies in the exponential summation required over the numerous potential basis states, or bitstrings, when performing the Schmidt decomposition of the whole system. To overcome this challenge, we propose a new method for entanglement forging employing generative neural networks to identify the most pertinent bitstrings, eliminating the need for the exponential sum. Through empirical demonstrations on systems of increasing complexity, we show that the proposed algorithm achieves performance comparable or superior to the existing standard implementation of entanglement forging. Moreover, by controlling the amount of required resources, this scheme can be applied to larger systems, as well as to non-permutation-invariant systems, where the latter constraint is associated with the Heisenberg forging procedure. We substantiate our findings through numerical simulations conducted on spin models exhibiting one-dimensional ring and two-dimensional triangular-lattice topologies, and on nuclear shell-model configurations.
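
The Schmidt decomposition at the heart of entanglement forging is, for a pure state, just a singular value decomposition of the reshaped state vector. A minimal toy illustration (a two-qubit Bell state, not the paper's neural method):

```python
import numpy as np

def schmidt_coefficients(psi: np.ndarray, dim_a: int, dim_b: int) -> np.ndarray:
    """Schmidt coefficients of a bipartite pure state: reshape the state
    vector into a dim_a x dim_b matrix and take its singular values."""
    m = psi.reshape(dim_a, dim_b)
    return np.linalg.svd(m, compute_uv=False)

# Bell state (|00> + |11>) / sqrt(2): two equal Schmidt coefficients,
# i.e. a maximally entangled two-qubit state
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
print(schmidt_coefficients(bell, 2, 2))  # [0.70710678 0.70710678]
```

The exponential cost the abstract mentions arises because, for n-qubit halves, the sum runs over up to 2^(n/2) such terms; the proposed generative networks select only the pertinent bitstrings in that sum.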


Product Manager, Machine Learning Applications at Schrödinger - New York

#artificialintelligence

As a member of the Machine Learning team, you'll work with both methods researchers and small-molecule designers to imagine and design user experiences that leverage machine learning methods. This position offers the opportunity to influence Schrödinger's business direction and scientific functionality by bridging gaps between technical, scientific, and commercial realms.


Quantum Complexity Tamed by Machine Learning

#artificialintelligence

In 2018, climate simulations were the third-largest use of computing cycles at a leading U.S. supercomputing cluster. The study of quarks and other subatomic particles came in second. Topping the list was the most heavily cited idea in the physical sciences -- though few have ever heard of it. "It's ridiculously important," said Kieron Burke, a theoretical chemist at the University of California, Irvine. Science's best-kept secret goes by the name of density functional theory (DFT), and it is the chief method physicists and chemists use to understand just about anything more complicated than a hydrogen atom.


AI Pharma Deals: Bayer and AI Startups

#artificialintelligence

So far, the pharmaceutical industry has contributed more to the well-being of humanity than any other industry. But lately its business model has been under significant pressure: the return on R&D investment has dropped to its lowest level in decades (lack of innovation amid digital disruption, rapid technological advances, and other issues such as poor data reproducibility), and its public reputation in the US and around the world (the anti-vaccine movement in Europe) is worse than ever. This worrisome mix of little growth potential and low reputation is the main reason why investors are increasingly worried, not to mention that the current drug development process needs a big dose of digital innovation to deal with its messy data. As a matter of fact, Stefan Oelrich, member of the Board of Management of Bayer AG and President of Pharmaceuticals, wrote the following in an article whose title perfectly summarises the AI pharma situation, "Artificial Intelligence - When we Suddenly Know What we Don't Know": "As we open the first doors in this unknown land we start to discover how much more is out there for our entire pharmaceutical value chain spanning from research to product supply. I expect AI to help us know what we have not known so far. Artificial Intelligence will become instrumental in our search for new medicines to better serve patients around the world as we leverage Science For A Better Life".