AITopics | Coles, Patrick J.

Collaborating Authors

Coles, Patrick J.

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Scalable Thermodynamic Second-order Optimization

Donatella, Kaelan, Duffield, Samuel, Melanson, Denis, Aifer, Maxwell, Klett, Phoebe, Salegame, Rajath, Belateche, Zach, Crooks, Gavin, Martinez, Antonio J., Coles, Patrick J.

arXiv.org Artificial IntelligenceFeb-12-2025

Many hardware proposals have aimed to accelerate inference in AI workloads. Less attention has been paid to hardware acceleration of training, despite the enormous societal impact of rapid training of AI models. Physics-based computers, such as thermodynamic computers, offer an efficient means to solve key primitives in AI training algorithms. Optimizers that normally would be computationally out-of-reach (e.g., due to expensive matrix inversions) on digital hardware could be unlocked with physics-based hardware. In this work, we propose a scalable algorithm for employing thermodynamic computers to accelerate a popular second-order optimizer called Kronecker-factored approximate curvature (K-FAC). Our asymptotic complexity analysis predicts increasing advantage with our algorithm as $n$, the number of neurons per layer, increases. Numerical experiments show that even under significant quantization noise, the benefits of second-order optimization can be preserved. Finally, we predict substantial speedups for large-scale vision and graph problems based on realistic hardware characteristics.

artificial intelligence, machine learning, matrix, (19 more...)

arXiv.org Artificial Intelligence

2502.08603

Country: North America > United States (0.14)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Thermodynamic Bayesian Inference

Aifer, Maxwell, Duffield, Samuel, Donatella, Kaelan, Melanson, Denis, Klett, Phoebe, Belateche, Zach, Crooks, Gavin, Martinez, Antonio J., Coles, Patrick J.

arXiv.org Artificial IntelligenceOct-2-2024

A fully Bayesian treatment of complicated predictive models (such as deep neural networks) would enable rigorous uncertainty quantification and the automation of higher-level tasks including model selection. However, the intractability of sampling Bayesian posteriors over many parameters inhibits the use of Bayesian methods where they are most needed. Thermodynamic computing has emerged as a paradigm for accelerating operations used in machine learning, such as matrix inversion, and is based on the mapping of Langevin equations to the dynamics of noisy physical systems. Hence, it is natural to consider the implementation of Langevin sampling algorithms on thermodynamic devices. In this work we propose electronic analog devices that sample from Bayesian posteriors by realizing Langevin dynamics physically. Circuit designs are given for sampling the posterior of a Gaussian-Gaussian model and for Bayesian logistic regression, and are validated by simulations. It is shown, under reasonable assumptions, that the Bayesian posteriors for these models can be sampled in time scaling with $\ln(d)$, where $d$ is dimension. For the Gaussian-Gaussian model, the energy cost is shown to scale with $ d \ln(d)$. These results highlight the potential for fast, energy-efficient Bayesian inference using thermodynamic computing.

artificial intelligence, machine learning, posterior, (15 more...)

arXiv.org Artificial Intelligence

2410.01793

Country:

North America > United States > New York (0.14)
Asia > Japan > Honshū > Kantō (0.14)

Genre:

Research Report > New Finding (0.50)
Research Report > Experimental Study (0.36)

Industry:

Health & Medicine (0.46)
Semiconductors & Electronics (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Thermodynamic Natural Gradient Descent

Donatella, Kaelan, Duffield, Samuel, Aifer, Maxwell, Melanson, Denis, Crooks, Gavin, Coles, Patrick J.

arXiv.org Artificial IntelligenceMay-22-2024

Second-order training methods have better convergence properties than gradient descent but are rarely used in practice for large-scale training due to their computational overhead. This can be viewed as a hardware limitation (imposed by digital computers). Here we show that natural gradient descent (NGD), a second-order method, can have a similar computational complexity per iteration to a first-order method, when employing appropriate hardware. We present a new hybrid digital-analog algorithm for training neural networks that is equivalent to NGD in a certain parameter regime but avoids prohibitively costly linear system solves. Our algorithm exploits the thermodynamic properties of an analog system at equilibrium, and hence requires an analog thermodynamic computer. The training occurs in a hybrid digital-analog loop, where the gradient and Fisher information matrix (or any other positive semi-definite curvature matrix) are calculated at given time intervals while the analog dynamics take place. We numerically demonstrate the superiority of this approach over state-of-the-art digital first- and second-order training methods on classification tasks and language model fine-tuning tasks.

artificial intelligence, iteration, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2405.13817

Country: North America (0.14)

Genre: Research Report (0.42)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

A Review of Barren Plateaus in Variational Quantum Computing

Larocca, Martin, Thanasilp, Supanut, Wang, Samson, Sharma, Kunal, Biamonte, Jacob, Coles, Patrick J., Cincio, Lukasz, McClean, Jarrod R., Holmes, Zoë, Cerezo, M.

arXiv.org Machine LearningMay-1-2024

Variational quantum computing offers a flexible computational paradigm with applications in diverse areas. However, a key obstacle to realizing their potential is the Barren Plateau (BP) phenomenon. When a model exhibits a BP, its parameter optimization landscape becomes exponentially flat and featureless as the problem size increases. Importantly, all the moving pieces of an algorithm -- choices of ansatz, initial state, observable, loss function and hardware noise -- can lead to BPs when ill-suited. Due to the significant impact of BPs on trainability, researchers have dedicated considerable effort to develop theoretical and heuristic methods to understand and mitigate their effects. As a result, the study of BPs has become a thriving area of research, influencing and cross-fertilizing other fields such as quantum optimal control, tensor networks, and learning theory. This article provides a comprehensive review of the current understanding of the BP phenomenon.

artificial intelligence, machine learning, preprint arxiv, (16 more...)

arXiv.org Machine Learning

2405.00781

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Energy (1.00)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Can Error Mitigation Improve Trainability of Noisy Variational Quantum Algorithms?

Wang, Samson, Czarnik, Piotr, Arrasmith, Andrew, Cerezo, M., Cincio, Lukasz, Coles, Patrick J.

arXiv.org Artificial IntelligenceMar-7-2024

Variational Quantum Algorithms (VQAs) are often viewed as the best hope for near-term quantum advantage. However, recent studies have shown that noise can severely limit the trainability of VQAs, e.g., by exponentially flattening the cost landscape and suppressing the magnitudes of cost gradients. Error Mitigation (EM) shows promise in reducing the impact of noise on near-term devices. Thus, it is natural to ask whether EM can improve the trainability of VQAs. In this work, we first show that, for a broad class of EM strategies, exponential cost concentration cannot be resolved without committing exponential resources elsewhere. This class of strategies includes as special cases Zero Noise Extrapolation, Virtual Distillation, Probabilistic Error Cancellation, and Clifford Data Regression. Second, we perform analytical and numerical analysis of these EM protocols, and we find that some of them (e.g., Virtual Distillation) can make it harder to resolve cost function values compared to running no EM at all. As a positive result, we do find numerical evidence that Clifford Data Regression (CDR) can aid the training process in certain settings where cost concentration is not too severe. Our results show that care should be taken in applying EM protocols as they can either worsen or not improve trainability. On the other hand, our positive results for CDR highlight the possibility of engineering error mitigation methods to improve trainability.

artificial intelligence, machine learning, resolvability, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.22331/q-2024-03-14-1287

2109.01051

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report > New Finding (0.54)

Industry: Energy (0.67)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback

Noise-Induced Barren Plateaus in Variational Quantum Algorithms

Wang, Samson, Fontana, Enrico, Cerezo, M., Sharma, Kunal, Sone, Akira, Cincio, Lukasz, Coles, Patrick J.

arXiv.org Artificial IntelligenceMar-1-2024

Variational Quantum Algorithms (VQAs) may be a path to quantum advantage on Noisy Intermediate-Scale Quantum (NISQ) computers. A natural question is whether noise on NISQ devices places fundamental limitations on VQA performance. We rigorously prove a serious limitation for noisy VQAs, in that the noise causes the training landscape to have a barren plateau (i.e., vanishing gradient). Specifically, for the local Pauli noise considered, we prove that the gradient vanishes exponentially in the number of qubits $n$ if the depth of the ansatz grows linearly with $n$. These noise-induced barren plateaus (NIBPs) are conceptually different from noise-free barren plateaus, which are linked to random parameter initialization. Our result is formulated for a generic ansatz that includes as special cases the Quantum Alternating Operator Ansatz and the Unitary Coupled Cluster Ansatz, among others. For the former, our numerical heuristics demonstrate the NIBP phenomenon for a realistic hardware noise model.

artificial intelligence, machine learning, noise, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1038/s41467-021-27045-6

2007.14384

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Thermodynamic Computing System for AI Applications

Melanson, Denis, Khater, Mohammad Abu, Aifer, Maxwell, Donatella, Kaelan, Gordon, Max Hunter, Ahle, Thomas, Crooks, Gavin, Martinez, Antonio J., Sbahi, Faris, Coles, Patrick J.

arXiv.org Artificial IntelligenceDec-8-2023

Recent breakthroughs in artificial intelligence (AI) algorithms have highlighted the need for novel computing hardware in order to truly unlock the potential for AI. Physics-based hardware, such as thermodynamic computing, has the potential to provide a fast, low-power means to accelerate AI primitives, especially generative AI and probabilistic AI. In this work, we present the first continuous-variable thermodynamic computer, which we call the stochastic processing unit (SPU). Our SPU is composed of RLC circuits, as unit cells, on a printed circuit board, with 8 unit cells that are all-to-all coupled via switched capacitances. It can be used for either sampling or linear algebra primitives, and we demonstrate Gaussian sampling and matrix inversion on our hardware. The latter represents the first thermodynamic linear algebra experiment. We also illustrate the applicability of the SPU to uncertainty quantification for neural network classification. We envision that this hardware, when scaled up in size, will have significant impact on accelerating various probabilistic AI applications.

artificial intelligence, machine learning, matrix, (19 more...)

arXiv.org Artificial Intelligence

2312.04836

Country:

North America > United States > New York (0.14)
Asia > Japan > Honshū > Kantō (0.14)

Genre: Research Report (0.50)

Industry: Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Inference-Based Quantum Sensing

Alderete, C. Huerta, Gordon, Max Hunter, Sauvage, Frederic, Sone, Akira, Sornborger, Andrew T., Coles, Patrick J., Cerezo, M.

arXiv.org Artificial IntelligenceAug-4-2023

In a standard Quantum Sensing (QS) task one aims at estimating an unknown parameter $\theta$, encoded into an $n$-qubit probe state, via measurements of the system. The success of this task hinges on the ability to correlate changes in the parameter to changes in the system response $\mathcal{R}(\theta)$ (i.e., changes in the measurement outcomes). For simple cases the form of $\mathcal{R}(\theta)$ is known, but the same cannot be said for realistic scenarios, as no general closed-form expression exists. In this work we present an inference-based scheme for QS. We show that, for a general class of unitary families of encoding, $\mathcal{R}(\theta)$ can be fully characterized by only measuring the system response at $2n+1$ parameters. This allows us to infer the value of an unknown parameter given the measured response, as well as to determine the sensitivity of the scheme, which characterizes its overall performance. We show that inference error is, with high probability, smaller than $\delta$, if one measures the system response with a number of shots that scales only as $\Omega(\log^3(n)/\delta^2)$. Furthermore, the framework presented can be broadly applied as it remains valid for arbitrary probe states and measurement schemes, and, even holds in the presence of quantum noise. We also discuss how to extend our results beyond unitary families. Finally, to showcase our method we implement it for a QS task on real quantum hardware, and in numerical simulations.

artificial intelligence, machine learning, response function, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1103/PhysRevLett.129.190501

2206.09919

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (0.68)
Government > Regional Government (0.46)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Resource frugal optimizer for quantum machine learning

Moussa, Charles, Gordon, Max Hunter, Baczyk, Michal, Cerezo, M., Cincio, Lukasz, Coles, Patrick J.

arXiv.org Artificial IntelligenceJul-28-2023

Quantum-enhanced data science, also known as quantum machine learning (QML), is of growing interest as an application of near-term quantum computers. Variational QML algorithms have the potential to solve practical problems on real hardware, particularly when involving quantum data. However, training these algorithms can be challenging and calls for tailored optimization procedures. Specifically, QML applications can require a large shot-count overhead due to the large datasets involved. In this work, we advocate for simultaneous random sampling over both the dataset as well as the measurement operators that define the loss function. We consider a highly general loss function that encompasses many QML applications, and we show how to construct an unbiased estimator of its gradient. This allows us to propose a shot-frugal gradient descent optimizer called Refoqus (REsource Frugal Optimizer for QUantum Stochastic gradient descent). Our numerics indicate that Refoqus can save several orders of magnitude in shot cost, even relative to optimizers that sample over measurement operators alone.

artificial intelligence, estimator, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/2058-9565/acef55

2211.04965

Country:

Europe (1.00)
North America > United States > New Mexico (0.14)
North America > United States > New York (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.89)

Add feedback

Out-of-distribution generalization for learning quantum dynamics

Caro, Matthias C., Huang, Hsin-Yuan, Ezzell, Nicholas, Gibbs, Joe, Sornborger, Andrew T., Cincio, Lukasz, Coles, Patrick J., Holmes, Zoë

arXiv.org Artificial IntelligenceJul-9-2023

Generalization bounds are a critical tool to assess the training data requirements of Quantum Machine Learning (QML). Recent work has established guarantees for in-distribution generalization of quantum neural networks (QNNs), where training and testing data are drawn from the same data distribution. However, there are currently no results on out-of-distribution generalization in QML, where we require a trained model to perform well even on data drawn from a different distribution to the training distribution. Here, we prove out-of-distribution generalization for the task of learning an unknown unitary. In particular, we show that one can learn the action of a unitary on entangled states having trained only product states. Since product states can be prepared using only single-qubit gates, this advances the prospects of learning quantum dynamics on near term quantum hardware, and further opens up new methods for both the classical and quantum compilation of quantum circuits.

artificial intelligence, ensemble, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1038/s41467-023-39381-w

2204.10268

Country:

Europe (1.00)
North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.67)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback