Tensor Network Models
Lower and Upper Bounds on the Pseudo-Dimension of Tensor Network Models
Tensor network methods have been a key ingredient of advances in condensed matter physics and have recently sparked interest in the machine learning community for their ability to compactly represent very high-dimensional objects. Tensor network methods can, for example, be used to efficiently learn linear models in exponentially large feature spaces [Stoudenmire and Schwab, 2016]. In this work, we derive upper and lower bounds on the VC dimension and pseudo-dimension of a large class of tensor network models for classification, regression and completion. Our upper bounds hold for linear models parameterized by arbitrary tensor network structures, and we derive lower bounds for common tensor decomposition models (CP, Tensor Train, Tensor Ring and Tucker) showing the tightness of our general upper bound. These results are used to derive a generalization bound which can be applied to classification with low-rank matrices as well as linear classifiers based on any of the commonly used tensor decomposition models. As a corollary of our results, we obtain a bound on the VC dimension of the matrix product state classifier introduced in [Stoudenmire and Schwab, 2016] as a function of the so-called bond dimension (i.e., the tensor train rank).
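Loosely speaking, bounds of this kind track the number of free parameters of the tensor network, up to logarithmic factors (a reading we assume here for illustration), and that count differs sharply across decompositions. A minimal sketch of the parameter counts (our own illustration, not code from the paper; the formulas assume a uniform mode size n and a uniform rank/bond dimension r):

```python
# Parameter counts for an order-d tensor with mode size n and rank r,
# compared with the n**d entries of the dense tensor.

def cp_params(d, n, r):      # CP: d factor matrices of shape (n, r)
    return d * n * r

def tt_params(d, n, r):      # Tensor Train / MPS: boundary cores (n, r), inner cores (r, n, r)
    return 2 * n * r + (d - 2) * n * r * r

def tucker_params(d, n, r):  # Tucker: core tensor with r**d entries plus d factor matrices (n, r)
    return r ** d + d * n * r

for d, n, r in [(10, 4, 2), (16, 4, 2)]:
    print(f"d={d} n={n} r={r}: CP={cp_params(d, n, r)}, TT={tt_params(d, n, r)}, "
          f"Tucker={tucker_params(d, n, r)}, dense={n ** d}")
```

The gap between these counts and the dense count n**d is what makes compact tensor network parameterizations, and hence parameter-based capacity bounds, informative.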
Entangling Machine Learning with Quantum Tensor Networks
van der Poel, Constantijn, Zhao, Dan
This paper examines the use of tensor networks, which can efficiently represent high-dimensional quantum states, in language modeling. It is a distillation and continuation of the work done in (van der Poel, 2023). To do so, we abstract the problem down to modeling Motzkin spin chains, which exhibit long-range correlations reminiscent of those found in language. The Matrix Product State (MPS), also known as the tensor train, has a bond dimension which scales with the length of the sequence it models. To combat this, we use the factored core MPS, whose bond dimension scales sub-linearly. We find that the tensor models reach near-perfect classification accuracy, and maintain a stable level of performance as the number of valid training examples is decreased.
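As a rough illustration of where the bond dimension enters, here is a minimal sketch of scoring a sequence with a plain MPS (our own toy example, not the authors' factored core model; the cores, boundary vectors, and example sequence are hypothetical random stand-ins). Motzkin chains use a 3-symbol alphabet (up, flat, down steps), and both the cost and the expressivity of the model are controlled by `bond`:

```python
import numpy as np

L, alphabet, bond = 12, 3, 8          # sequence length, symbol count, bond dimension
rng = np.random.default_rng(0)
cores = [rng.normal(size=(alphabet, bond, bond)) / np.sqrt(bond) for _ in range(L)]
left = np.ones(bond) / np.sqrt(bond)  # boundary vectors
right = np.ones(bond) / np.sqrt(bond)

def mps_score(seq):
    """Contract the left boundary through the symbol-selected core matrices."""
    v = left
    for core, s in zip(cores, seq):
        v = v @ core[s]               # (bond,) @ (bond, bond) -> (bond,)
    return v @ right

print(mps_score([0, 1, 2, 0, 1, 0, 2, 1, 0, 0, 1, 2]))
```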
Interaction Decompositions for Tensor Network Regression
Convy, Ian, Whaley, K. Birgitta
Tensor network regression has emerged as a promising and active area of machine learning research, having achieved impressive results on common benchmark tasks such as the MovieLens 100K [1], MNIST [2][3][4][5], and Fashion MNIST [3][4][5] datasets. The effectiveness of these models can be attributed to the tensor-product transformation that is applied to the data features, which maps the original feature vector into an exponentially large vector space. By performing linear operations on this expanded feature space, tensor network models are able to generate regression outputs that are highly non-linear functions of the original features. In most tensor network models, the tensor-product transformation is constructed from a set of vector-valued functions that each act on only a single data feature. The form of these functions is important to the operation of the model, as it determines how regression on the transformed space is related to regression on the original feature space. Conventional wisdom regarding the choice of these functions can be traced back to the parallel works of Stoudenmire and Schwab [2] and Novikov et al. [1], who each proposed a different transformation scheme.
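For concreteness, a small sketch of the two schemes referenced above: the trigonometric local map used by Stoudenmire and Schwab [2] and the polynomial map [1, x] of Novikov et al. [1], together with the tensor-product transformation they induce. This is a toy illustration; the full feature vector has length 2^d, so we only materialize it for tiny d:

```python
import numpy as np
from functools import reduce

def phi_trig(x):   # Stoudenmire & Schwab [2]: [cos(pi*x/2), sin(pi*x/2)]
    return np.array([np.cos(np.pi * x / 2), np.sin(np.pi * x / 2)])

def phi_poly(x):   # Novikov et al. [1]: [1, x]
    return np.array([1.0, x])

def tensor_product_features(x, phi):
    # Kronecker product of the d local maps -> vector of length 2**d
    return reduce(np.kron, [phi(xi) for xi in x])

x = np.array([0.2, 0.7, 0.5])
print(tensor_product_features(x, phi_trig).shape)   # (8,)
print(tensor_product_features(x, phi_poly))         # all monomials of x
```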
The Bayesian Method of Tensor Networks
Bayesian learning is a powerful framework which combines external information about the data (background knowledge) with internal information (training data) in a logically consistent way for inference and prediction. By Bayes' rule, the external information (the prior distribution) and the internal information (the training-data likelihood) are combined coherently, and the posterior distribution and the posterior predictive (marginal) distribution obtained from Bayes' rule summarize the total information needed for inference and prediction, respectively. In this paper, we study the Bayesian framework of the Tensor Network from two perspectives. First, we place a prior distribution over the weights of the Tensor Network and predict the labels of new observations with the posterior predictive (marginal) distribution. Because the parameter integral in the normalization constant is intractable, we approximate the posterior predictive distribution by the Laplace approximation and obtain the outer-product approximation of the Hessian matrix of the posterior distribution of the Tensor Network model. Second, to estimate the parameters of the stationary mode, we propose a stable initialization trick which accelerates inference, letting the Tensor Network converge to the stationary path more efficiently and stably under gradient descent. We verify our approach on the MNIST, Phishing Website and Breast Cancer data sets. We study the Bayesian properties of the Bayesian Tensor Network by visualizing the model parameters and decision boundaries on a two-dimensional synthetic data set. In practical terms, our method reduces overfitting and improves the performance of the standard Tensor Network model.
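A minimal sketch of the two ingredients named above, the Laplace approximation and the outer-product approximation of the Hessian, using plain logistic regression on synthetic data as a stand-in for the Tensor Network weights (our illustration, not the paper's model or code):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ np.array([1.0, -2.0, 0.5, 0.0, 1.0]) + rng.normal(size=200) > 0).astype(float)

w = np.zeros(5)
alpha = 1.0                                    # Gaussian prior precision
for _ in range(200):                           # MAP estimate by gradient descent
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * (X.T @ (p - y) + alpha * w) / len(y)

# Outer-product (generalized Gauss-Newton) approximation of the Hessian
# of the negative log posterior at the MAP estimate:
p = 1 / (1 + np.exp(-X @ w))
H = (X * (p * (1 - p))[:, None]).T @ X + alpha * np.eye(5)

def predictive(x_new):
    """Laplace posterior predictive via the standard probit-style approximation."""
    mu = x_new @ w
    var = x_new @ np.linalg.solve(H, x_new)    # predictive variance under Laplace
    return 1 / (1 + np.exp(-mu / np.sqrt(1 + np.pi * var / 8)))

print(predictive(X[0]))
```

The predictive variance term is what distinguishes this from a point estimate: inputs far from the training data get pushed toward probability 0.5, which is the mechanism by which the Bayesian treatment curbs overfitting.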
Entanglement and Tensor Networks for Supervised Image Classification
Martyn, John, Vidal, Guifre, Roberts, Chase, Leichenauer, Stefan
Tensor networks, originally designed to address computational problems in quantum many-body physics, have recently been applied to machine learning tasks. However, compared to quantum physics, where the reasons for the success of tensor network approaches over the last 30 years are well understood, very little is yet known about why these techniques work for machine learning. The goal of this paper is to investigate entanglement properties of tensor network models in a current machine learning application, in order to uncover general principles that may guide future developments. We revisit the use of tensor networks for supervised image classification using the MNIST data set of handwritten digits, as pioneered by Stoudenmire and Schwab [Adv. in Neur. Inform. Proc. Sys. 29, 4799 (2016)]. Firstly, we hypothesize about which state the tensor network might be learning during training. For that purpose, we propose a plausible candidate state $|\Sigma_{\ell}\rangle$ (built as a superposition of product states corresponding to images in the training set) and investigate its entanglement properties. We conclude that $|\Sigma_{\ell}\rangle$ is so robustly entangled that it cannot be approximated by the tensor network used in that work, which must therefore be representing a very different state. Secondly, we use tensor networks with a block product structure, in which entanglement is restricted within small blocks of $n \times n$ pixels/qubits. We find that these states are extremely expressive (e.g. training accuracy of $99.97 \%$ already for $n=2$), suggesting that long-range entanglement may not be essential for image classification. However, in our current implementation, optimization leads to over-fitting, resulting in test accuracies that are not competitive with other current approaches.
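The entanglement analysis above rests on a standard computation: the entanglement entropy across a bipartition, read off from the Schmidt (singular value) spectrum of the reshaped state vector. A minimal sketch on toy four-qubit states (our own illustration, not the paper's code):

```python
import numpy as np

def entanglement_entropy(psi, n_left, n_right):
    """Von Neumann entropy (in bits) of the left/right bipartition of psi."""
    M = psi.reshape(2 ** n_left, 2 ** n_right)
    s = np.linalg.svd(M, compute_uv=False)
    p = s ** 2 / np.sum(s ** 2)               # Schmidt spectrum
    p = p[p > 1e-12]
    return -np.sum(p * np.log2(p))

# |0000> is a product state; (|0000> + |1111>)/sqrt(2) is GHZ-like.
prod = np.zeros(16); prod[0] = 1.0
ghz = np.zeros(16); ghz[0] = ghz[-1] = 1 / np.sqrt(2)
print(entanglement_entropy(prod, 2, 2))  # ~0: no entanglement
print(entanglement_entropy(ghz, 2, 2))   # 1.0: one maximally entangled bit
```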
Tensor network approaches for learning non-linear dynamical laws
Goeßmann, A., Götte, M., Roth, I., Sweke, R., Kutyniok, G., Eisert, J.
Given observations of a physical system, identifying the underlying non-linear governing equation is a fundamental task, necessary both for gaining understanding and generating deterministic future predictions. Of most practical relevance are automated approaches to theory building that scale efficiently for complex systems with many degrees of freedom. To date, available scalable methods aim at a data-driven interpolation, without exploiting or offering insight into fundamental underlying physical principles, such as locality of interactions. In this work, we show that various physical constraints can be captured via tensor network based parameterizations for the governing equation, which naturally ensures scalability. In addition to providing analytic results motivating the use of such models for realistic physical systems, we demonstrate that efficient rank-adaptive optimization algorithms can be used to learn optimal tensor network models without requiring a priori knowledge of the exact tensor ranks. As such, we provide a physics-informed approach to recovering structured dynamical laws from data, which adaptively balances the need for expressivity and scalability.
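Rank-adaptive optimization of the kind mentioned above relies on a standard primitive: truncating a factorization to the smallest rank that meets an error tolerance, so ranks need not be fixed in advance. A minimal sketch of that primitive (our illustration, with an assumed relative Frobenius-norm criterion; not the authors' algorithm):

```python
import numpy as np

def adapt_rank(M, tol=1e-3):
    """Truncate the SVD of M to the smallest rank with relative error <= tol."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]   # tail[r] = ||s[r:]||_2
    tail = np.append(tail, 0.0)                     # keeping everything -> no error
    rank = max(1, int(np.argmax(tail <= tol * tail[0])))
    return U[:, :rank] * s[:rank], Vt[:rank]        # M ~= A @ B at the adapted rank

# A nearly rank-2 matrix is compressed to rank 2 automatically:
rng = np.random.default_rng(0)
M = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 40)) + 1e-6 * rng.normal(size=(50, 40))
A, B = adapt_rank(M)
print(A.shape[1], np.linalg.norm(M - A @ B) / np.linalg.norm(M))
```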