AITopics | visible state

Generative neural networks can produce data samples according to the statistical properties of their training distribution. This feature can be used to test modern computational neuroscience hypotheses suggesting that spontaneous brain activity is partially supported by top-down generative processing. A widely studied class of generative models is that of Restricted Boltzmann Machines (RBMs), which can be used as building blocks for unsupervised deep learning architectures. In this work, we systematically explore the generative dynamics of RBMs, characterizing the number of states visited during top-down sampling and investigating whether the heterogeneity of visited attractors could be increased by starting the generation process from biased hidden states. By considering an RBM trained on a classic dataset of handwritten digits, we show that the capacity to produce diverse data prototypes can be increased by initiating top-down sampling from chimera states, which encode high-level visual features of multiple digits. We also found that the model is not capable of transitioning between all possible digit states within a single generation trajectory, suggesting that the top-down dynamics is heavily constrained by the shape of the energy function.

artificial intelligence, digit, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2305.06745

Country:

Europe > Italy (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

Generative and discriminative training of Boltzmann machine through Quantum annealing

Srivastava, Siddhartha, Sundararaghavan, Veera

arXiv.org Artificial IntelligenceJul-19-2022

A hybrid quantum-classical method for learning Boltzmann machines (BM) for a generative and discriminative task is presented. Boltzmann machines are undirected graphs with a network of visible and hidden nodes where the former is used as the reading site while the latter is used to manipulate visible states' probability. In Generative BM, the samples of visible data imitate the probability distribution of a given data set. In contrast, the visible sites of discriminative BM are treated as Input/Output (I/O) reading sites where the conditional probability of output state is optimized for a given set of input states. The cost function for learning BM is defined as a weighted sum of Kullback-Leibler (KL) divergence and Negative conditional Log-Likelihood (NCLL), adjusted using a hyperparamter. Here, the KL Divergence is the cost for generative learning, and NCLL is the cost for discriminative learning. A Stochastic Newton-Raphson optimization scheme is presented. The gradients and the Hessians are approximated using direct samples of BM obtained through Quantum annealing (QA). Quantum annealers are hardware representing the physics of the Ising model that operates on low but finite temperature. This temperature affects the probability distribution of the BM; however, its value is unknown. Previous efforts have focused on estimating this unknown temperature through regression of theoretical Boltzmann energies of sampled states with the probability of states sampled by the actual hardware. This assumes that the control parameter change does not affect the system temperature, however, this is not usually the case. Instead, an approach that works on the probability distribution of samples, instead of the energies, is proposed to estimate the optimal parameter set. This ensures that the optimal set can be obtained from a single run.

artificial intelligence, boltzmann machine, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2002.00792

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Discovering Sparse Interpretable Dynamics from Partial Observations

Lu, Peter Y., Ariño, Joan, Soljačić, Marin

arXiv.org Artificial IntelligenceDec-15-2021

Identifying the governing equations of a nonlinear dynamical system is key to both understanding the physical features of the system and constructing an accurate model of the dynamics that generalizes well beyond the available data. We propose a machine learning framework for discovering these governing equations using only partial observations, combining an encoder for state reconstruction with a sparse symbolic model. Our tests show that this method can successfully reconstruct the full system state and identify the underlying dynamics for a variety of ODE and PDE systems.

equation, reconstruction, symbolic model, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1038/s42005-022-00987-z

2107.10879

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report (0.50)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

Convolutional Bipartite Attractor Networks

Iuzzolino, Michael, Singer, Yoram, Mozer, Michael C.

arXiv.org Machine LearningJun-14-2019

In human perception and cognition, the fundamental operation that brains perform is interpretation: constructing coherent neural states from noisy, incomplete, and intrinsically ambiguous evidence. The problem of interpretation is well matched to an early and often overlooked architecture, the attractor network---a recurrent neural network that performs constraint satisfaction, imputation of missing features, and clean up of noisy data via energy minimization dynamics. We revisit attractor nets in light of modern deep learning methods, and propose a convolutional bipartite architecture with a novel training loss, activation function, and connectivity constraints. We tackle problems much larger than have been previously explored with attractor nets and demonstrate their potential for image denoising, completion, and super-resolution. We argue that this architecture is better motivated than ever-deeper feedforward models and is a viable alternative to more costly sampling-based methods on a range of supervised and unsupervised tasks.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

1906.03504

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.46)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

From Monte Carlo to Las Vegas: Improving Restricted Boltzmann Machine Training Through Stopping Sets

Savarese, Pedro H. P. (Toyota Technical Institute at Chicago) | Kakodkar, Mayank (Purdue University, West Lafayette, IN) | Ribeiro, Bruno (Purdue University, West Lafayette, IN)

AAAI ConferencesFeb-8-2018

We propose a Las Vegas transformation of Markov Chain Monte Carlo (MCMC) estimators of Restricted Boltzmann Machines (RBMs). We denote our approach Markov Chain Las Vegas (MCLV). MCLV gives statistical guarantees in exchange for random running times. MCLV uses a stopping set built from the training data and has maximum number of Markov chain steps K (referred as MCLV-K). We present a MCLV-K gradient estimator (LVS-K) for RBMs and explore the correspondence and differences between LVS-K and Contrastive Divergence (CD-K), with LVS-K significantly outperforming CD-K training RBMs over the MNIST dataset, indicating MCLV to be a promising direction in learning generative models.

markov chain, rbm, tours, (16 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.82)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
(2 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

From Monte Carlo to Las Vegas: Improving Restricted Boltzmann Machine Training Through Stopping Sets

Savarese, Pedro H. P., Kakodkar, Mayank, Ribeiro, Bruno

arXiv.org Machine LearningNov-22-2017

We propose a Las Vegas transformation of Markov Chain Monte Carlo (MCMC) estimators of Restricted Boltzmann Machines (RBMs). We denote our approach Markov Chain Las Vegas (MCLV). MCLV gives statistical guarantees in exchange for random running times. MCLV uses a stopping set built from the training data and has maximum number of Markov chain steps K (referred as MCLV-K). We present a MCLV-K gradient estimator (LVS-K) for RBMs and explore the correspondence and differences between LVS-K and Contrastive Divergence (CD-K), with LVS-K significantly outperforming CD-K training RBMs over the MNIST dataset, indicating MCLV to be a promising direction in learning generative models.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

1711.08442

Country: North America > United States > Nevada > Clark County > Las Vegas (0.82)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

(Yet) Another Theoretical Model of Thinking

Virie, Patrick

arXiv.org Artificial IntelligenceApr-17-2017

This paper presents a theoretical, idealized model of the thinking process with the following characteristics: 1) the model can produce complex thought sequences and can be generalized to new inputs, 2) it can receive and maintain input information indefinitely for the generation of thoughts and later use, and 3) it supports learning while executing. The crux of the model lies within the concept of internal consistency, or the generated thoughts should always be consistent with the inputs from which they are created. Its merit, apart from the capability to generate new creative thoughts from an internal mechanism, depends on the potential to help training to generalize better. This is consequently enabled by separating input information into several parts to be handled by different processing components with a focus mechanism to fetch information for each. This modularized view with the focus binds the model with the computationally capable Turing machines. And as a final remark, this paper constructively shows that the computational complexity of the model is at least, if not surpass, that of a universal Turing machine.

artificial intelligence, machine learning, visible state, (19 more...)

arXiv.org Artificial Intelligence

1511.02455

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Modeling Human Motion Using Binary Latent Variables

Taylor, Graham W., Hinton, Geoffrey E., Roweis, Sam T.

Neural Information Processing SystemsDec-31-2007

We propose a nonlinear generative model for human motion data that uses an undirected model with binary latent variables and real-valued "visible" variables that represent joint angles. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. Such an architecture makes online inference efficient and allows us to use a simple approximate learning procedure. After training, the model finds a single set of parameters that simultaneously capture several different kinds of motion. We demonstrate the power of our approach by synthesizing various motion sequences and by performing online filling in of data lost during motion capture.

sequence, transition, visible unit, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Modeling Human Motion Using Binary Latent Variables

Taylor, Graham W., Hinton, Geoffrey E., Roweis, Sam T.

Neural Information Processing SystemsDec-31-2007

We propose a nonlinear generative model for human motion data that uses an undirected model with binary latent variables and real-valued "visible" variables that represent joint angles. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. Such an architecture makes online inference efficient and allows us to use a simple approximate learning procedure. After training, the model finds a single set of parameters that simultaneously capture several different kinds of motion. We demonstrate the power of our approach by synthesizing various motion sequences and by performing online filling in of data lost during motion capture.

sequence, transition, visible unit, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Modeling Human Motion Using Binary Latent Variables

Taylor, Graham W., Hinton, Geoffrey E., Roweis, Sam T.

Neural Information Processing SystemsDec-31-2007

The latent and Visible variables at each time step receive directed connections from the Visible variables at the last few time-steps. Such an architecture makes online inference efficient and allows us to use a simple approximate learning procedure.

artificial intelligence, machine learning, sequence, (17 more...)

Neural Information Processing Systems

Country: