AITopics

2205.02561

Country: Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Can RBMs be trained with zero step contrastive divergence?

Fisher, Charles K.

Unlearn.AI, Inc., 75 Hawthorne St. Ste 560, San Francisco, CA 94105 (Dated: November 7, 2022) Restricted Boltzmann Machines (RBMs) are probabilistic generative models that can be trained by maximum likelihood in principle, but are usually trained by an approximate algorithm called Contrastive Divergence (CD) in practice. In general, a CD-k algorithm estimates an average with respect to the model distribution using a sample obtained from a k-step Markov Chain Monte Carlo Algorithm (e.g., block Gibbs sampling) starting from some initial configuration. Choices of k typically vary from 1 to 100. This technical report explores if it's possible to leverage a simple approximate sampling algorithm with a modified version of CD in order to train an RBM with k=0. As usual, the method is illustrated on MNIST.

artificial intelligence, machine learning, rbm, (18 more...)

2211.02174

Country: North America > United States > California > San Francisco County > San Francisco (0.54)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.74)
(2 more...)

Müller, Johannes, Montúfar, Guido

Geometry and convergence of natural policy gradient methods

We study the convergence of several natural policy gradient (NPG) methods in infinite-horizon discounted Markov decision processes with regular policy parametrizations. For a variety of NPGs and reward functions we show that the trajectories in state-action space are solutions of gradient flows with respect to Hessian geometries, based on which we obtain global convergence guarantees and convergence rates. In particular, we show linear convergence for unregularized and regularized NPG flows with the metrics proposed by Kakade and Morimura and co-authors by observing that these arise from the Hessian geometries of conditional entropy and entropy respectively. Further, we obtain sublinear convergence rates for Hessian geometries arising from other convex functions like log-barriers. Finally, we interpret the discrete-time NPG methods with regularized rewards as inexact Newton methods if the NPG is defined with respect to the Hessian geometry of the regularizer. This yields local quadratic convergence rates of these methods for step size equal to the penalization strength.

artificial intelligence, geometry, machine learning, (16 more...)

2211.02105

Country:

Asia > Middle East > Lebanon (0.04)
Europe > Germany > Saxony > Leipzig (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process

Shi, Chengchun, Zhu, Jin, Shen, Ye, Luo, Shikai, Zhu, Hongtu, Song, Rui

This paper is concerned with constructing a confidence interval for a target policy's value offline based on a pre-collected observational data in infinite horizon settings. Most of the existing works assume no unmeasured variables exist that confound the observed actions. This assumption, however, is likely to be violated in real applications such as healthcare and technological industries. In this paper, we show that with some auxiliary variables that mediate the effect of actions on the system dynamics, the target policy's value is identifiable in a confounded Markov decision process. Based on this result, we develop an efficient off-policy value estimator that is robust to potential model misspecification and provide rigorous uncertainty quantification. Our method is justified by theoretical results, simulated and real datasets obtained from ridesharing companies. A Python implementation of the proposed procedure is available at https://github.com/Mamba413/cope.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2202.10589

Country:

North America > United States > North Carolina (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Hernandez-Olivan, Carlos, Hernandez-Olivan, Javier, Beltran, Jose R.

A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives

Music is one of the Gardner's intelligences in his theory of multiple intelligences. How humans perceive and understand music is still being studied and is crucial to develop artificial intelligence models that imitate such processes. Music generation with Artificial Intelligence is an emerging field that is gaining much attention in the recent years. In this paper, we describe how humans compose music and how new AI systems could imitate such process by comparing past and recent advances in the field with music composition techniques. To understand how AI models and algorithms generate music and the potential applications that might appear in the future, we explore, analyze and describe the agents that take part of the music generation process: the datasets, models, interfaces, the users and the generated music. We mention possible applications that might benefit from this field and we also propose new trends and future research directions that could be explored in the future.

large language model, machine learning, natural language, (20 more...)

2210.13944

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(32 more...)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(3 more...)

Nonparametric Involutive Markov Chain Monte Carlo

Mak, Carol, Zaiser, Fabian, Ong, Luke

A challenging problem in probabilistic programming is to develop inference algorithms that work for arbitrary programs in a universal probabilistic programming language (PPL). We present the nonparametric involutive Markov chain Monte Carlo (NP-iMCMC) algorithm as a method for constructing MCMC inference algorithms for nonparametric models expressible in universal PPLs. Building on the unifying involutive MCMC framework, and by providing a general procedure for driving state movement between dimensions, we show that NP-iMCMC can generalise numerous existing iMCMC algorithms to work on nonparametric models. We prove the correctness of the NP-iMCMC sampler. Our empirical study shows that the existing strengths of several iMCMC algorithms carry over to their nonparametric extensions. Applying our method to the recently proposed Nonparametric HMC, an instance of (Multiple Step) NP-iMCMC, we have constructed several nonparametric extensions (all of which new) that exhibit significant performance improvements.

artificial intelligence, machine learning, sampler, (17 more...)

2211.011

Country:

North America > United States > New York (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.49)

Industry: Energy (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.62)

Evmorfos, Spilios, Petropulu, Athina P., Poor, H. Vincent

Deep Reinforcement Learning for IRS Phase Shift Design in Spatiotemporally Correlated Environments

The paper studies the problem of designing the Intelligent Reflecting Surface (IRS) phase shifters for Multiple Input Single Output (MISO) communication systems in spatiotemporally correlated channel environments, where the destination can move within a confined area. The objective is to maximize the expected sum of SNRs at the receiver over infinite time horizons. The problem formulation gives rise to a Markov Decision Process (MDP). We propose a deep actor-critic algorithm that accounts for channel correlations and destination motion by constructing the state representation to include the current position of the receiver and the phase shift values and receiver positions that correspond to a window of previous time steps. The channel variability induces high frequency components on the spectrum of the underlying value function. We propose the preprocessing of the critic's input with a Fourier kernel which enables stable value learning. Finally, we investigate the use of the destination SNR as a component of the designed MDP state, which is common practice in previous work. We provide empirical evidence that, when the channels are spatiotemporally correlated, the inclusion of the SNR in the state representation interacts with function approximation in ways that inhibit convergence.

machine learning, reinforcement learning, time step, (12 more...)

2211.09726

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Asia > China (0.04)

Genre: Research Report (0.40)

Industry:

Government > Tax (0.75)
Government > Regional Government > North America Government > United States Government (0.75)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

Jahanshahi, Hadi, Cevik, Mucahit, Mousavi, Kianoush, Başar, Ayşe

ADPTriage: Approximate Dynamic Programming for Bug Triage

Bug triaging is a critical task in any software development project. It entails triagers going over a list of open bugs, deciding whether each is required to be addressed, and, if so, which developer should fix it. However, the manual bug assignment in issue tracking systems (ITS) offers only a limited solution and might easily fail when triagers must handle a large number of bug reports. During the automated assignment, there are multiple sources of uncertainties in the ITS, which should be addressed meticulously. In this study, we develop a Markov decision process (MDP) model for an online bug triage task. In addition to an optimization-based myopic technique, we provide an ADP-based bug triage solution, called ADPTriage, which has the ability to reflect the downstream uncertainty in the bug arrivals and developers' timetables. Specifically, without placing any limits on the underlying stochastic process, this technique enables real-time decision-making on bug assignments while taking into consideration developers' expertise, bug type, and bug fixing time. Our result shows a significant improvement over the myopic approach in terms of assignment accuracy and fixing time. We also demonstrate the empirical convergence of the model and conduct sensitivity analysis with various model parameters. Accordingly, this work constitutes a significant step forward in addressing the uncertainty in bug triage solutions

machine learning, natural language, reinforcement learning, (22 more...)

2211.00872

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

A survey on the development status and application prospects of knowledge graph in smart grids

Wang, Jian, Wang, Xi, Ma, Chaoqun, Kou, Lei

With the advent of the electric power big data era, semantic interoperability and interconnection of power data have received extensive attention. Knowledge graph technology is a new method describing the complex relationships between concepts and entities in the objective world, which is widely concerned because of its robust knowledge inference ability. Especially with the proliferation of measurement devices and exponential growth of electric power data empowers, electric power knowledge graph provides new opportunities to solve the contradictions between the massive power resources and the continuously increasing demands for intelligent applications. In an attempt to fulfil the potential of knowledge graph and deal with the various challenges faced, as well as to obtain insights to achieve business applications of smart grids, this work first presents a holistic study of knowledge-driven intelligent application integration. Specifically, a detailed overview of electric power knowledge mining is provided. Then, the overview of the knowledge graph in smart grids is introduced. Moreover, the architecture of the big knowledge graph platform for smart grids and critical technologies are described. Furthermore, this paper comprehensively elaborates on the application prospects leveraged by knowledge graph oriented to smart grids, power consumer service, decision-making in dispatching, and operation and maintenance of power equipment. Finally, issues and challenges are summarised.

artificial intelligence, data mining, machine learning, (18 more...)

doi: 10.1049/gtd2.12040

2211.00901

Country:

North America > United States > New York (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Asia > Japan (0.04)
(2 more...)

Genre:

Research Report (0.81)
Overview (0.67)

Industry: Energy > Power Industry > Utilities (0.93)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

Golmakani, Ali, Sadeghi, Mostafa, Serizel, Romain

Audio-visual speech enhancement with a deep Kalman filter generative model

Deep latent variable generative models based on variational autoencoder (VAE) have shown promising performance for audiovisual speech enhancement (AVSE). The underlying idea is to learn a VAEbased audiovisual prior distribution for clean speech data, and then combine it with a statistical noise model to recover a speech signal from a noisy audio recording and video (lip images) of the target speaker. Existing generative models developed for AVSE do not take into account the sequential nature of speech data, which prevents them from fully incorporating the power of visual data. In this paper, we present an audiovisual deep Kalman filter (AV-DKF) generative model which assumes a first-order Markov chain model for the latent variables and effectively fuses audiovisual data. Moreover, we develop an efficient inference methodology to estimate speech signals at test time. We conduct a set of experiments to compare different variants of generative models for speech enhancement. The results demonstrate the superiority of the AV-DKF model compared with both its audio-only version and the non-sequential audio-only and audiovisual VAE-based models.

artificial intelligence, machine learning, natural language, (16 more...)

2211.00988

Country: Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Media (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)