AITopics

2203.03021

Country:

North America > United States > Utah (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.65)
Instructional Material (0.48)

Industry:

Education (0.68)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.30)

arXiv.org Machine Learning Mar-6-2022

Cascaded Gaps: Towards Gap-Dependent Regret for Risk-Sensitive Reinforcement Learning

Fei, Yingjie, Xu, Ruitu

In this paper, we study gap-dependent regret guarantees for risk-sensitive reinforcement learning based on the entropic risk measure. We propose a novel definition of sub-optimality gaps, which we call cascaded gaps, and we discuss their key components that adapt to the underlying structures of the problem. Based on the cascaded gaps, we derive non-asymptotic and logarithmic regret bounds for two model-free algorithms under episodic Markov decision processes. We show that, in appropriate settings, these bounds feature exponential improvement over existing ones that are independent of gaps. We also prove gap-dependent lower bounds, which certify the near optimality of the upper bounds.
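As a rough illustration of the entropic risk measure mentioned in the abstract, the sketch below estimates it from sampled returns using one common convention, (1/beta) * log E[exp(beta * R)]. The function name `entropic_risk`, the sample values, and the sign convention for `beta` are assumptions made for this example and are not taken from the paper.

```python
import math


def entropic_risk(returns, beta):
    """Estimate (1/beta) * log(mean(exp(beta * r))) from equally weighted samples.

    Assumed convention: beta < 0 is risk-averse, beta > 0 is risk-seeking.
    """
    if beta == 0:
        raise ValueError("beta must be non-zero; the limit beta -> 0 is the plain mean")
    # Shift by the largest exponent (log-sum-exp trick) to avoid overflow.
    shift = max(beta * r for r in returns)
    log_mean = shift + math.log(
        sum(math.exp(beta * r - shift) for r in returns) / len(returns)
    )
    return log_mean / beta


# Hypothetical sampled returns, for illustration only.
samples = [0.0, 1.0]
risk_averse = entropic_risk(samples, beta=-1.0)
risk_seeking = entropic_risk(samples, beta=1.0)
```

Under this convention the risk-averse value falls below the sample mean and the risk-seeking value rises above it, which follows from Jensen's inequality.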

probability, risk-sensitive rl, sub-optimality gap, (15 more...)

2203.0311

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

#artificialintelligence Mar-1-2022, 20:39:46 GMT

Top resources to learn reinforcement learning in 2022

In this in-depth tutorial, Rich S. Sutton, a research scientist at DeepMind and computing science professor at the University of Alberta, explains the underlying formal problem (Markov decision processes) and the core solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning.

learn reinforcement, reinforcement, reinforcement learning, (10 more...)

Country:

North America > Canada > Alberta (0.57)
North America > United States > Massachusetts > Hampshire County > Amherst (0.06)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (0.51)
Information Technology (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.38)

#artificialintelligence Feb-28-2022, 08:20:45 GMT

Hidden Markov Models Simply Explained

In a regular Markov Chain we are able to see the states and their associated transition probabilities. However, in a Hidden Markov Model (HMM), the Markov Chain is hidden, but we can infer its properties from the states we do observe. Note: The Hidden Markov Model is not a Markov Chain per se; it is another model in the wider family of Markov Processes/Models. The probabilities associated with the observed states (Happy, Sad) are known as the emission probabilities. Now, let's say my friend wants to infer the weather from my mood.
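The inference described above can be sketched with the forward algorithm, which combines transition and emission probabilities to estimate the hidden weather from a sequence of observed moods. The weather states (Sunny, Rainy) and every probability below are illustrative assumptions, not values from the article.

```python
# Hypothetical two-state weather HMM; all numbers are made up for illustration.
states = ["Sunny", "Rainy"]
start = {"Sunny": 0.5, "Rainy": 0.5}
trans = {
    "Sunny": {"Sunny": 0.8, "Rainy": 0.2},
    "Rainy": {"Sunny": 0.4, "Rainy": 0.6},
}
# Emission probabilities: chance of each observed mood given the hidden weather.
emit = {
    "Sunny": {"Happy": 0.8, "Sad": 0.2},
    "Rainy": {"Happy": 0.4, "Sad": 0.6},
}


def forward_filter(observations):
    """Return P(weather at the last step | all observed moods)."""
    # Initialise with the start distribution weighted by the first emission.
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        # Propagate through the hidden chain, then weight by the new emission.
        alpha = {
            s: emit[s][obs] * sum(alpha[p] * trans[p][s] for p in states)
            for s in states
        }
    total = sum(alpha.values())
    return {s: a / total for s, a in alpha.items()}


posterior = forward_filter(["Happy", "Happy", "Sad"])
```

Here `posterior` maps each weather state to its probability given the three observed moods; the values always sum to one because of the final normalisation.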

probability, sequence, stationary distribution, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

#artificialintelligence Feb-26-2022, 05:10:43 GMT

Analysis and Assessment of Controllability of an Expressive Deep Learning-Based TTS System

In this paper, we study the controllability of an Expressive TTS system trained on a dataset for continuous control. The dataset is the Blizzard 2013 dataset, based on audiobooks read by a female speaker and containing great variability in styles and expressiveness. Controllability is evaluated with both an objective and a subjective experiment. The objective assessment is based on a measure of correlation between acoustic features and the dimensions of the latent space representing expressiveness. The subjective assessment is based on a perceptual experiment in which users are shown an interface for Controllable Expressive TTS and asked to retrieve a synthetic utterance whose expressiveness subjectively corresponds to that of a reference utterance.

representation, speech synthesis, synthesis, (12 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.32)

arXiv.org Machine Learning Feb-25-2022

Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach

Lyu, Boxiang, Meng, Qinglin, Qiu, Shuang, Wang, Zhaoran, Yang, Zhuoran, Jordan, Michael I.

Dynamic mechanism design studies how mechanism designers should allocate resources among agents in a time-varying environment. We consider the problem where the agents interact with the mechanism designer according to an unknown Markov Decision Process (MDP), where agent rewards and the mechanism designer's state evolve according to an episodic MDP with unknown reward functions and transition kernels. We focus on the online setting with linear function approximation and attempt to recover the dynamic Vickrey-Clarke-Groves (VCG) mechanism over multiple rounds of interaction. A key contribution of our work is incorporating reward-free online Reinforcement Learning (RL) to aid exploration over a rich policy space to estimate prices in the dynamic VCG mechanism. We show that the regret of our proposed method is upper bounded by $\tilde{\mathcal{O}}(T^{2/3})$ and further devise a lower bound to show that our algorithm is efficient, incurring the same $\tilde{\mathcal{O}}(T^{2/3})$ regret as the lower bound, where $T$ is the total number of rounds. Our work establishes the regret guarantee for online RL in solving dynamic mechanism design problems without prior knowledge of the underlying model.

equation, mechanism, reward function, (14 more...)

2202.12797

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Workflow (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

Koyuncu, Deniz, Yener, Bülent

Missing Value Knockoffs

arXiv.org Machine Learning Feb-25-2022

Coping with an increasing number of variables, optimizing predictive performance, and selecting among candidate scientific hypotheses are all valid reasons for using a variable selection algorithm. Another reality of today's datasets is missing values. Although there are existing methods for handling missing values, if applied directly they can interfere with the assumptions of variable selection algorithms. In this work, we will discuss how model-x knockoffs (Candes et al. 2017), a new approach in principled variable selection, can be applied to datasets that contain missing values. By principled variable selection we refer to algorithms that aim to identify the Markov Blanket (MB) of a response variable (Tsamardinos and Aliferis 2003) while providing control of the false selections. Identifying the MB is by definition optimal, as the MB refers to the smallest subset of variables that is sufficient to describe the conditional distribution of the response variable. Controlling the false selections refers to limiting the variables that are selected due to random chance and is especially important in applications where a selected variable corresponds to a scientific discovery. Model-x knockoffs provide a framework for repurposing existing statistical/machine learning feature scorers for MB discovery. When the assumptions of the model-x framework hold, the expected fraction of selections that are conditionally pairwise independent with the response variable is controlled.

denote, imputation, knockoff, (14 more...)

2202.13054

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > Rensselaer County > Troy (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine Learning Feb-24-2022

Bayesian Deep Learning for Graphs

Errica, Federico

The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various kinds. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles on which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automates the choice of almost all of the model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.

infinite contextual graph markov model, neighborhood aggregation mechanism, neighborhood aggregation scheme, (16 more...)

2202.12348

Country:

Europe > Denmark > Capital Region > Kongens Lyngby (0.13)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Balletti, Marco, Piccialli, Veronica, Sudoso, Antonio M.

Mixed-Integer Nonlinear Programming for State-based Non-Intrusive Load Monitoring

arXiv.org Machine Learning Feb-22-2022

Energy disaggregation, known in the literature as Non-Intrusive Load Monitoring (NILM), is the task of inferring the energy consumption of each appliance given the aggregate signal recorded by a single smart meter. In this paper, we propose a novel two-stage optimization-based approach for energy disaggregation. In the first phase, a small training set consisting of disaggregated power profiles is used to estimate the parameters and the power states by solving a mixed integer programming problem. Once the model parameters are estimated, the energy disaggregation problem is formulated as a constrained binary quadratic optimization problem. We incorporate penalty terms that exploit prior knowledge on how the disaggregated traces are generated, and appliance-specific constraints characterizing the signature of different types of appliances operating simultaneously. Our approach is compared with existing optimization-based algorithms both on a synthetic dataset and on three real-world datasets. The proposed formulation is computationally efficient, able to disambiguate loads with similar consumption patterns, and successfully reconstruct the signatures of known appliances despite the presence of unmetered devices, thus overcoming the main drawbacks of the optimization-based methods available in the literature.

appliance, artificial intelligence, machine learning, (19 more...)

doi: 10.1109/TSG.2022.3152147

2106.09158

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

#artificialintelligence Feb-20-2022, 01:03:44 GMT

Natural Language Processing

Originally published on Towards AI. Recommendation systems (RS) are becoming an integral part of our daily lives. This means that we can obtain what we desire either through internet-accessible applications or on social media channels. Traditional views of the recommendation problem treat it as a simple classification or prediction problem; however, recent evidence indicates that it is essentially a sequential problem [1]. It can therefore be formulated as a Markov decision process (MDP), and reinforcement learning (RL) methods can be employed to solve it [1]. RL algorithms play a crucial role because they cope well with dynamic environments and large state spaces [4]. Deep Reinforcement Learning (DRL) has enabled RL to be applied to the recommendation problem with massive state and action spaces. RL-based and DRL-based methods can be classified by the specific RL algorithm, such as Q-learning, SARSA, and REINFORCE, that is used to optimize the recommendation policy [2].
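To make the MDP framing concrete, here is a minimal sketch of a single tabular Q-learning update, one of the algorithms named above. Treating the user context as the state, the recommended item as the action, and a click as the reward is an assumption for illustration; the names `q_update`, `alpha`, and `gamma` and the sample values are hypothetical and do not come from the article.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning step and return the updated Q(state, action)."""
    # Value of the best action available in the next state (0.0 if unseen).
    best_next = max(q[(next_state, a)] for a in actions)
    # Move the current estimate toward the bootstrapped target.
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical interaction: the user clicked (reward 1.0) on the recommended item.
q_table = defaultdict(float)
items = ["item_a", "item_b"]
updated = q_update(q_table, "user_ctx_0", "item_a", 1.0, "user_ctx_1", items)
```

Real recommenders have far larger state and action spaces than a lookup table can hold, which is the gap the DRL methods mentioned above are meant to close.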

algorithm, recommendation, reinforcement learning, (12 more...)

Industry:

Media (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.93)
Leisure & Entertainment (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)