AITopics

Armstrong, Stuart, O'Rourke, Xavier

'Indifference' methods for managing agent rewards

arXiv.org Artificial Intelligence, Jun-5-2018

'Indifference' refers to a class of methods used to control reward-based agents. Indifference techniques aim to achieve one or more of three distinct goals: rewards dependent on certain events (without the agent being motivated to manipulate the probability of those events), effective disbelief (where agents behave as if particular events could never happen), and seamless transition from one reward function to another (with the agent acting as if this change is unanticipated). This paper presents several methods for achieving these goals in the POMDP setting, establishing their uses, strengths, and requirements. These methods of control work even when the implications of the agent's reward are otherwise not fully understood.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

1712.06365

Country: Europe (0.46)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Fisch, Alexander Tristan Maximilian, Eckley, Idris Arthur, Fearnhead, Paul

A linear time method for the detection of point and collective anomalies

arXiv.org Machine Learning, Jun-5-2018

The challenge of efficiently identifying anomalies in data sequences is an important statistical problem that now arises in many applications. Whilst there has been substantial work aimed at making statistical analyses robust to outliers, or point anomalies, there has been much less work on detecting anomalous segments, or collective anomalies. By bringing together ideas from changepoint detection and robust statistics, we introduce Collective And Point Anomalies (CAPA), a computationally efficient approach that is suitable when collective anomalies are characterised by either a change in mean, variance, or both, and distinguishes them from point anomalies. Theoretical results establish the consistency of CAPA at detecting collective anomalies and empirical results show that CAPA has close to linear computational cost as well as being more accurate at detecting and locating collective anomalies than other approaches. We demonstrate the utility of CAPA through its ability to detect exoplanets from light curve data from the Kepler telescope.

anomaly, data mining, machine learning, (19 more...)

1806.01947

Country: North America > United States (0.28)

Genre: Research Report (0.70)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Data Science > Data Mining (0.68)
Information Technology > Communications > Social Media (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Fahad, Md. Shah, Yadav, Jainath, Pradhan, Gyadhar, Deepak, Akshay

DNN-HMM based Speaker Adaptive Emotion Recognition using Proposed Epoch and MFCC Features

arXiv.org Artificial Intelligence, Jun-4-2018

Speech is produced when a time-varying vocal tract system is excited by a time-varying excitation source. Therefore, the information present in speech, such as message, emotion, language, and speaker, is due to the combined effect of both the excitation source and the vocal tract system. However, excitation source features have seen very little use in emotion recognition. In our earlier work, we proposed a novel method to extract glottal closure instants (GCIs), known as epochs. In this paper, we explore epoch features, namely instantaneous pitch, phase, and strength of epochs, for discriminating emotions. We combine the excitation source features with the well-known Mel-frequency cepstral coefficient (MFCC) features to develop an emotion recognition system with improved performance. DNN-HMM speaker adaptive models have been developed using MFCC, epoch, and combined features. The IEMOCAP emotional database has been used to evaluate the models. The average accuracy of the emotion recognition system when using MFCC and epoch features separately is 59.25% and 54.52%, respectively. The recognition performance improves to 64.2% when MFCC and epoch features are combined.

artificial intelligence, emotion, machine learning, (15 more...)

1806.00984

Country: Asia > India (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Mei, Jonathan, Moura, José M. F.

EigenNetworks

arXiv.org Machine Learning, Jun-4-2018

In many applications, the interdependencies among a set of $N$ time series $\{ x_{nk}, k>0 \}_{n=1}^{N}$ are well captured by a graph or network $G$. The network itself may change over time as well (i.e., as $G_k$). We expect the network changes to be at a much slower rate than that of the time series. This paper introduces eigennetworks, networks that are building blocks to compose the actual networks $G_k$ capturing the dependencies among the time series. These eigennetworks can be estimated by first learning the time series of graphs $G_k$ from the data, followed by a Principal Network Analysis procedure. Algorithms for learning both the original time series of graphs and the eigennetworks are presented and discussed. Experiments on simulated and real time series data demonstrate the performance of the learning and the interpretation of the eigennetworks.

artificial intelligence, changepoint, machine learning, (16 more...)

1806.01455

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Hanna, Josiah, Niekum, Scott, Stone, Peter

Importance Sampling Policy Evaluation with an Estimated Behavior Policy

arXiv.org Machine Learning, Jun-4-2018

In reinforcement learning, off-policy evaluation is the task of using data generated by one policy to determine the expected return of a second policy. Importance sampling is a standard technique for off-policy evaluation, allowing off-policy data to be used as if it were on-policy. When the policy that generated the off-policy data is unknown, the ordinary importance sampling estimator cannot be applied. In this paper, we study a family of regression importance sampling (RIS) methods that apply importance sampling by first estimating the behavior policy. We find that these estimators give strong empirical performance, surprisingly often outperforming importance sampling with the true behavior policy in both discrete and continuous domains. Our results emphasize the importance of estimating the behavior policy using only the data that will also be used for the importance sampling estimate.
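As a rough illustration of the idea in this abstract (not the authors' implementation), the sketch below computes an importance sampling estimate for a toy discrete problem in two ways: once with a known behavior policy, and once with a behavior policy estimated by simple counting from the same trajectories. The function names, the count-based estimator, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def estimate_behavior_policy(trajectories):
    """Count-based estimate of pi_b(a | s), built from the evaluation data itself.

    Hypothetical helper: a maximum-likelihood estimate for discrete states
    and actions, used here as a stand-in for a learned behavior policy.
    """
    sa_counts = Counter()
    s_counts = Counter()
    for traj in trajectories:
        for state, action, _reward in traj:
            sa_counts[(state, action)] += 1
            s_counts[state] += 1
    return lambda s, a: sa_counts[(s, a)] / s_counts[s]


def importance_sampling_estimate(trajectories, target_policy, behavior_policy, gamma=1.0):
    """Ordinary (trajectory-wise) importance sampling estimate of expected return."""
    total = 0.0
    for traj in trajectories:
        weight = 1.0  # product of per-step probability ratios
        ret = 0.0     # discounted return of this trajectory
        for t, (state, action, reward) in enumerate(traj):
            weight *= target_policy(state, action) / behavior_policy(state, action)
            ret += (gamma ** t) * reward
        total += weight * ret
    return total / len(trajectories)


# Toy data (assumed): one state, two actions, one-step episodes.
# Action 1 yields reward 1, action 0 yields reward 0.
data = [[(0, 1, 1.0)], [(0, 1, 1.0)], [(0, 1, 1.0)], [(0, 0, 0.0)]]

# Target policy always picks action 1, so its true value is 1.0.
target = lambda s, a: 1.0 if a == 1 else 0.0
# Assumed true behavior policy: uniform over the two actions.
true_behavior = lambda s, a: 0.5

is_true = importance_sampling_estimate(data, target, true_behavior)
is_estimated = importance_sampling_estimate(data, target, estimate_behavior_policy(data))
print(is_true, is_estimated)
```

In this toy sample, action 1 happens to appear three times out of four. The estimate using the true uniform behavior policy is thrown off by that sampling imbalance, while the count-based estimate lands close to the true value of 1.0, which is consistent with the abstract's point about estimating the behavior policy from the same data.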

behavior policy, machine learning, reinforcement learning, (18 more...)

1806.01347

Country: North America > United States > Texas (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Nishio, Daichi, Yamane, Satoshi

Faster Deep Q-learning using Neural Episodic Control

arXiv.org Artificial Intelligence, Jun-3-2018

Research on deep reinforcement learning, which estimates Q-values using deep learning, has recently attracted the interest of researchers. In deep reinforcement learning, it is important to learn efficiently from the experiences that an agent has collected by exploring the environment. We propose NEC2DQN, which improves the learning speed of a sample-inefficient algorithm such as DQN by using a sample-efficient one such as NEC at the beginning of learning. We show that it is able to learn faster than Double DQN or N-step DQN in experiments on Pong.

dqn, machine learning, reinforcement learning, (17 more...)

1801.01968

Country: Asia > Japan (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Burks, Luke, Loefgren, Ian, Barbier, Luke, Muesing, Jeremy, McGinley, Jamison, Vunnam, Sousheel, Ahmed, Nisar

Closed-loop Bayesian Semantic Data Fusion for Collaborative Human-Autonomy Target Search

arXiv.org Artificial Intelligence, Jun-2-2018

In search applications, autonomous unmanned vehicles must be able to efficiently reacquire and localize mobile targets that can remain out of view for long periods of time in large spaces. As such, all available information sources must be actively leveraged, including imprecise but readily available semantic observations provided by humans. To achieve this, this work develops and validates a novel collaborative human-machine sensing solution for dynamic target search. Our approach uses continuous partially observable Markov decision process (CPOMDP) planning to generate vehicle trajectories that optimally exploit imperfect detection data from onboard sensors, as well as semantic natural language observations that can be specifically requested from human sensors. The key innovation is a scalable hierarchical Gaussian mixture model formulation for efficiently solving CPOMDPs with semantic observations in continuous dynamic state spaces. The approach is demonstrated and validated with a real human-robot team engaged in dynamic indoor target search and capture scenarios on a custom testbed.

artificial intelligence, machine learning, natural language, (16 more...)

1806.00727

Country: North America > United States > Colorado (0.28)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Advanced Geothermal System (AGS) (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Martin, Charles P., Ellefsen, Kai Olav, Torresen, Jim

Deep Predictive Models in Interactive Music

arXiv.org Artificial Intelligence, Jun-1-2018

Musical performance requires prediction to operate instruments, to perform in groups and to improvise. We argue, with reference to a number of digital music instruments (DMIs), including two of our own, that predictive machine learning models can help interactive systems to understand their temporal context and ensemble behaviour. We also discuss how recent advances in deep learning highlight the role of prediction in DMIs, by allowing data-driven predictive models with a long memory of past states. We advocate for predictive musical interaction, where a predictive model is embedded in a musical interface, assisting users by predicting unknown states of musical processes. We propose a framework for characterising prediction as relating to the instrumental sound, ongoing musical process, or between members of an ensemble. Our framework shows that different musical interface design configurations lead to different types of prediction. We show that our framework accommodates deep generative models, as well as models for predicting gestural states, or other high-level musical information. We apply our framework to examples from our recent work and the literature, and discuss the benefits and challenges revealed by these systems as well as musical use-cases where prediction is a necessary component.

artificial intelligence, machine learning, prediction, (17 more...)

1801.10492

Country:

Europe > United Kingdom > England (0.28)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Bacciu, Davide, Castellana, Daniele

Learning Tree Distributions by Hidden Markov Models

arXiv.org Machine Learning, May-31-2018

Hidden tree Markov models allow learning distributions for tree-structured data while being interpretable as nondeterministic automata. We provide a concise summary of the main approaches in the literature, focusing in particular on the causality assumptions introduced by the choice of a specific tree visit direction. We then sketch a novel non-parametric generalization of the bottom-up hidden tree Markov model, with its interpretation as a nondeterministic tree automaton with infinite states.

artificial intelligence, htmm, machine learning, (17 more...)

1805.12372

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)