AITopics

doi: 10.1145/3357384.3357929

1909.00525

Country: North America > United States > Texas > Travis County > Austin (0.24)

Genre:

Overview (0.93)
Research Report > New Finding (0.48)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

#artificialintelligence Aug-30-2019, 22:21:09 GMT

Policy Certificates and Minimax-Optimal PAC Bounds for Episodic Reinforcement Learning

Designing reinforcement learning methods that find a good policy with as few samples as possible is a key goal of both empirical and theoretical research. On the theoretical side, there are two main ways to measure and guarantee the sample efficiency of a method: regret bounds and PAC (probably approximately correct) bounds. Ideally, we would like algorithms that perform well according to both criteria, as they measure different aspects of sample efficiency, and we have shown previously [1] that one cannot simply convert one into the other. In a specific setting called tabular episodic MDPs, a recent algorithm achieved close to optimal regret bounds [2], but no method was known to be close to optimal according to the PAC criterion despite a long line of research. In our work presented at ICML 2019, we close this gap with a new method that achieves minimax-optimal PAC (and regret) bounds which match the statistical worst-case lower bounds in the dominating terms.

artificial intelligence, machine learning, reinforcement learning, (11 more...)


Country: North America > United States > California > Santa Clara County > Palo Alto (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Yasuda, Yusuke, Wang, Xin, Yamagishi, Junichi

Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments

arXiv.org Machine Learning Aug-30-2019

End-to-end text-to-speech (TTS) synthesis is a method that directly converts input text to output acoustic features using a single network. Recent advances in end-to-end TTS are due to a key technique, the attention mechanism, and all successful methods proposed so far have been based on soft attention mechanisms. However, although network structures are becoming increasingly complex, end-to-end TTS systems with soft attention mechanisms may still fail to learn and predict accurate alignments between the input and output. This may be because soft attention mechanisms are too flexible. Therefore, we propose an approach that has more explicit but natural constraints suitable for speech signals, to make alignment learning and prediction of end-to-end TTS systems more robust. The proposed system, with the constrained alignment scheme borrowed from segment-to-segment neural transduction (SSNT), directly calculates the joint probability of acoustic features and alignment given an input text. The alignment is designed to be hard and monotonically increasing, reflecting the nature of speech, and it is treated as a latent variable and marginalized during training. During prediction, both the alignment and acoustic features can be generated from the probabilistic distributions. The advantages of our approach are that we can simplify many modules needed for soft attention and that we can train the end-to-end TTS model using a single likelihood function. As far as we know, our approach is the first end-to-end TTS system without a soft attention mechanism.
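To make the idea of marginalizing a hard, monotonic latent alignment concrete, here is a minimal sketch of a forward-style dynamic program. It is not the authors' implementation: the function name `marginal_likelihood`, the tables `emit` and `advance`, and the rule that the alignment either stays on the current input position or moves to the next one at each output frame are all assumptions made for illustration.

```python
def marginal_likelihood(emit, advance):
    """Sum the joint probability over all monotonic hard alignments.

    Assumed (hypothetical) inputs:
      emit[t][j]    probability of output frame t when aligned to input j
      advance[t][j] probability of moving from input j to j + 1 after
                    frame t (defined for j < J - 1; the last input
                    position can only stay)
    The alignment is assumed to start at input position 0.
    """
    num_frames = len(emit)
    num_inputs = len(emit[0])

    # alpha[j]: probability of all frames so far, ending at input j.
    alpha = [0.0] * num_inputs
    alpha[0] = emit[0][0]

    for t in range(1, num_frames):
        new_alpha = [0.0] * num_inputs
        for j in range(num_inputs):
            # Stay on the same input position.
            stay = 1.0 if j == num_inputs - 1 else 1.0 - advance[t - 1][j]
            total = alpha[j] * stay
            # Or arrive from the previous input position.
            if j > 0:
                total += alpha[j - 1] * advance[t - 1][j - 1]
            new_alpha[j] = total * emit[t][j]
        alpha = new_alpha

    # Marginalize over the final alignment position.
    return sum(alpha)


# Two output frames, two input positions, made-up probabilities.
likelihood = marginal_likelihood(
    emit=[[0.5, 0.2], [0.4, 0.3]],
    advance=[[0.25]],
)
```

A practical system would likely run this recursion in log space to avoid underflow on long utterances; plain probabilities are used here only to keep the sketch readable.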

artificial intelligence, machine learning, natural language, (16 more...)

1908.11535

Country:

North America (0.46)
Asia > Japan (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine Learning Aug-29-2019

Solve fraud detection problem by using graph based learning methods

Tran, Loc, Tran, Tuan, Tran, Linh, Mai, An

Preprint submitted to RGN Publications on 21/5/2018. Abstract: Detecting fraudulent credit card transactions is an important problem in machine learning. Detecting these transactions helps reduce the significant losses of credit card holders and banks. To detect fraudulent credit card transactions, data scientists normally employ unsupervised and supervised learning techniques. In this paper, we employ graph p-Laplacian based semi-supervised learning methods combined with an under-sampling technique such as Cluster Centroids to solve the credit card fraud detection problem. Experimental results show that the graph p-Laplacian semi-supervised learning methods outperform the current state-of-the-art graph Laplacian based semi-supervised learning method (p = 2). 2010 AMS Classification: 05C85. Keywords and phrases: graph p-Laplacian, credit card, fraud detection, semi-supervised learning. Article type: Research article. 1 Introduction: When purchasing online, transactions can be made using credit cards issued by the bank.

artificial intelligence, inductive learning, machine learning, (15 more...)

1908.11708

Country: Asia > Vietnam > Bình Dương Province (0.14)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.30)

arXiv.org Machine Learning Aug-29-2019

Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity

Sidford, Aaron, Wang, Mengdi, Yang, Lin F., Ye, Yinyu

In this paper, we settle the sampling complexity of solving discounted two-player turn-based zero-sum stochastic games up to polylogarithmic factors. Given a stochastic game with discount factor $\gamma\in(0,1)$, we provide an algorithm that computes an $\epsilon$-optimal strategy with high probability given $\tilde{O}((1 - \gamma)^{-3} \epsilon^{-2})$ samples from the transition function for each state-action pair. Our algorithm runs in time nearly linear in the number of samples and uses space nearly linear in the number of state-action pairs. As stochastic games generalize Markov decision processes (MDPs), our runtime and sample complexities are optimal due to Azar et al. (2013). We achieve our results by showing how to generalize near-optimal Q-learning based algorithms for MDPs, in particular Sidford et al. (2018), to two-player strategy computation algorithms. This overcomes limitations of standard Q-learning and strategy iteration or alternating minimization based approaches, and we hope it will pave the way for future reinforcement learning results by facilitating the extension of MDP results to multi-agent settings with little loss.
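As a rough illustration of how the stated bound scales, the sketch below evaluates only the leading term $(1 - \gamma)^{-3} \epsilon^{-2}$ per state-action pair. The function name `leading_order_samples` is hypothetical, and the constants and polylogarithmic factors hidden by the $\tilde{O}$ notation are ignored, so the result indicates growth rather than an actual sample count.

```python
def leading_order_samples(gamma: float, epsilon: float) -> float:
    """Leading term (1 - gamma)^-3 * epsilon^-2 of the per-pair bound.

    Constants and polylogarithmic factors hidden by the O-tilde
    notation are deliberately left out, so this is a scaling guide only.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if epsilon <= 0.0:
        raise ValueError("epsilon must be positive")
    return (1.0 - gamma) ** -3 * epsilon ** -2


# Halving epsilon multiplies the leading term by four, while halving
# the effective horizon gap (1 - gamma) multiplies it by eight.
base = leading_order_samples(gamma=0.5, epsilon=0.5)
tighter_accuracy = leading_order_samples(gamma=0.5, epsilon=0.25)
longer_horizon = leading_order_samples(gamma=0.75, epsilon=0.5)
```

The cubic dependence on $(1 - \gamma)^{-1}$ means that pushing the discount factor towards 1 is considerably more expensive than tightening the accuracy target.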

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1908.11071

Country: North America > United States > California (0.93)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Kaur, Navdeep, Kunapuli, Gautam, Joshi, Saket, Kersting, Kristian, Natarajan, Sriraam

Neural Networks for Relational Data

arXiv.org Artificial Intelligence Aug-28-2019

While deep networks have been enormously successful over the last decade, they rely on flat-feature vector representations, which makes them unsuitable for richly structured domains such as those arising in applications like social network analysis. Such domains rely on relational representations to capture complex relationships between entities and their attributes. Thus, we consider the problem of learning neural networks for relational data. We distinguish ourselves from current approaches that rely on expert hand-coded rules by learning relational random-walk-based features to capture local structural interactions and the resulting network architecture. We further exploit parameter tying of the network weights of the resulting relational neural network, where instances of the same type share parameters. Our experimental results across several standard relational data sets demonstrate the effectiveness of the proposed approach over multiple neural net baselines as well as state-of-the-art statistical relational models.

neural network, random walk, relational random walk, (15 more...)

1909.04723

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
North America > United States > Texas (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

arXiv.org Artificial Intelligence Aug-28-2019

STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Traffic Light Control

Wang, Yanan, Xu, Tong, Niu, Xin, Tan, Chang, Chen, Enhong, Xiong, Hui

The development of intelligent traffic light control systems is essential for smart transportation management. While some efforts have been made to optimize the use of individual traffic lights in an isolated way, related studies have largely ignored the fact that the use of multi-intersection traffic lights is spatially influenced and that current traffic light control depends temporally on historical traffic status. To that end, in this paper, we propose a novel Spatio-Temporal Multi-Agent Reinforcement Learning (STMARL) framework for effectively capturing the spatio-temporal dependency of multiple related traffic lights and controlling these traffic lights in a coordinated way. Specifically, we first construct the traffic light adjacency graph based on the spatial structure among traffic lights. Then, historical traffic records are integrated with current traffic status via a Recurrent Neural Network structure. Moreover, based on the temporally dependent traffic information, we design a Graph Neural Network based model to represent relationships among multiple traffic lights, and the decision for each traffic light is made in a distributed way by the deep Q-learning method. Finally, the experimental results on both synthetic and real-world data demonstrate the effectiveness of our STMARL framework, which also provides an insightful understanding of the influence mechanism among multi-intersection traffic lights.
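The first step described above, building a traffic light adjacency graph from the spatial structure, can be sketched as follows. The abstract does not specify the road layout, so a regular rows-by-columns grid of intersections is assumed here purely for illustration; `grid_adjacency` is a hypothetical helper, not part of STMARL.

```python
def grid_adjacency(rows: int, cols: int) -> list[list[int]]:
    """Adjacency matrix for traffic lights laid out on a regular grid.

    Assumption: two lights are adjacent when their intersections are
    direct horizontal or vertical neighbours. Real road networks would
    need the adjacency derived from map data instead.
    """
    count = rows * cols
    adjacency = [[0] * count for _ in range(count)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            # Link to the right-hand and lower neighbour; symmetry
            # covers the left-hand and upper ones.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    adjacency[i][j] = 1
                    adjacency[j][i] = 1
    return adjacency


# Four intersections in a 2 x 2 block: each light has two neighbours.
adjacency = grid_adjacency(2, 2)
neighbour_counts = [sum(row) for row in adjacency]
```

A matrix like this is the usual input to a graph neural network layer, where each light aggregates information only from the lights it is connected to.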

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1908.10577

Country: North America > United States (0.30)

Genre: Research Report (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Ognibene, Dimitri, Mirante, Lorenzo, Marchegiani, Letizia

Proactive Intention Recognition for Joint Human-Robot Search and Rescue Missions through Monte-Carlo Planning in POMDP Environments

arXiv.org Artificial Intelligence Aug-27-2019

Proactively perceiving others' intentions is a crucial skill to effectively interact in unstructured, dynamic and novel environments. This work proposes a first step towards embedding this skill in support robots for search and rescue missions. Predicting the responders' intentions, indeed, will enable exploration approaches which will identify and prioritise areas that are more relevant for the responder and, thus, for the task, leading to the development of safer, more robust and efficient joint exploration strategies. More specifically, this paper presents an active intention recognition paradigm to perceive, even under sensory constraints, not only the target's position but also the first responder's movements, which can provide information on his/her intentions (e.g. reaching the position where he/she expects the target to be). This mechanism is implemented by employing an extension of Monte-Carlo-based planning techniques for partially observable environments, where the reward function is augmented with an entropy reduction bonus. We test in simulation several configurations of reward augmentation, both information theoretic and not, as well as belief state approximations and obtain substantial improvements over the basic approach.
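One way to read "reward function augmented with an entropy reduction bonus" is sketched below: the planner's task reward is increased by the drop in Shannon entropy of the belief after an observation. The names `entropy_bits` and `augmented_reward`, the discrete belief representation, and the `weight` parameter are assumptions for illustration and are not taken from the paper.

```python
import math


def entropy_bits(belief: list[float]) -> float:
    """Shannon entropy (in bits) of a discrete belief distribution."""
    return -sum(p * math.log2(p) for p in belief if p > 0.0)


def augmented_reward(
    task_reward: float,
    belief_before: list[float],
    belief_after: list[float],
    weight: float = 1.0,
) -> float:
    """Task reward plus a weighted bonus for reducing belief entropy.

    The bonus is positive when the updated belief is more certain than
    the previous one and negative when uncertainty has grown.
    """
    reduction = entropy_bits(belief_before) - entropy_bits(belief_after)
    return task_reward + weight * reduction


# A uniform belief over four hypotheses collapses to certainty, so the
# bonus is the full two bits of entropy scaled by the weight.
reward = augmented_reward(
    task_reward=1.0,
    belief_before=[0.25, 0.25, 0.25, 0.25],
    belief_after=[1.0, 0.0, 0.0, 0.0],
    weight=0.5,
)
```

In a Monte-Carlo planner the belief is typically approximated by particles, in which case the entropy would be estimated from particle counts rather than from an exact distribution.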

planning & scheduling, responder, upstream oil & gas, (19 more...)

1908.10125

Country: Europe (0.46)

Genre:

Research Report (0.82)
Workflow (0.68)

Industry: Energy > Oil & Gas > Upstream (0.35)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.51)

Thomas, Antony, Amatya, Sunny, Mastrogiovanni, Fulvio, Baglietto, Marco

Task-assisted Motion Planning in Partially Observable Domains

arXiv.org Artificial Intelligence Aug-27-2019

Abstract: We present an integrated Task-Motion Planning framework for robot navigation in belief space. Autonomous robots operating in real-world complex scenarios require planning in the discrete (task) space and the continuous (motion) space. To this end, we propose a framework for integrating belief space reasoning within a hybrid task planner. The expressive power of PDDL combined with heuristic-driven semantic attachments performs the propagated and posterior belief estimates while planning. The underlying methodology for the development of the combined hybrid planner is discussed, providing suggestions for improvements and future work. INTRODUCTION: Autonomous robots operating in complex real-world scenarios require different levels of planning to execute their tasks. High-level (task) planning helps break down a given set of tasks into a sequence of sub-tasks; actual execution of each of these sub-tasks would require low-level control actions to generate appropriate robot motions. In fact, the dependency between logical and geometrical aspects is pervasive in both task planning and execution. Hence, planning should be performed in the task-motion or the discrete-continuous space. In recent years, combining high-level task planning with low-level motion planning has been a subject of great interest among the Robotics and Artificial Intelligence (AI) community.

artificial intelligence, machine learning, motion planning, (17 more...)

1908.10227

Country: North America > United States > Arizona (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine Learning Aug-26-2019

Urban flows prediction from spatial-temporal data using machine learning: A survey

Xie, Peng, Li, Tianrui, Liu, Jia, Du, Shengdong, Yang, Xin, Zhang, Junbo

Urban spatial-temporal flow prediction is of great importance to traffic management, land use, public safety, etc. Urban flows are affected by several complex and dynamic factors, such as patterns of human activities, weather, events and holidays. Datasets used to evaluate the flows come from various sources in different domains, e.g. mobile phone data, taxi trajectory data, metro/bus swiping data, bike-sharing data and so on. To summarize the methodologies of urban flow prediction, in this paper, we first introduce four main factors affecting urban flows. Second, to further analyze urban flows, the preparation process of multi-source spatial-temporal data related to urban flows is partitioned into three groups. Third, we choose spatial-temporal dynamic data as a case study for the urban flow prediction task. Fourth, we analyze and compare some well-known and state-of-the-art flow prediction methods in detail, classifying them into five categories: statistics-based, traditional machine learning-based, deep learning-based, reinforcement learning-based and transfer learning-based methods. Finally, we present open challenges of urban flow prediction and an outlook on the future of this field. This paper will help researchers find suitable methods and open datasets for addressing urban spatial-temporal flow forecasting problems.

machine learning, prediction, reinforcement learning, (18 more...)

1908.10218

Country: North America > United States (1.00)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Infrastructure & Services (0.95)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)