AITopics

2009.07346

Country:

North America > United States > California > San Francisco County > San Francisco (0.27)
Europe > Middle East > Cyprus (0.27)
North America > United States > Massachusetts (0.14)
(7 more...)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.92)

Industry:

Education (1.00)
Media (0.93)
Marketing (0.68)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(3 more...)

arXiv.org Machine LearningSep-15-2020

Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL

Chen, Xiaoyu, Hu, Jiachen, Li, Lihong, Wang, Liwei

Reinforcement learning (RL) in episodic, factored Markov decision processes (FMDPs) is studied. We propose an algorithm called FMDP-BF, which leverages the factorization structure of FMDP. The regret of FMDP-BF is shown to be exponentially smaller than that of optimal algorithms designed for non-factored MDPs, and improves on the best previous result for FMDPs~\citep{osband2014near} by a factored of $\sqrt{H|\mathcal{S}_i|}$, where $|\mathcal{S}_i|$ is the cardinality of the factored state subspace and $H$ is the planning horizon. To show the optimality of our bounds, we also provide a lower bound for FMDP, which indicates that our algorithm is near-optimal w.r.t. timestep $T$, horizon $H$ and factored state-action subspace cardinality. Finally, as an application, we study a new formulation of constrained RL, known as RL with knapsack constraints (RLwK), and provides the first sample-efficient algorithm based on FMDP-BF.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2008.13319

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Vadori, Nelson, Ganesh, Sumitra, Reddy, Prashant, Veloso, Manuela

Risk-Sensitive Reinforcement Learning: a Martingale Approach to Reward Uncertainty

arXiv.org Machine LearningSep-15-2020

We introduce a novel framework to account for sensitivity to rewards uncertainty in sequential decision-making problems. While risk-sensitive formulations for Markov decision processes studied so far focus on the distribution of the cumulative reward as a whole, we aim at learning policies sensitive to the uncertain/stochastic nature of the rewards, which has the advantage of being conceptually more meaningful in some cases. To this end, we present a new decomposition of the randomness contained in the cumulative reward based on the Doob decomposition of a stochastic process, and introduce a new conceptual tool - the \textit{chaotic variation} - which can rigorously be interpreted as the risk measure of the martingale component associated to the cumulative reward process. We innovate on the reinforcement learning side by incorporating this new risk-sensitive approach into model-free algorithms, both policy gradient and value function based, and illustrate its relevance on grid world and portfolio optimization problems.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

doi: 10.1145/3383455.3422519

2006.12686

Genre: Research Report (0.50)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

arXiv.org Artificial IntelligenceSep-13-2020

A Survey of FPGA-Based Robotic Computing

Wan, Zishen, Yu, Bo, Li, Thomas Yuang, Tang, Jie, Zhu, Yuhao, Wang, Yu, Raychowdhury, Arijit, Liu, Shaoshan

Recent researches on robotics have shown significant improvement, spanning from algorithms, mechanics to hardware architectures. Robotics, including manipulators, legged robots, drones, and autonomous vehicles, are now widely applied in diverse scenarios. However, the high computation and data complexity of robotic algorithms pose great challenges to its applications. On the one hand, CPU platform is flexible to handle multiple robotic tasks. GPU platform has higher computational capacities and easy-touse development frameworks, so they have been widely adopted in several applications. On the other hand, FPGA-based robotic accelerators are becoming increasingly competitive alternatives, especially in latency-critical and power-limited scenarios. With specialized designed hardware logic and algorithm kernels, FPGA-based accelerators can surpass CPU and GPU in performance and energy efficiency. In this paper, we give an overview of previous work on FPGA-based robotic accelerators covering different stages of the robotic system pipeline. An analysis of software and hardware optimization techniques and main technical issues is presented, along with some commercial and space applications, to serve as a guide for future work. Therefore, the computation and storage complexity, as well as real-time and power constraints of the robotic system, Over the last decade, we have seen significant progress hinders its wide application in latency-critical or power-limited in the development of robotics, spanning from algorithms, scenarios [13]. Various robotic systems, like Therefore, it is essential to choose a proper compute platform manipulators, legged robots, unmanned aerial vehicles, selfdriving for the robotic system. CPU and GPU are two widely cars have been designed for search and rescue [1], [2], used commercial compute platforms. CPU is designed to exploration [3], [4], package delivery [5], entertainment [6], handle a wide range of tasks quickly and is often used to [7] and more applications and scenarios. These robots are develop novel algorithms. A typical CPU can achieve 10-on the rise of demonstrating their full potential. Take drones, 100 GFLOPS with below 1GOP/J power efficiency [14]. In a type of aerial robots, for example, the number of drones contrast, GPU is designed with thousands of processor cores has grown by 2.83x between 2015 and 2019 based on the running simultaneously, which enable massive parallelism. The typical GPU can perform up to 10 TOPS performance and registered number has reached 1.32 million in 2019, and the become a good candidate for high-performance scenarios. Recently, FFA expects this number will come to 1.59 billion by 2024.

algorithm, artificial intelligence, machine learning, (19 more...)

2009.06034

Country:

Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Europe > Switzerland (0.04)
Asia > Japan (0.04)
(8 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.47)

Industry:

Transportation > Air (1.00)
Semiconductors & Electronics (1.00)
Information Technology > Robotics & Automation (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Rajaraman, Nived, Yang, Lin F., Jiao, Jiantao, Ramachandran, Kannan

Toward the Fundamental Limits of Imitation Learning

arXiv.org Artificial IntelligenceSep-13-2020

Imitation learning (IL) aims to mimic the behavior of an expert policy in a sequential decision-making problem given only demonstrations. In this paper, we focus on understanding the minimax statistical limits of IL in episodic Markov Decision Processes (MDPs). We first consider the setting where the learner is provided a dataset of $N$ expert trajectories ahead of time, and cannot interact with the MDP. Here, we show that the policy which mimics the expert whenever possible is in expectation $\lesssim \frac{|\mathcal{S}| H^2 \log (N)}{N}$ suboptimal compared to the value of the expert, even when the expert follows an arbitrary stochastic policy. Here $\mathcal{S}$ is the state space, and $H$ is the length of the episode. Furthermore, we establish a suboptimality lower bound of $\gtrsim |\mathcal{S}| H^2 / N$ which applies even if the expert is constrained to be deterministic, or if the learner is allowed to actively query the expert at visited states while interacting with the MDP for $N$ episodes. To our knowledge, this is the first algorithm with suboptimality having no dependence on the number of actions, under no additional assumptions. We then propose a novel algorithm based on minimum-distance functionals in the setting where the transition model is given and the expert is deterministic. The algorithm is suboptimal by $\lesssim \min \{ H \sqrt{|\mathcal{S}| / N} ,\ |\mathcal{S}| H^{3/2} / N \}$, showing that knowledge of transition improves the minimax rate by at least a $\sqrt{H}$ factor.

learner, machine learning, reinforcement learning, (17 more...)

2009.0599

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (0.45)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)

arXiv.org Artificial IntelligenceSep-13-2020

Is there a role for statistics in artificial intelligence?

Friedrich, Sarah, Antes, Gerd, Behr, Sigrid, Binder, Harald, Brannath, Werner, Dumpert, Florian, Ickstadt, Katja, Kestler, Hans, Lederer, Johannes, Leitgöb, Heinz, Pauly, Markus, Steland, Ansgar, Wilhelm, Adalbert, Friede, Tim

The research on and application of artificial intelligence (AI) has triggered a comprehensive scientific, economic, social and political discussion. Here we argue that statistics, as an interdisciplinary scientific field, plays a substantial role both for the theoretical and practical understanding of AI and for its future development. Statistics might even be considered a core element of AI. With its specialist knowledge of data evaluation, starting with the precise formulation of the research question and passing through a study design stage on to analysis and interpretation of the results, statistics is a natural partner for other disciplines in teaching, research and practice. This paper aims at contributing to the current discussion by highlighting the relevance of statistical methodology in the context of AI development. In particular, we discuss contributions of statistics to the field of artificial intelligence concerning methodological development, planning and design of studies, assessment of data quality and data collection, differentiation of causality and associations and assessment of uncertainty in results. Moreover, the paper also deals with the equally necessary and meaningful extension of curricula in schools and universities.

data mining, machine learning, natural language, (19 more...)

2009.0907

Country:

Europe > Germany > Lower Saxony > Gottingen (0.14)
North America > United States > New York (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(13 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Government Relations & Public Policy (0.93)
Leisure & Entertainment > Games (0.93)
(6 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(6 more...)

arXiv.org Machine LearningSep-12-2020

Supervised Learning with Projected Entangled Pair States

Cheng, Song, Wang, Lei, Zhang, Pan

Tensor networks, a model that originated from quantum physics, has been gradually generalized as efficient models in machine learning in recent years. However, in order to achieve exact contraction, only tree-like tensor networks such as the matrix product states and tree tensor networks have been considered, even for modeling two-dimensional data such as images. In this work, we construct supervised learning models for images using the projected entangled pair states (PEPS), a two-dimensional tensor network having a similar structure prior to natural images. Our approach first performs a feature map, which transforms the image data to a product state on a grid, then contracts the product state to a PEPS with trainable parameters to predict image labels. The tensor elements of PEPS are trained by minimizing differences between training labels and predicted labels. The proposed model is evaluated on image classifications using the MNIST and the Fashion-MNIST datasets. We show that our model is significantly superior to existing models using tree-like tensor networks. Moreover, using the same input features, our method performs as well as the multilayer perceptron classifier, but with much fewer parameters and is more stable. Our results shed light on potential applications of two-dimensional tensor network models in machine learning.

artificial intelligence, machine learning, tensor network, (17 more...)

2009.09932

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Morad, Steven D., Mecca, Roberto, Poudel, Rudra P. K., Liwicki, Stephan, Cipolla, Roberto

Embodied Visual Navigation with Automatic Curriculum Learning in Real Environments

arXiv.org Artificial IntelligenceSep-11-2020

We present NavACL, a method of automatic curriculum learning tailored to the navigation task. NavACL is simple to train and efficiently selects relevant tasks using geometric features. In our experiments, deep reinforcement learning agents trained using NavACL in collision-free environments significantly outperform state-of-the-art agents trained with uniform sampling -- the current standard. Furthermore, our agents are able to navigate through unknown cluttered indoor environments to semantically-specified targets using only RGB images. Collision avoidance policies and frozen feature networks support transfer to unseen real-world environments, without any modification or retraining requirements. We evaluate our policies in simulation, and in the real world on a ground robot and a quadrotor drone. Videos of real-world results are available in the supplementary material

machine learning, navigation, reinforcement learning, (16 more...)

2009.05429

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Industry:

Leisure & Entertainment (0.46)
Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceSep-11-2020

Aggregating Long-Term Context for Learning Surgical Workflows

Ban, Yutong, Rosman, Guy, Ward, Thomas, Hashimoto, Daniel, Kondo, Taisei, Iwaki, Hidekazu, Meireles, Ozanan, Rus, Daniela

Analyzing surgical workflow is crucial for computers to understand surgeries. Deep learning techniques have recently been widely applied to recognize surgical workflows. Many of the existing temporal neural network models are limited in their capability to handle long-term dependencies in the data, instead of relying upon strong performance of the underlying per-frame visual models. We propose a new temporal network structure that leverages task-specific network representation to collect long-term sufficient statistics that are propagated by a sufficient statistics model (SSM). We leverage our approach within an LSTM back-bone for the task of surgical phase recognition and explore several choices for propagated statistics. We demonstrate superior results over existing state-of-the-art segmentation and novel segmentation techniques, on two laparoscopic cholecystectomy datasets: the already published Cholec80dataset and MGH100, a novel dataset with more challenging, yet clinically meaningful, segment labels.

artificial intelligence, information, machine learning, (16 more...)

2009.00681

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Surgery (0.71)
Health & Medicine > Diagnostic Medicine > Imaging (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Bortolussi, Luca, Gallo, Giuseppe Maria, Nenzi, Laura

A kernel function for Signal Temporal Logic formulae

arXiv.org Machine LearningSep-11-2020

Signal Temporal Logic (STL) [14] is gaining momentum as a requirement specification language for complex systems and, in particular, Cyber-Physical Systems [4]. STL has been applied in several flavours, from Runtime-monitoring [4] to control synthesis [10] and falsification problems [9], and recently also within learning algorithms, trying to find a maximally discriminating formula between sets of trajectories [5, 3, 16]. In these applications, a central role is played by the real-valued quantitative semantics [8], measuring robustness of satisfaction. Most of the applications of STL have been applied to deterministic (hybrid) systems, with less emphasis on non-deterministic or stochastic ones [2]. Another area in which formal methods are providing interesting tools is in logic-based distances between models, like bisimulation metrics for Markov models [1], which are typically based on a branching logic. In fact, extending these ideas to linear time logic is hard [7], and typically requires statistical approximations.

artificial intelligence, formulae, machine learning, (17 more...)

2009.05484

Country:

Europe > Austria > Vienna (0.14)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Quebec City (0.04)
(6 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)