AITopics

2003.07775

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany > Baden-Württemberg > Freiburg (0.05)
North America > United States > Colorado > Denver County > Denver (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)

Bai, Qinbo, Gattami, Ather, Aggarwal, Vaneet

Model-Free Algorithm and Regret Analysis for MDPs with Peak Constraints

arXiv.org Machine Learning Mar-11-2020

In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a constrained Markov Decision Process (MDP). This paper considers a model-free approach to the problem, where the transition probabilities are not known. In the presence of peak constraints, the agent has to choose a policy that maximizes the long-term average reward while satisfying the constraints at each time step. We propose modifications to the standard Q-learning algorithm for unconstrained optimization to obtain an algorithm that handles peak constraints. The proposed algorithm is shown to achieve an $O(T^{1/2+\gamma})$ regret bound for the obtained reward and an $O(T^{1-\gamma})$ regret bound for the constraint violation, for any $\gamma \in (0,1/2)$ and time horizon $T$. We note that these are the first results on regret analysis for constrained MDPs where the transition probabilities are not known a priori. We demonstrate the proposed algorithm on an energy harvesting problem, where it outperforms the state of the art and performs close to the theoretical upper bound of the studied optimization problem.
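
As a worked reading of the two bounds quoted above (an illustration, not a result stated in the abstract): the two exponents trade off against each other, and equating them shows which $\gamma$ balances reward regret against constraint violation.

```latex
\frac{1}{2} + \gamma = 1 - \gamma
\;\Longrightarrow\; \gamma = \frac{1}{4},
\qquad
O\!\left(T^{1/2+\gamma}\right) = O\!\left(T^{1-\gamma}\right) = O\!\left(T^{3/4}\right).
```

Choosing $\gamma$ closer to $0$ pushes the reward bound toward $O(T^{1/2})$ and the violation bound toward $O(T)$; choosing $\gamma$ closer to $1/2$ does the reverse.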

algorithm, constraint, constraint violation, (15 more...)

2003.05555

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Energy > Energy Storage (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Loper, Jackson, Blei, David, Cunningham, John P., Paninski, Liam

General linear-time inference for Gaussian Processes on one dimension

arXiv.org Machine Learning Mar-11-2020

Gaussian Processes (GPs) provide a powerful probabilistic framework for interpolation, forecasting, and smoothing, but have been hampered by computational scaling issues. Here we prove that for data sampled on one dimension (e.g., a time series sampled at arbitrarily-spaced intervals), approximate GP inference at any desired level of accuracy requires computational effort that scales linearly with the number of observations; this new theorem enables inference on much larger datasets than was previously feasible. To achieve this improved scaling we propose a new family of stationary covariance kernels: the Latent Exponentially Generated (LEG) family, which admits a convenient stable state-space representation that allows linear-time inference. We prove that any continuous integrable stationary kernel can be approximated arbitrarily well by some member of the LEG family. The proof draws connections to Spectral Mixture Kernels, providing new insight about the flexibility of this popular family of kernels. We propose parallelized algorithms for performing inference and learning in the LEG model, test the algorithm on real and synthetic data, and demonstrate scaling to datasets with billions of samples.
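
To make the linear-time claim concrete, here is a minimal sketch for the simplest stationary kernel with a state-space form, the exponential (Ornstein-Uhlenbeck) kernel $k(\tau) = \sigma^2 e^{-|\tau|/\ell}$. This is not the LEG family from the abstract; it is a well-known special case used here as an assumption-laden illustration. The function name `ou_loglik_kalman` and its parameters are hypothetical. A scalar Kalman filter visits each observation once, so the log marginal likelihood costs $O(n)$ instead of the $O(n^3)$ of a dense solve.

```python
import math


def ou_loglik_kalman(times, values, sigma2, ell, noise):
    """Log marginal likelihood of noisy observations of a zero-mean GP with
    kernel sigma2 * exp(-|dt| / ell), computed in O(n) by a Kalman filter.

    `times` must be sorted in increasing order; `noise` is the observation
    noise variance. All names here are illustrative assumptions.
    """
    m, p = 0.0, sigma2          # stationary prior mean and variance of the latent state
    loglik = 0.0
    prev_t = times[0]
    for t, y in zip(times, values):
        # Predict across the (possibly irregular) gap since the last sample.
        a = math.exp(-(t - prev_t) / ell)
        m = a * m
        p = a * a * p + sigma2 * (1.0 - a * a)
        # Score the new observation under its one-step-ahead predictive density.
        s = p + noise
        resid = y - m
        loglik -= 0.5 * (math.log(2.0 * math.pi * s) + resid * resid / s)
        # Condition the latent state on the observation.
        gain = p / s
        m += gain * resid
        p *= 1.0 - gain
        prev_t = t
    return loglik


if __name__ == "__main__":
    # Arbitrarily spaced sample times, as in the abstract's setting.
    print(ou_loglik_kalman([0.0, 0.7, 0.9, 2.5], [0.3, -1.2, -0.8, 0.4], 2.0, 1.5, 0.1))
```

The loop body does constant work per observation, which is where the linear scaling comes from; richer kernels need a vector-valued state but keep the same structure.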

inference, kernel, matrix, (15 more...)

2003.05554

Country:

North America > United States > California > San Diego County > La Jolla (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii (0.04)
(2 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Jamieson, Stewart, How, Jonathan P., Girdhar, Yogesh

Active Reward Learning for Co-Robotic Vision Based Exploration in Bandwidth Limited Environments

arXiv.org Artificial Intelligence Mar-10-2020

We present a novel POMDP problem formulation for a robot that must autonomously decide where to go to collect new and scientifically relevant images given a limited ability to communicate with its human operator. From this formulation we derive constraints and design principles for the observation model, reward model, and communication strategy of such a robot, exploring techniques to deal with the very high-dimensional observation space and scarcity of relevant training data. We introduce a novel active reward learning strategy based on making queries to help the robot minimize path "regret" online, and evaluate it for suitability in autonomous visual exploration through simulations. We demonstrate that, in some bandwidth-limited environments, this novel regret-based criterion enables the robotic explorer to collect up to 17% more reward per mission than the next-best criterion.

exploration, robot, trajectory, (15 more...)

2003.05016

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(8 more...)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Ying, Haoyun, Mao, Keheng, Mosegaard, Klaus

Moving Target Monte Carlo

arXiv.org Machine Learning Mar-10-2020

Markov Chain Monte Carlo (MCMC) methods are popular for sampling from a high-dimensional random variable $\mathbf{x}$ with a possibly unnormalised probability density $p$ and observed data $\mathbf{d}$. However, MCMC requires evaluating the posterior distribution $p(\mathbf{x}|\mathbf{d})$ of the proposed candidate $\mathbf{x}$ at each iteration when constructing the acceptance rate. This is costly when such evaluations are intractable. In this paper, we introduce a new non-Markovian sampling algorithm called Moving Target Monte Carlo (MTMC). The acceptance rate at the $n$-th iteration is constructed using an iteratively updated approximation $a_n(\mathbf{x})$ of the posterior distribution instead of $p(\mathbf{x}|\mathbf{d})$. The true value of the posterior $p(\mathbf{x}|\mathbf{d})$ is only calculated if the candidate $\mathbf{x}$ is accepted. The approximation $a_n$ utilises these evaluations and converges to $p$ as $n \rightarrow \infty$. A proof of convergence and estimates of the convergence rate in different situations are given.
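
The control flow described above (decide with a cheap approximation, pay for the true posterior only on acceptance) can be sketched as a toy loop. The abstract does not say how $a_n$ is built or exactly how the acceptance rate combines $a_n$ with known values, so everything specific here is an assumption: the nearest-neighbour surrogate, the use of the current state's stored true value in the ratio, the one-dimensional Gaussian stand-in target, and the names `mtmc_sketch`, `log_target` and `surrogate`. No convergence claim is made for this toy version.

```python
import math
import random


def log_target(x):
    # Stand-in for an expensive unnormalised log-posterior (standard normal here).
    return -0.5 * x * x


def mtmc_sketch(n_iter=500, step=1.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    log_p_x = log_target(x)
    evaluated = [(x, log_p_x)]   # points where the true log-density is known
    true_evals, accepted = 1, 0
    samples = []

    def surrogate(y):
        # ASSUMPTION: a_n(y) is the stored value at the nearest evaluated point.
        return min(evaluated, key=lambda pair: abs(pair[0] - y))[1]

    for _ in range(n_iter):
        cand = x + rng.gauss(0.0, step)
        # Accept or reject using only the cheap surrogate for the candidate.
        accept_prob = math.exp(min(0.0, surrogate(cand) - log_p_x))
        if rng.random() < accept_prob:
            # Pay for the true evaluation only after acceptance,
            # and feed it back into the surrogate's memory.
            x, log_p_x = cand, log_target(cand)
            evaluated.append((x, log_p_x))
            true_evals += 1
            accepted += 1
        samples.append(x)
    return samples, true_evals, accepted


if __name__ == "__main__":
    samples, true_evals, accepted = mtmc_sketch()
    print(len(samples), true_evals, accepted)
```

In this layout `true_evals` always equals `accepted + 1`, so rejected proposals never trigger a target evaluation, which is the saving the abstract points to.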

approximation, approximation distribution, probability, (14 more...)

2003.04873

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Narvekar, Sanmit, Peng, Bei, Leonetti, Matteo, Sinapov, Jivko, Taylor, Matthew E., Stone, Peter

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

arXiv.org Artificial Intelligence Mar-10-2020

Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback. Despite many advances over the past three decades, learning in many domains still requires a large amount of interaction with the environment, which can be prohibitively expensive in realistic scenarios. To address this problem, transfer learning has been applied to reinforcement learning such that experience gained in one task can be leveraged when starting to learn the next, harder task. More recently, several lines of research have explored how tasks, or data samples themselves, can be sequenced into a curriculum for the purpose of learning a problem that may otherwise be too difficult to learn from scratch. In this article, we present a framework for curriculum learning (CL) in reinforcement learning, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals. Finally, we use our framework to find open problems and suggest directions for future RL curriculum learning research.

agent, curriculum, learning, (14 more...)

2003.0496

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(8 more...)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.87)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education > Educational Setting (0.68)
Education > Educational Technology > Educational Software > Computer Based Training (0.67)
Government > Regional Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
(2 more...)

Tamari, Ronen, Stanovsky, Gabriel, Shahaf, Dafna, Tsarfaty, Reut

Ecological Semantics: Programming Environments for Situated Language Understanding

arXiv.org Artificial Intelligence Mar-10-2020

Large-scale natural language understanding (NLU) systems have made impressive progress: they can be applied flexibly across a variety of tasks, and employ minimal structural assumptions. However, extensive empirical research has shown this to be a double-edged sword, coming at the cost of shallow understanding: inferior generalization, grounding and explainability. Grounded language learning approaches offer the promise of deeper understanding by situating learning in richer, more structured training environments, but are limited in scale to relatively narrow, predefined domains. How might we enjoy the best of both worlds: grounded, general NLU? Following extensive contemporary cognitive science, we propose treating environments as "first-class citizens" in semantic representations, worthy of research and development in their own right. Importantly, models should also be partners in the creation and configuration of environments, rather than just actors within them, as in existing approaches. To do so, we argue that models must begin to understand and program in the language of affordances (which define possible actions in a given situation), both for online, situated discourse comprehension and for large-scale, offline common-sense knowledge mining. To this end we propose an environment-oriented ecological semantics, outlining theoretical and practical approaches towards implementation. We further provide actual demonstrations building upon interactive fiction programming languages.

bridging ai and cognitive science, knowledge, representation, (12 more...)

2003.04567

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
North America > United States > California (0.04)
(4 more...)

Genre: Research Report (0.41)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Biza, Ondrej, Platt, Robert, van de Meent, Jan-Willem, Wong, Lawson L. S.

Learning discrete state abstractions with deep variational inference

arXiv.org Machine Learning Mar-9-2020

Abstraction is crucial for effective sequential decision making in domains with large state spaces. In this work, we propose a variational information bottleneck method for learning approximate bisimulations, a type of state abstraction. We use a deep neural net encoder to map states onto continuous embeddings. The continuous latent space is then compressed into a discrete representation using an action-conditioned hidden Markov model, which is trained end-to-end with the neural network. Our method is suited for environments with high-dimensional states and learns from a stream of experience collected by an agent acting in a Markov decision process. Through a learned discrete abstract model, we can efficiently plan for unseen goals in a multi-goal Reinforcement Learning setting. We test our method in simplified robotic manipulation domains with image states. We also compare it against previous model-based approaches to finding bisimulations in discrete grid-world-like environments.

abstraction, bisimulation, state abstraction, (15 more...)

2003.043

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Zhang, Haotian, Sun, Jianyong, Xu, Zongben

Learning to be Global Optimizer

arXiv.org Artificial Intelligence Mar-9-2020

The advancement of artificial intelligence has cast a new light on the development of optimization algorithms. This paper proposes to learn a two-phase global optimization algorithm (a minimization phase followed by an escaping phase) for smooth non-convex functions. For the minimization phase, a model-driven deep learning method is developed to learn the update rule of the descent direction, which is formalized as a nonlinear combination of historical information, for convex functions. We prove that the resultant algorithm with the proposed adaptive direction guarantees convergence for convex functions. An empirical study shows that the learned algorithm significantly outperforms some well-known classical optimization algorithms, such as gradient descent, conjugate descent and BFGS, and performs well on ill-posed functions. The phase of escaping from a local optimum is modeled as a Markov decision process with a fixed escaping policy. We further propose to learn an optimal escaping policy by reinforcement learning. The effectiveness of the escaping policies is verified by optimizing synthesized functions and training a deep neural network for CIFAR image classification. The learned two-phase global optimization algorithm demonstrates a promising global search capability on some benchmark functions and machine learning tasks.

algorithm, local minimizer, optimization algorithm, (12 more...)

2003.04521

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Patra, Sunandita, Mason, James, Kumar, Amit, Ghallab, Malik, Traverso, Paolo, Nau, Dana

Integrating Acting, Planning and Learning in Hierarchical Operational Models

arXiv.org Artificial Intelligence Mar-9-2020

We present new planning and learning algorithms for RAE, the Refinement Acting Engine. RAE uses hierarchical operational models to perform tasks in dynamically changing environments. Our planning procedure, UPOM, does a UCT-like search in the space of operational models in order to find a near-optimal method to use for the task and context at hand. Our learning strategies acquire, from online acting experiences and/or simulated planning results, a mapping from decision contexts to method instances as well as a heuristic function to guide UPOM. Our experimental results show that UPOM and our learning strategies significantly improve RAE's performance in four test domains using two different metrics: efficiency and success ratio.
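
For readers unfamiliar with the "UCT-like search" mentioned above: standard UCT picks among alternatives with the UCB1 rule, which adds an exploration bonus to each option's average reward. The sketch below shows that standard rule only; it is not UPOM's actual procedure, and the function name `ucb1_select`, the `stats` layout and the method labels are hypothetical.

```python
import math


def ucb1_select(stats, c=math.sqrt(2.0)):
    """Pick an option by the UCB1 rule used in standard UCT.

    `stats` maps each option to (visit_count, total_reward).
    Options that were never tried are returned first.
    """
    total_visits = sum(n for n, _ in stats.values())
    best_option, best_score = None, -math.inf
    for option, (n, reward_sum) in stats.items():
        if n == 0:
            return option  # try every option at least once
        # Average reward plus an exploration bonus that shrinks with visits.
        score = reward_sum / n + c * math.sqrt(math.log(total_visits) / n)
        if score > best_score:
            best_option, best_score = option, score
    return best_option


if __name__ == "__main__":
    # Hypothetical method instances competing for the same task.
    print(ucb1_select({"method_a": (10, 9.0), "method_b": (10, 1.0)}))
```

The constant `c` controls how strongly rarely tried options are favoured; larger values explore more before settling on the option with the best average.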

rae, raeplan, upom, (16 more...)

2003.03932

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)