Stochastic Gradient Hamiltonian Monte Carlo
Chen, Tianqi, Fox, Emily B., Guestrin, Carlos
Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals. The popularity of such methods has grown significantly in recent years. However, a limitation of HMC methods is the required gradient computation for simulation of the Hamiltonian dynamical system; such computation is infeasible in problems involving a large sample size or streaming data. Instead, we must rely on a noisy gradient estimate computed from a subset of the data. In this paper, we explore the properties of such a stochastic gradient HMC approach. Surprisingly, the natural implementation of the stochastic approximation can be arbitrarily bad. To address this problem we introduce a variant that uses second-order Langevin dynamics with a friction term that counteracts the effects of the noisy gradient, maintaining the desired target distribution as the invariant distribution. Results on simulated data validate our theory. We also provide an application of our methods to a classification task using neural networks and to online Bayesian matrix factorization.
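For intuition, here is a minimal sketch of the friction-corrected update described in the abstract, assuming a position variable theta, momentum r, stepsize eps, scalar friction C, and a scalar estimate Bhat of the gradient-noise variance; all names are illustrative, not the authors' code.

import numpy as np

def sghmc(grad_U_noisy, theta0, eps=0.01, C=1.0, Bhat=0.0, n_steps=5000):
    # Second-order Langevin dynamics: the friction term (eps * C * r)
    # counteracts the stochastic-gradient noise, whose variance is
    # estimated by Bhat, so the target remains the invariant distribution.
    theta = np.asarray(theta0, dtype=float)
    r = np.zeros_like(theta)
    samples = []
    for _ in range(n_steps):
        theta = theta + eps * r
        noise = np.sqrt(2.0 * eps * (C - Bhat)) * np.random.randn(*theta.shape)
        r = r - eps * grad_U_noisy(theta) - eps * C * r + noise
        samples.append(theta.copy())
    return np.array(samples)

# Example: sample from N(0, 1), whose potential is U(theta) = theta^2 / 2,
# using a deliberately noisy gradient estimate of grad U(theta) = theta.
draws = sghmc(lambda th: th + 0.1 * np.random.randn(*th.shape), theta0=[0.0])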
Policy Gradients for CVaR-Constrained MDPs
We study a risk-constrained version of the stochastic shortest path (SSP) problem, where the risk measure considered is Conditional Value-at-Risk (CVaR). We propose two algorithms that obtain a locally risk-optimal policy by employing four tools: stochastic approximation, mini-batches, policy gradients and importance sampling. Both algorithms incorporate a CVaR estimation procedure along the lines of Bardou et al. [2009], which in turn is based on the Rockafellar-Uryasev representation of CVaR, and use the likelihood ratio principle to estimate both the gradient of the sum of one cost function (the objective of the SSP) and the gradient of the CVaR of the sum of another cost function (in the constraint of the SSP). The algorithms differ in how they approximate the CVaR estimates and the necessary gradients: the first uses stochastic approximation, while the second employs mini-batches in the spirit of Monte Carlo methods. We establish asymptotic convergence of both algorithms. Further, since estimating CVaR is related to rare-event simulation, we incorporate an importance sampling based variance reduction scheme into our proposed algorithms.
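For reference, the Rockafellar-Uryasev representation invoked above expresses the CVaR at level $\alpha$ of a cost $X$ as the value of a convex scalar optimization, which is what makes stochastic-approximation and likelihood-ratio estimates tractable:

$$\text{CVaR}_\alpha(X) = \min_{\nu \in \mathbb{R}} \left\{ \nu + \frac{1}{1-\alpha}\, \mathbb{E}\big[(X - \nu)^+\big] \right\},$$

where $(x)^+ = \max(x, 0)$ and the minimizing $\nu$ is the Value-at-Risk $\text{VaR}_\alpha(X)$.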
Approximate Policy Iteration Schemes: A Comparison
We consider the infinite-horizon discounted optimal control problem formalized by Markov Decision Processes. We focus on several approximate variants of the Policy Iteration algorithm: Approximate Policy Iteration (API), Conservative Policy Iteration (CPI), a natural adaptation of the Policy Search by Dynamic Programming algorithm to the infinite-horizon case (PSDP$_\infty$), and the recently proposed Non-Stationary Policy Iteration (NSPI(m)). For all algorithms, we describe performance bounds and compare them, paying particular attention to the concentrability constants involved, the number of iterations, and the memory required. Our analysis highlights the following points: 1) The performance guarantee of CPI can be arbitrarily better than that of API/API($\alpha$), but this comes at the cost of a relative increase, exponential in $\frac{1}{\epsilon}$, in the number of iterations. 2) PSDP$_\infty$ enjoys the best of both worlds: its performance guarantee is similar to that of CPI, but it is attained within a number of iterations similar to that of API. 3) Contrary to API, which requires constant memory, the memory needed by CPI and PSDP$_\infty$ is proportional to their number of iterations, which may be problematic when the discount factor $\gamma$ is close to 1 or the approximation error $\epsilon$ is close to $0$; we show that the NSPI(m) algorithm allows an overall trade-off between memory and performance. Simulations with these schemes confirm our analysis.
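To make the memory trade-off in point 3 concrete: CPI (and similarly PSDP$_\infty$) builds its iterates as stochastic mixtures via the standard conservative update

$$\pi_{k+1} = (1-\alpha)\,\pi_k + \alpha\,\pi'_{k+1},$$

where $\pi'_{k+1}$ is an approximately greedy policy with respect to the value of $\pi_k$; unrolling the recursion shows that executing $\pi_{k+1}$ requires storing all previously computed greedy policies, hence memory proportional to the number of iterations.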
A Novel Method for Developing Robotics via Artificial Intelligence and Internet of Things
This paper describes a new methodology for developing and improving the field of robotics via artificial intelligence and the Internet of Things. Nowadays, artificial intelligence is bringing the world into robotics: almost all industries use robots for many kinds of work, including cooperative robots deployed across different tasks. However, building robots that can handle multiple tasks has remained a problem, so a new methodology for building multi-tasking robots is needed. This can be achieved by combining artificial intelligence with the Internet of Things.
Learning modular structures from network data and node variables
Azizi, Elham, Galagan, James E., Airoldi, Edoardo M.
A standard technique for understanding underlying dependency structures among a set of variables posits a shared conditional probability distribution for the variables measured on individuals within a group. This approach is often referred to as module networks, where individuals are represented by nodes in a network, groups are termed modules, and the focus is on estimating the network structure among modules. However, estimation solely from node-specific variables can lead to spurious dependencies, and unverifiable structural assumptions are often used for regularization. Here, we propose an extended model that leverages direct observations about the network in addition to node-specific variables. By integrating complementary data types, we avoid the need for structural assumptions. We illustrate the theoretical and practical significance of the model and develop a reversible-jump MCMC procedure for learning modules and model parameters. We demonstrate the method's accuracy in predicting modular structures from synthetic data, its capability to learn influence structures in Twitter data, and its ability to recover regulatory modules in the Mycobacterium tuberculosis gene regulatory network.
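As background on the learning procedure, a reversible-jump MCMC move between configurations of different dimension is accepted with the standard Green (1995) probability, written here generically rather than in the paper's specific notation:

$$a = \min\left\{1,\ \frac{p(\theta' \mid y)\, q(u' \mid \theta')}{p(\theta \mid y)\, q(u \mid \theta)}\, \left|\frac{\partial(\theta', u')}{\partial(\theta, u)}\right| \right\},$$

where $u, u'$ are auxiliary variables used to match dimensions and the final factor is the Jacobian of the deterministic dimension-matching map.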
Off-policy reinforcement learning for $H_\infty$ control design
Luo, Biao, Wu, Huai-Ning, Huang, Tingwen
The $H_\infty$ control design problem is considered for nonlinear systems with an unknown internal system model. It is known that the nonlinear $H_\infty$ control problem can be transformed into solving the so-called Hamilton-Jacobi-Isaacs (HJI) equation, a nonlinear partial differential equation that is generally impossible to solve analytically. Even worse, model-based approaches cannot be used to approximately solve the HJI equation when an accurate system model is unavailable or costly to obtain in practice. To overcome these difficulties, an off-policy reinforcement learning (RL) method is introduced to learn the solution of the HJI equation from real system data instead of a mathematical system model, and its convergence is proved. In the off-policy RL method, the system data can be generated with arbitrary policies rather than the policy being evaluated, which is extremely important and promising for practical systems. For implementation purposes, a neural network (NN) based actor-critic structure is employed, and a least-squares NN weight update algorithm is derived based on the method of weighted residuals. Finally, the developed NN-based off-policy RL method is tested on a linear F16 aircraft plant, and further applied to a rotational/translational actuator system.
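For concreteness, one standard form of the HJI equation mentioned above, for dynamics $\dot x = f(x) + g(x)u + k(x)w$ with penalty $h^\top h + u^\top u$ and disturbance attenuation level $\gamma$ (this generic statement is background, not the paper's exact formulation), is

$$0 = \nabla V^\top f(x) + h^\top h - \frac{1}{4}\nabla V^\top g\, g^\top \nabla V + \frac{1}{4\gamma^2}\nabla V^\top k\, k^\top \nabla V,$$

with associated saddle-point policies $u^* = -\frac{1}{2} g^\top \nabla V$ and $w^* = \frac{1}{2\gamma^2} k^\top \nabla V$; the off-policy RL method learns $V$ (and hence $u^*$) from data without requiring knowledge of $f$.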
Efficient Computation of the Well-Founded Semantics over Big Data
Tachmazidis, Ilias, Antoniou, Grigoris, Faber, Wolfgang
Data originating from the Web, sensor readings and social media result in increasingly huge datasets. So-called Big Data comes with new scientific and technological challenges while creating new opportunities, hence the increasing interest in academia and industry. Traditionally, logic programming has focused on complex knowledge structures/programs, so the question arises whether and how it can work in the face of Big Data. In this paper, we examine how the well-founded semantics can process huge amounts of data through mass parallelization. More specifically, we propose and evaluate a parallel approach using the MapReduce framework. Our experimental results indicate that our approach is scalable and that the well-founded semantics can be applied to billions of facts. To the best of our knowledge, this is the first work that addresses large-scale nonmonotonic reasoning without the restriction of stratification for predicates of arbitrary arity. To appear in Theory and Practice of Logic Programming (TPLP).
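As an illustration of the mass-parallelization idea (a minimal toy sketch in plain Python, not the paper's MapReduce implementation): a single rule-application step such as deriving path(X,Z) from edge(X,Y) and path(Y,Z) maps each fact to its join key and then joins matching facts in the reduce phase.

from collections import defaultdict

def map_phase(facts):
    # Key each fact by the shared join variable Y of the rule body.
    for pred, (a, b) in facts:
        if pred == "edge":
            yield (b, ("edge", a))   # edge(X, Y): join key is Y
        elif pred == "path":
            yield (a, ("path", b))   # path(Y, Z): join key is Y

def reduce_phase(grouped):
    # For each join key Y, combine edge(X, Y) with path(Y, Z) into path(X, Z).
    for y, values in grouped.items():
        edges = [a for tag, a in values if tag == "edge"]
        paths = [b for tag, b in values if tag == "path"]
        for x in edges:
            for z in paths:
                yield ("path", (x, z))

facts = [("edge", ("a", "b")), ("edge", ("b", "c")), ("path", ("b", "c"))]
grouped = defaultdict(list)
for key, val in map_phase(facts):
    grouped[key].append(val)
print(list(reduce_phase(grouped)))  # derives path('a', 'c')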
Structural Return Maximization for Reinforcement Learning
Joseph, Joshua, Velez, Javier, Roy, Nicholas
Batch Reinforcement Learning (RL) algorithms attempt to choose a policy from a designer-provided class of policies given a fixed set of training data. Choosing the policy which maximizes an estimate of return often leads to over-fitting when only limited data are available, due to the size of the policy class relative to the amount of data. In this work, we focus on learning policy classes that are appropriately sized for the amount of data available. We accomplish this by using the principle of Structural Risk Minimization, from Statistical Learning Theory, which uses Rademacher complexity to identify a policy class that maximizes a bound on the return of the best policy in the chosen policy class, given the available data. Unlike similar batch RL approaches, our bound on return requires only extremely weak assumptions on the true system.
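To convey the flavor of such a bound (a generic Rademacher-complexity statement for returns scaled to $[0,1]$, not the paper's exact theorem): with probability at least $1-\delta$ over $n$ trajectories, for every policy $\pi$ in a class $\Pi$,

$$R(\pi) \ \geq\ \hat R_n(\pi) - 2\,\mathfrak{R}_n(\Pi) - \sqrt{\frac{\ln(1/\delta)}{2n}},$$

where $\hat R_n$ is the empirical return and $\mathfrak{R}_n(\Pi)$ the Rademacher complexity of the policy class; Structural Risk Minimization then selects, among nested classes $\Pi_1 \subseteq \Pi_2 \subseteq \cdots$, the class whose bound is largest.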
A PAC-Bayesian Bound for Lifelong Learning
Pentina, Anastasia, Lampert, Christoph H.
Transfer learning has received a lot of attention in the machine learning community in recent years, and several effective algorithms have been developed. However, relatively little is known about their theoretical properties, especially in the setting of lifelong learning, where the goal is to transfer information to tasks for which no data have been observed so far. In this work we study lifelong learning from a theoretical perspective. Our main result is a PAC-Bayesian generalization bound that offers a unified view of existing paradigms for transfer learning, such as the transfer of parameters or the transfer of low-dimensional representations. We also use the bound to derive two principled lifelong learning algorithms, and we show that these yield results comparable with existing methods.
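For orientation, a classical single-task PAC-Bayesian bound (McAllester-style; the paper's result extends this type of statement to the lifelong setting, so the form below is background, not the paper's theorem) reads: with probability at least $1-\delta$, for all posteriors $\rho$,

$$\mathbb{E}_{h \sim \rho}\big[L(h)\big] \ \leq\ \mathbb{E}_{h \sim \rho}\big[\hat L(h)\big] + \sqrt{\frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln\frac{n}{\delta}}{2(n-1)}},$$

where $\pi$ is a data-independent prior, $\hat L$ the empirical loss on $n$ samples, and $L$ the expected loss.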
Functional Bandits
Tran-Thanh, Long, Yu, Jia Yuan
We introduce the functional bandit problem, where the objective is to find an arm that optimises a known functional of the unknown arm-reward distributions. Such problems arise in many settings, such as maximum entropy methods in natural language processing and risk-averse decision-making, but current best-arm identification techniques fail in these domains. We propose a new approach that combines functional estimation and arm elimination to tackle this problem. This method achieves provably efficient performance guarantees. In addition, we illustrate the method on a number of important functionals in risk management and information theory, and refine our generic theoretical results in those cases.
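A minimal sketch of the estimation-plus-elimination idea (illustrative only; the confidence width, rates, and names below are assumptions, not the authors' algorithm):

import numpy as np

def eliminate_arms(pull, n_arms, functional, rounds=20, pulls_per_round=200, delta=0.05):
    # pull(a) draws one reward from arm a; functional maps a sample array
    # to a scalar estimate (e.g. np.mean, an empirical entropy, or an empirical CVaR).
    active = list(range(n_arms))
    samples = {a: [] for a in active}
    for t in range(1, rounds + 1):
        for a in active:
            samples[a].extend(pull(a) for _ in range(pulls_per_round))
        n = len(samples[active[0]])
        est = {a: functional(np.array(samples[a])) for a in active}
        # Illustrative Hoeffding-style width; assumes the functional estimate
        # concentrates at rate O(1/sqrt(n)), which must be verified per functional.
        width = np.sqrt(np.log(2 * n_arms * t ** 2 / delta) / (2 * n))
        best = max(est.values())
        active = [a for a in active if est[a] + 2 * width >= best]
        if len(active) == 1:
            break
    return active

# Example: three Gaussian arms with functional = mean (reduces to best-arm identification).
arms = [0.1, 0.5, 0.9]
print(eliminate_arms(lambda a: np.random.normal(arms[a], 1.0), 3, np.mean))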