Collaborating Authors

 Jaillet, Patrick


Gaussian Process Planning with Lipschitz Continuous Reward Functions: Towards Unifying Bayesian Optimization, Active Learning, and Beyond

AAAI Conferences

This paper presents a novel nonmyopic adaptive Gaussian process planning (GPP) framework endowed with a general class of Lipschitz continuous reward functions that can unify some active learning/sensing and Bayesian optimization criteria and offer practitioners flexibility in specifying their desired choices for defining new tasks/problems. In particular, it utilizes a principled Bayesian sequential decision problem framework for jointly and naturally optimizing the exploration-exploitation trade-off. In general, the resulting induced GPP policy cannot be derived exactly due to an uncountable set of candidate observations. A key contribution of our work here thus lies in exploiting the Lipschitz continuity of the reward functions to solve for a nonmyopic adaptive epsilon-optimal GPP (epsilon-GPP) policy. To plan in real time, we further propose an asymptotically optimal, branch-and-bound anytime variant of epsilon-GPP with a performance guarantee. We empirically demonstrate the effectiveness of our epsilon-GPP policy and its anytime variant in Bayesian optimization and an energy harvesting task.
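
As a rough illustration of the planning problem above, the following minimal Python sketch (not the authors' epsilon-GPP algorithm or its branch-and-bound variant) runs a short nonmyopic lookahead over a GP: the uncountable set of candidate observations at each stage is replaced by a small grid around the posterior mean, the kind of discretization whose error can be bounded when the reward is Lipschitz continuous. The kernel, reward, grid size, and data are all illustrative.

    import numpy as np

    def rbf(A, B, ls=1.0, var=1.0):
        # Squared-exponential kernel between row-wise input sets A and B.
        d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return var * np.exp(-0.5 * d2 / ls ** 2)

    def gp_posterior(X, y, Xs, noise=1e-2):
        # Standard GP posterior mean and variance at test inputs Xs.
        K = rbf(X, X) + noise * np.eye(len(X))
        Ks, Kss = rbf(X, Xs), rbf(Xs, Xs)
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
        v = np.linalg.solve(L, Ks)
        return Ks.T @ alpha, np.maximum(np.diag(Kss) - np.sum(v ** 2, axis=0), 1e-12)

    def lookahead_value(X, y, actions, reward, horizon, grid_size=9):
        # Value of acting optimally for `horizon` more steps, where the expectation
        # over the unobserved measurement is approximated on a finite grid of
        # posterior-mean +/- 3 std points.
        if horizon == 0:
            return 0.0
        best = -np.inf
        for a in actions:
            mu, var = gp_posterior(X, y, a[None, :])
            sd = np.sqrt(var[0])
            zs = np.linspace(mu[0] - 3 * sd, mu[0] + 3 * sd, grid_size)
            w = np.exp(-0.5 * ((zs - mu[0]) / sd) ** 2)
            w /= w.sum()
            val = 0.0
            for z, p in zip(zs, w):
                X2, y2 = np.vstack([X, a]), np.append(y, z)
                rest = [b for b in actions if not np.allclose(b, a)]
                val += p * (reward(z) + lookahead_value(X2, y2, rest, reward,
                                                        horizon - 1, grid_size))
            best = max(best, val)
        return best

    # Toy 1-D example with a Bayesian-optimization-flavoured reward (the sampled
    # output value itself) and a 2-step lookahead over 3 candidate inputs.
    X = np.array([[0.1], [0.9]]); y = np.array([0.2, 0.7])
    actions = [np.array([x]) for x in (0.3, 0.5, 0.7)]
    print(lookahead_value(X, y, actions, reward=lambda z: z, horizon=2))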


Inverse Reinforcement Learning with Locally Consistent Reward Functions

Neural Information Processing Systems

Existing inverse reinforcement learning (IRL) algorithms have assumed each expert's demonstrated trajectory to be produced by only a single reward function. This paper presents a novel generalization of the IRL problem that allows each trajectory to be generated by multiple locally consistent reward functions, hence catering to more realistic and complex experts' behaviors. Solving our generalized IRL problem thus involves not only learning these reward functions but also the stochastic transitions between them at any state (including unvisited states). By representing our IRL problem with a probabilistic graphical model, an expectation-maximization (EM) algorithm can be devised to iteratively learn the different reward functions and the stochastic transitions between them in order to jointly improve the likelihood of the expert's demonstrated trajectories. As a result, the most likely partition of a trajectory into segments that are generated from different locally consistent reward functions selected by EM can be derived. Empirical evaluation on synthetic and real-world datasets shows that our IRL algorithm outperforms the state-of-the-art EM clustering with maximum likelihood IRL, which is, interestingly, a reduced variant of our approach.
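
The following minimal Python sketch shows only the EM alternation described above, with an isotropic Gaussian likelihood over per-segment feature vectors standing in for the Boltzmann trajectory likelihood of maximum-likelihood IRL; it is not the authors' algorithm, and all names and data are illustrative. The E-step softly assigns each segment to a reward function, and the M-step re-estimates each reward function's parameters from its assigned segments.

    import numpy as np

    def em_segment_rewards(F, K, iters=50, seed=0):
        # F: (num_segments, num_features) feature expectations of each segment.
        rng = np.random.default_rng(seed)
        n = len(F)
        theta = F[rng.choice(n, K, replace=False)]   # one parameter vector per reward function
        pi = np.full(K, 1.0 / K)                     # prior over reward functions
        for _ in range(iters):
            # E-step: responsibility of reward function k for each segment.
            logp = -0.5 * np.sum((F[:, None, :] - theta[None, :, :]) ** 2, axis=-1)
            logp += np.log(pi)[None, :]
            logp -= logp.max(axis=1, keepdims=True)
            resp = np.exp(logp)
            resp /= resp.sum(axis=1, keepdims=True)
            # M-step: re-fit each reward function on its softly assigned segments
            # (in the full method this would be a maximum-likelihood IRL update).
            Nk = resp.sum(axis=0) + 1e-12
            theta = (resp.T @ F) / Nk[:, None]
            pi = Nk / n
        return theta, resp

    # Two synthetic groups of segments, recovered as two "reward functions".
    rng = np.random.default_rng(1)
    F = np.vstack([rng.standard_normal((20, 3)) + 2, rng.standard_normal((20, 3)) - 2])
    theta, resp = em_segment_rewards(F, K=2)
    print(resp.argmax(axis=1))   # most likely partition of the segments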


Dynamic Redeployment to Counter Congestion or Starvation in Vehicle Sharing Systems

AAAI Conferences

Vehicle sharing systems (e.g., bike sharing, car sharing), an attractive alternative to private transportation, are widely adopted in major cities around the world. In such systems, base stations (e.g., docking stations for bikes) are strategically placed throughout a city, and each base station contains a pre-determined number of vehicles at the beginning of each day. Due to the stochastic and individualistic movement of customers, there is typically either congestion (more vehicles than required) or starvation (fewer vehicles than required) at certain base stations, which causes a significant loss in demand. We propose to dynamically redeploy idle vehicles using carriers so as to minimize lost demand or, alternatively, maximize revenue for the vehicle sharing company. To that end, we contribute an optimization formulation to jointly address the redeployment (of vehicles) and routing (of carriers) problems and provide two approaches that rely on decomposability and abstraction of problem domains to reduce the computation time significantly.
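
As a toy illustration of the redeployment part only (a single time step, vehicles moved before demand arrives, fractional flows allowed, and no carrier routing), the following Python sketch sets up a small linear program that trades off movement cost against a lost-demand penalty; the station data and costs are made up, and this is not the paper's joint formulation.

    import numpy as np
    from scipy.optimize import linprog

    stock  = np.array([10.0, 2.0, 6.0])    # vehicles currently at each station
    demand = np.array([4.0, 8.0, 5.0])     # expected demand at each station
    move_cost = np.array([[0.0, 1.0, 2.0],
                          [1.0, 0.0, 1.0],
                          [2.0, 1.0, 0.0]])
    lost_penalty = 10.0
    N = len(stock)

    # Variables: x[i, j] = vehicles moved from i to j (N*N), then l[i] = lost demand (N).
    c = np.concatenate([move_cost.ravel(), np.full(N, lost_penalty)])
    A, b = [], []
    for i in range(N):
        # Lost demand: outflow_i - inflow_i - l_i <= stock_i - demand_i.
        row = np.zeros(N * N + N)
        for j in range(N):
            row[i * N + j] += 1.0    # outflow i -> j
            row[j * N + i] -= 1.0    # inflow  j -> i
        row[N * N + i] = -1.0
        A.append(row); b.append(stock[i] - demand[i])
        # Cannot move out more vehicles than are currently at the station.
        cap = np.zeros(N * N + N)
        cap[i * N:(i + 1) * N] = 1.0
        A.append(cap); b.append(stock[i])
    bounds = [(0, 0) if i == j else (0, None) for i in range(N) for j in range(N)]
    bounds += [(0, None)] * N

    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b), bounds=bounds, method="highs")
    moves = res.x[:N * N].reshape(N, N)
    print("moves:\n", moves.round(2))
    print("lost demand:", res.x[N * N:].round(2))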


Parallel Gaussian Process Regression for Big Data: Low-Rank Representation Meets Markov Approximation

AAAI Conferences

The expressive power of a Gaussian process (GP) model comes at the cost of poor scalability in the data size. To improve its scalability, this paper presents a low-rank-cum-Markov approximation (LMA) of the GP model that is novel in leveraging the dual computational advantages stemming from complementing a low-rank approximate representation of the full-rank GP, based on a support set of inputs, with a Markov approximation of the resulting residual process; the latter approximation is guaranteed to be closest in the Kullback-Leibler distance criterion subject to some constraint and is considerably more refined than that of existing sparse GP models utilizing low-rank representations due to its more relaxed conditional independence assumption (especially with larger data). As a result, our LMA method can trade off between the size of the support set and the order of the Markov property to (a) incur lower computational cost than such sparse GP models while achieving comparable predictive performance and (b) accurately represent features/patterns of any scale. Interestingly, varying the Markov order produces a spectrum of LMAs with the PIC approximation and the full-rank GP at the two extremes. An advantage of our LMA method is that it is amenable to parallelization on multiple machines/cores, thereby gaining greater scalability. Empirical evaluation on three real-world datasets in clusters of up to 32 computing nodes shows that our centralized and parallel LMA methods are significantly more time-efficient and scalable than state-of-the-art sparse and full-rank GP regression methods while achieving comparable predictive performances.
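
The following minimal Python sketch shows only the low-rank building block that LMA refines, a subset-of-regressors/DTC-style predictor based on a support set of inputs; the Markov approximation of the residual process and the parallelization are omitted, and the kernel, support-set choice, and noise level are illustrative.

    import numpy as np

    def rbf(A, B, ls=1.0, var=1.0):
        d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return var * np.exp(-0.5 * d2 / ls ** 2)

    def low_rank_gp_predict(X, y, U, Xs, noise=0.1):
        # U: support set of inputs; cost is O(n m^2) for m = |U| instead of O(n^3).
        Kuu = rbf(U, U) + 1e-8 * np.eye(len(U))
        Kuf, Kus = rbf(U, X), rbf(U, Xs)
        Sigma = np.linalg.inv(Kuu + Kuf @ Kuf.T / noise)
        mu = Kus.T @ Sigma @ Kuf @ y / noise
        qss = np.sum(Kus * np.linalg.solve(Kuu, Kus), axis=0)            # diag of Q_**
        var = rbf(Xs, Xs).diagonal() - qss + np.sum(Kus * (Sigma @ Kus), axis=0)
        return mu, np.maximum(var, 0.0)

    # Toy 1-D data: 500 training points summarized through 20 support inputs.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, (500, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
    U = X[rng.choice(500, 20, replace=False)]
    Xs = np.linspace(0, 10, 5)[:, None]
    mu, var = low_rank_gp_predict(X, y, U, Xs)
    print(mu.round(2), var.round(3))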


Solving Uncertain MDPs with Objectives that Are Separable over Instantiations of Model Uncertainty

AAAI Conferences

Markov Decision Problems (MDPs) offer an effective mechanism for planning under uncertainty. However, due to unavoidable uncertainty over models, it is difficult to obtain an exact specification of an MDP. We are interested in solving MDPs where the transition and reward functions are not exactly specified. Existing research has primarily focused on computing infinite horizon stationary policies when optimizing robustness, regret, and percentile-based objectives. We focus specifically on finite horizon problems with a special emphasis on objectives that are separable over individual instantiations of model uncertainty (i.e., objectives that can be expressed as a sum over instantiations of model uncertainty): (a) First, we identify two separable objectives for uncertain MDPs: Average Value Maximization (AVM) and Confidence Probability Maximization (CPM). (b) Second, we provide optimization-based solutions to compute policies for uncertain MDPs with such objectives. In particular, we exploit the separability of the AVM and CPM objectives by employing Lagrangian dual decomposition (LDD). (c) Finally, we demonstrate the utility of the LDD approach on a benchmark problem from the literature.
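
As a small illustration of the separability being exploited (not the LDD algorithm itself), the following Python sketch represents model uncertainty by sampled instantiations of the transition and reward functions and evaluates the AVM objective for a fixed finite-horizon policy as a plain average of per-instantiation values, each computed by backward induction; the MDP sizes and models are made up.

    import numpy as np

    def policy_value(T, R, policy, horizon, s0=0):
        # Exact finite-horizon evaluation of a fixed policy by backward induction.
        S = T.shape[0]
        V = np.zeros(S)
        for t in reversed(range(horizon)):
            a = policy[t]                                # action per state at step t
            V = R[np.arange(S), a] + T[np.arange(S), a] @ V
        return V[s0]

    def average_value(models, policy, horizon):
        # AVM objective: a plain average over the sampled model instantiations,
        # which is what makes it amenable to decomposition across instantiations.
        return np.mean([policy_value(T, R, policy, horizon) for T, R in models])

    # Q sampled instantiations of a 4-state, 2-action, horizon-5 MDP and a fixed policy.
    rng = np.random.default_rng(0)
    S, A, H, Q = 4, 2, 5, 10
    def sample_model():
        T = rng.random((S, A, S)); T /= T.sum(axis=-1, keepdims=True)
        return T, rng.random((S, A))
    models = [sample_model() for _ in range(Q)]
    policy = rng.integers(0, A, size=(H, S))
    print(average_value(models, policy, H))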


Parallel Gaussian Process Regression with Low-Rank Covariance Matrix Approximations

arXiv.org Machine Learning

Gaussian processes (GPs) are Bayesian non-parametric models that are widely used for probabilistic regression. Unfortunately, they cannot scale well to large data nor perform real-time predictions due to their cubic time cost in the data size. This paper presents two parallel GP regression methods that exploit low-rank covariance matrix approximations for distributing the computational load among parallel machines to achieve time efficiency and scalability. We theoretically guarantee the predictive performances of our proposed parallel GPs to be equivalent to those of some centralized approximate GP regression methods: the computation of their centralized counterparts can be distributed among parallel machines, hence achieving greater time efficiency and scalability. We analytically compare the properties of our parallel GPs such as time, space, and communication complexity. Empirical evaluation on two real-world datasets in a cluster of 20 computing nodes shows that our parallel GPs are significantly more time-efficient and scalable than their centralized counterparts and the exact/full GP while achieving predictive performances comparable to the full GP.
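
The following minimal Python sketch illustrates why such low-rank approximations parallelize well: with a subset-of-regressors-style approximation, the statistics needed for prediction are sums of small per-block summaries, so each data block's contribution could be computed on a separate machine and only m x m and m x 1 quantities communicated. The map step is simulated here with an ordinary loop, and all kernel and data choices are illustrative; this is not the paper's exact construction.

    import numpy as np

    def rbf(A, B, ls=1.0):
        return np.exp(-0.5 * np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1) / ls ** 2)

    def local_summary(U, Xb, yb, noise):
        # Computed independently by the machine holding data block (Xb, yb);
        # only these small summaries need to be communicated.
        Kub = rbf(U, Xb)
        return Kub @ Kub.T / noise, Kub @ yb / noise       # (m, m) and (m,)

    def parallel_low_rank_gp_mean(X_blocks, y_blocks, U, Xs, noise=0.1):
        m = len(U)
        A = rbf(U, U) + 1e-8 * np.eye(m)
        b = np.zeros(m)
        for Xb, yb in zip(X_blocks, y_blocks):             # "reduce": sum the block summaries
            Ab, bb = local_summary(U, Xb, yb, noise)
            A += Ab; b += bb
        return rbf(Xs, U) @ np.linalg.solve(A, b)          # predictive mean at test inputs

    # Four data blocks standing in for four machines.
    rng = np.random.default_rng(1)
    X = rng.uniform(0, 10, (400, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(400)
    U = X[rng.choice(400, 15, replace=False)]
    Xs = np.linspace(0, 10, 5)[:, None]
    print(parallel_low_rank_gp_mean(np.split(X, 4), np.split(y, 4), U, Xs).round(2))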


Decentralized Data Fusion and Active Sensing with Mobile Sensors for Modeling and Predicting Spatiotemporal Traffic Phenomena

arXiv.org Artificial Intelligence

The problem of modeling and predicting spatiotemporal traffic phenomena over an urban road network is important to many traffic applications such as detecting and forecasting congestion hotspots. This paper presents a decentralized data fusion and active sensing (D2FAS) algorithm for mobile sensors to actively explore the road network to gather and assimilate the most informative data for predicting the traffic phenomenon. We analyze the time and communication complexity of D2FAS and demonstrate that it can scale well with a large number of observations and sensors. We provide a theoretical guarantee that its predictive performance is equivalent to that of a sophisticated centralized sparse approximation for the Gaussian process (GP) model: the computation of such a sparse approximate GP model can thus be parallelized and distributed among the mobile sensors (in a Google-like MapReduce paradigm), thereby achieving efficient and scalable prediction. We also theoretically guarantee its active sensing performance, which improves under various practical environmental conditions. Empirical evaluation on real-world urban road network data shows that our D2FAS algorithm is significantly more time-efficient and scalable than state-of-the-art centralized algorithms while achieving comparable predictive performance.
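
As a loose illustration of the active sensing side only, the following Python sketch has a mobile sensor greedily pick the adjacent, still-unobserved road segment with the largest GP posterior variance; this is a generic variance/entropy-style heuristic rather than necessarily the exact D2FAS criterion, the decentralized fusion step is not shown, and the road network and kernel are made up.

    import numpy as np

    def rbf(A, B, ls=1.5):
        return np.exp(-0.5 * np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1) / ls ** 2)

    def posterior_var(X_obs, X_cand, noise=0.05):
        # GP posterior variance at candidate segments given observed segments.
        if len(X_obs) == 0:
            return np.ones(len(X_cand))
        K = rbf(X_obs, X_obs) + noise * np.eye(len(X_obs))
        Kc = rbf(X_obs, X_cand)
        return 1.0 - np.sum(Kc * np.linalg.solve(K, Kc), axis=0)

    # Segments on a line; a sensor chooses among the neighbours of its current position.
    segments = np.arange(10, dtype=float)[:, None]   # 1-D "locations" of road segments
    observed = [2, 7]                                # segments already sensed
    candidates = [4, 6]                              # segments reachable in the next move
    var = posterior_var(segments[observed], segments[candidates])
    print("next segment:", candidates[int(np.argmax(var))], "variances:", var.round(3))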


Decentralized Stochastic Planning with Anonymity in Interactions

AAAI Conferences

In this paper, we solve cooperative decentralized stochastic planning problems where the interactions between agents (specified using transition and reward functions) depend on the number of agents involved in the interaction, and not on the identities of the individual agents. Collisions of robots in a narrow corridor and defender teams coordinating patrol activities to secure a target are examples of such anonymous interactions. Formally, we consider problems that are a subset of the well-known Decentralized MDP (DEC-MDP) model, where the anonymity in interactions is specified within the joint reward and transition functions. In this paper, not only do we introduce a general model called D-SPAIT to capture anonymity in interactions, but we also provide optimization-based optimal and local-optimal solutions for generalizable sub-categories of D-SPAIT.
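
The following tiny Python sketch illustrates only the anonymity property described above: the joint reward depends on how many agents take each action, not on which agents do so, so permuting the agents leaves the reward unchanged. The congestion reward and numbers are made up; this is not the D-SPAIT optimization itself.

    from collections import Counter

    def congestion_reward(counts, base=1.0, penalty=0.4):
        # Each agent in a corridor gains `base` but pays a penalty that grows with
        # the number of other agents sharing that corridor; the reward therefore
        # depends only on the counts.
        return sum(n * (base - penalty * (n - 1)) for n in counts.values())

    joint_action_a = ['c1', 'c1', 'c1', 'c2']   # which corridor each of 4 agents uses
    joint_action_b = ['c1', 'c2', 'c1', 'c1']   # a permutation of the same joint action
    print(congestion_reward(Counter(joint_action_a)),
          congestion_reward(Counter(joint_action_b)))   # identical rewards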