
Collaborating Authors

University of Texas at Austin


Chatterjee



Sensor Synthesis for POMDPs with Reachability Objectives

AAAI Conferences

Partially observable Markov decision processes (POMDPs) are widely used in probabilistic planning problems in which an agent interacts with an environment using noisy and imprecise sensors. We study a setting in which the sensors are only partially defined and the goal is to synthesize “weakest” additional sensors, such that in the resulting POMDP, there is a small-memory policy for the agent that almost-surely (with probability 1) satisfies a reachability objective. We show that the problem is NP-complete, and present a symbolic algorithm by encoding the problem into SAT instances. We illustrate trade-offs between the amount of memory of the policy and the number of additional sensors on a simple example. We have implemented our approach and consider three classical POMDP examples from the literature, and show that in all the examples the number of sensors can be significantly decreased (as compared to the existing solutions in the literature) without increasing the complexity of the policies.


Natural Language Processing and Program Analysis for Supporting Todo Comments as Software Evolves

AAAI Conferences

Natural language elements (e.g., API comments, todo comments) form a substantial part of software repositories. While developers routinely use many natural language elements (e.g., todo comments) for communication, the semantic content of these elements is often neglected by software engineering techniques and tools. Additionally, as software evolves and development teams re-organize, these natural language elements are frequently forgotten, or just become outdated, imprecise and irrelevant. We envision several techniques, which combine natural language processing and program analysis, to help developers maintain their todo comments. Specifically, we propose techniques to synthesize code from comments, make comments executable, answer questions in comments, improve comment quality, and detect dangling comments.


DIPD: Gaze-Based Intention Inference in Dynamic Environments

AAAI Conferences

The ability of an autonomous system to understand something about a human's intent is important to the success of many systems that involve both humans and autonomous agents. In this work, we consider the specific setting of a human passenger riding in an autonomous vehicle, where the passenger intends to go to or learn about a specific point of interest along the vehicle's route. In this setting, we seek to provide the vehicle with the ability to infer this point of interest using real-time gaze information. This is a difficult problem in that the inference must be designed in the context of the moving vehicle, i.e., in a dynamic environment with dynamic interest points. We propose here a solution to this problem via a novel methodology called Dynamic Interest Point Detection (DIPD) for inferring the point of interest corresponding to the human's intent using gaze tracking data and a dynamic Markov Random Field (MRF) model. The energy function we develop allows the algorithm to successfully filter out noise from the eye tracker, such as eye blinks, high-speed tracking misalignment, and other sources of error. We demonstrate the success of this DIPD technique experimentally and show that it achieves up to a 28% increase in inference success compared to a nearest-neighbor approach.


Nie



Jiang



DyETC: Dynamic Electronic Toll Collection for Traffic Congestion Alleviation

AAAI Conferences

To alleviate traffic congestion in urban areas, electronic toll collection (ETC) systems are deployed all over the world. Despite their merits, tolls are usually pre-determined and fixed from day to day, so they fail to account for traffic dynamics and have limited regulation effect when traffic conditions are abnormal. In this paper, we propose a novel dynamic ETC (DyETC) scheme which adjusts tolls to traffic conditions in real time. The DyETC problem is formulated as a Markov decision process (MDP), the solution of which is very challenging due to its 1) multi-dimensional state space, 2) multi-dimensional, continuous and bounded action space, and 3) time-dependent state and action values. Due to the complexity of the formulated MDP, existing methods cannot be applied to our problem. Therefore, we develop a novel algorithm, PG-beta, which makes three improvements to the traditional policy gradient method by proposing 1) time-dependent value and policy functions, 2) a Beta distribution policy function and 3) state abstraction. Experimental results show that, compared with existing ETC schemes, DyETC increases traffic volume by around 8% and reduces travel time by around 14.6% during rush hour. Considering the total traffic volume in a traffic network, this contributes a substantial increase in social welfare.
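As a rough illustration of why a Beta distribution suits a bounded action space, the sketch below draws a toll from a Beta distribution and rescales it to an interval. It is not the PG-beta algorithm itself: the function names, the `max_toll` bound, and the fixed shape parameters are assumptions made for this example, whereas a learned policy would compute the shape parameters from the state.

```python
import random


def sample_toll(alpha, beta, max_toll, rng):
    """Draw a toll in [0, max_toll] from a rescaled Beta(alpha, beta).

    alpha and beta are hypothetical shape parameters; in a learned policy
    they would be outputs of a function of the traffic state.
    """
    # betavariate returns a value in [0, 1], so the rescaled toll can
    # never leave the allowed range, unlike a Gaussian policy.
    return max_toll * rng.betavariate(alpha, beta)


def mean_toll(alpha, beta, max_toll):
    """Expected toll under the rescaled Beta policy."""
    return max_toll * alpha / (alpha + beta)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    tolls = [sample_toll(2.0, 5.0, 6.0, rng) for _ in range(5)]
    print("sampled tolls:", [round(t, 2) for t in tolls])
    print("mean toll:", round(mean_toll(2.0, 5.0, 6.0), 3))
```

Because the support of the Beta distribution is bounded, no clipping of sampled actions is needed, which is the property the example is meant to show.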


Adversarial Goal Generation for Intrinsic Motivation

AAAI Conferences

In reinforcement learning, the goal, or reward signal, is generally given by the environment and cannot be controlled by the agent. We propose to introduce an intrinsic motivation module that will select a reward function for the agent to learn to achieve. We will use a Universal Value Function Approximator, which takes both the state and the parameters of this reward function (the goal) as input and predicts the value function (or action-value function), so that it generalizes across these goals. This module will be trained to generate goals such that the agent's learning is maximized. Thus, this is also a method for automatic curriculum learning.


Guiding Exploratory Behaviors for Multi-Modal Grounding of Linguistic Descriptions

AAAI Conferences

A major goal of grounded language learning research is to enable robots to connect language predicates to a robot's physical interactive perception of the world. Coupling object exploratory behaviors such as grasping, lifting, and looking with multiple sensory modalities (e.g., audio, haptics, and vision) enables a robot to ground non-visual words like "heavy" as well as visual words like "red". A major limitation of existing approaches to multi-modal language grounding is that a robot has to exhaustively explore training objects with a variety of actions whenever it learns a new language predicate. This paper proposes a method for guiding a robot's behavioral exploration policy when learning a novel predicate based on known grounded predicates and the novel predicate's linguistic relationship to them. We demonstrate our approach on two datasets in which a robot explored large sets of objects and was tasked with learning to recognize whether novel words applied to those objects.


Li

AAAI Conferences

We present a novel method for frequentist statistical inference in M-estimation problems, based on stochastic gradient descent (SGD) with a fixed step size: we demonstrate that the average of such SGD sequences can be used for statistical inference, after proper scaling. An intuitive analysis using the Ornstein-Uhlenbeck process suggests that such averages are asymptotically normal. To show the merits of our scheme, we apply it to both synthetic and real data sets, and demonstrate that its accuracy is comparable to classical statistical methods, while requiring potentially far less computation.
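The core ingredient, averaging the iterates of fixed-step SGD, can be sketched on the simplest M-estimation problem: estimating a mean under squared loss. This is a minimal sketch under stated assumptions, not the authors' inference procedure; the function names, the step size, and the burn-in length are illustrative choices, and the scaling step needed to turn averages into confidence intervals is not reproduced here.

```python
import random


def averaged_sgd(samples, step_size, theta0=0.0, burn_in=0):
    """Fixed-step SGD on the loss 0.5 * (theta - x)^2, returning the
    average of the iterates produced after the burn-in period."""
    theta = theta0
    total = 0.0
    count = 0
    for i, x in enumerate(samples):
        grad = theta - x            # stochastic gradient at the current sample
        theta -= step_size * grad   # the step size stays constant throughout
        if i >= burn_in:
            total += theta
            count += 1
    return total / count


def demo(seed=0, n=20000, true_mean=3.0):
    """Estimate the mean of synthetic Gaussian data (hypothetical setup)."""
    rng = random.Random(seed)
    samples = [rng.gauss(true_mean, 1.0) for _ in range(n)]
    return averaged_sgd(samples, step_size=0.1, burn_in=1000)


if __name__ == "__main__":
    print("averaged SGD estimate:", demo())
```

With a fixed step size the individual iterates keep fluctuating around the minimizer rather than converging, and it is the average of those iterates that settles near the true parameter.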