Kraus, Sarit


Leveraging human knowledge in tabular reinforcement learning: A study of human subjects

arXiv.org Artificial Intelligence

Reinforcement Learning (RL) can be extremely effective in solving complex, real-world problems. However, injecting human knowledge into an RL agent may require extensive effort and expertise on the human designer's part. To date, human factors are generally not considered in the development and evaluation of possible RL approaches. In this article, we set out to investigate how different methods for injecting human knowledge are applied, in practice, by human designers of varying levels of knowledge and skill. We perform the first empirical evaluation of several methods, including a newly proposed method named SASS which is based on the notion of similarities in the agent's state-action space. Through this human study, consisting of 51 human participants, we shed new light on the human factors that play a key role in RL. We find that the classical reward shaping technique seems to be the most natural method for most designers, both expert and non-expert, to speed up RL. However, we further find that our proposed method SASS can be effectively and efficiently combined with reward shaping, and provides a beneficial alternative to using only a single speedup method with minimal human designer effort overhead.
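
As a rough illustration of the classical reward shaping technique mentioned in this abstract, the following minimal sketch applies potential-based shaping to tabular Q-learning. It is not the study's experimental setup: the chain task, the `potential` heuristic, and all hyperparameters are hypothetical choices made for this example, and SASS is not shown because its details are not given here.

```python
import random

GAMMA = 0.9
N_STATES = 6          # hypothetical chain task: states 0..5, goal at state 5
ACTIONS = (-1, +1)    # move left / move right


def potential(state):
    # Hypothetical designer heuristic: negative distance to the goal,
    # so the goal (terminal) state has potential zero.
    return float(state - (N_STATES - 1))


def shaped_reward(reward, state, next_state, gamma=GAMMA):
    # Potential-based shaping: add F = gamma * phi(s') - phi(s) to the reward.
    return reward + gamma * potential(next_state) - potential(state)


def step(state, action):
    # Deterministic chain dynamics; reward 1 only on reaching the goal.
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=200, alpha=0.5, epsilon=0.1, use_shaping=True, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)                        # explore
            else:
                a = 0 if q[state][0] > q[state][1] else 1   # greedy, ties go right
            next_state, reward, done = step(state, ACTIONS[a])
            if use_shaping:
                reward = shaped_reward(reward, state, next_state)
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    # Greedy action for every non-terminal state.
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]
```

Calling `greedy_policy(q_learning())` returns the learned action for each non-terminal state; passing `use_shaping=False` trains on the raw sparse reward for comparison.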


Augmenting Decisions of Taxi Drivers through Reinforcement Learning for Improving Revenues

AAAI Conferences

Taxis (which include cars working with car aggregation systems such as Uber, Grab, and Lyft) have become a critical component of urban transportation. While most research and applications in the context of taxis have focused on improving performance from a customer perspective, in this paper, we focus on improving performance from a taxi driver perspective. Higher revenues for taxi drivers can help bring more drivers into the system, thereby improving availability for customers in dense urban cities. Typically, when there is no customer on board, taxi drivers will cruise around to find customers either directly (on the street) or indirectly (due to a request from a nearby customer by phone or on aggregation systems). For such cruising taxis, we develop a Reinforcement Learning (RL) based system that learns from real trajectory logs of drivers to advise them on the right locations to find customers so as to maximize their revenue. There are multiple translational challenges involved in building this RL system based on real data, such as annotating the activities (e.g., roaming, going to a taxi stand, etc.) observed in trajectory logs, identifying the right features for the state and action spaces, and evaluating against real driver performance observed in the dataset. We also provide a dynamic abstraction mechanism to improve the basic learning mechanism. Finally, we provide a thorough evaluation on a real-world data set from a developed Asian city and demonstrate that an RL-based system can provide significant benefits to the drivers.


Psychologically Based Virtual-Suspect for Interrogative Interview Training

AAAI Conferences

In this paper, we present a Virtual-Suspect system which can be used to train inexperienced law enforcement personnel in interrogation strategies. The system supports different scenario configurations based on historical data. The responses presented by the Virtual-Suspect are selected based on the psychological state of the suspect, which can be configured as well. Furthermore, each interrogator's statement affects the Virtual-Suspect's current psychological state, which may lead the interrogation in different directions. In addition, the model takes into account the context in which the statements are made. Experiments with 24 subjects demonstrate that the Virtual-Suspect's behavior is similar to that of a human who plays the role of the suspect.


Personalized Alert Agent for Optimal User Performance

AAAI Conferences

Preventive maintenance is essential for the smooth operation of any equipment. Still, people occasionally do not maintain their equipment adequately. Maintenance alert systems attempt to remind people to perform maintenance. However, most of these systems do not provide alerts at the optimal time, nor do they take into account the time required for maintenance or compute the optimal timing for a specific user. We model the problem of maintenance performance, assuming maintenance is time consuming. We solve for the optimal policy for the user, i.e., the optimal timing for a user to perform maintenance. This optimal strategy depends on the value of the user's time, and thus it may vary from user to user and may change over time. Based on the solved optimal strategy, we present a personalized maintenance agent which, depending on the value of the user's time, alerts the user when she should perform maintenance. In an experiment using a spaceship computer game, we show that receiving alerts from the personalized alert agent significantly improves user performance.


Advice Provision for Energy Saving in Automobile Climate-Control System

AI Magazine

Reducing the energy consumption of climate control systems is important for reducing the human environmental footprint. The need to save energy becomes even greater when considering an electric car, since heavy use of the climate control system may exhaust the battery. In this article we consider a method for an automated agent to provide advice to drivers which will motivate them to reduce the energy consumption of their climate control unit. Our approach takes into account both the energy consumption of the climate control system and the expected comfort level of the driver. We therefore build two models: one assesses the energy consumption of the climate control system as a function of the system’s settings, and the other models human comfort level as a function of those settings. Using these models, the agent provides advice to the driver concerning how to set the climate control system. The agent advises settings which try to preserve a high level of comfort while consuming as little energy as possible. We empirically show that drivers equipped with our advice-giving agent save significantly more energy than drivers not equipped with it.


Intelligent Agents for Rehabilitation and Care of Disabled and Chronic Patients

AAAI Conferences

The number of people with disabilities is continuously increasing. Providing patients who have disabilities with the rehabilitation and care necessary to allow them a good quality of life creates overwhelming demands for health and rehabilitation services. We suggest that advancements in intelligent agent technology provide new opportunities for improving the provided services. We will discuss the challenges of building an agent for the health care domain and present four capabilities that are required for such an agent: planning, monitoring, intervention and encouragement. We will discuss the importance of personalizing all of them and the need to facilitate cooperation between the automated agent and the human caregivers. We will review recent technology that can be used toward the development of agents that can have these capabilities and their promise in automating services such as physiotherapy, speech therapy and cognitive training.


Providing Arguments in Discussions Based on the Prediction of Human Argumentative Behavior

AAAI Conferences

Argumentative discussion is a highly demanding task. In order to help people in such situations, this paper provides an innovative methodology for developing an agent that can support people in argumentative discussions by proposing possible arguments to them. By analyzing more than 130 human discussions and 140 questionnaires answered by people, we show that the well-established Argumentation Theory is not a good predictor of people's choice of arguments. Then, we present a model that has 76% accuracy when predicting people’s top three argument choices given a partial deliberation. We present the Predictive and Relevance based Heuristic agent (PRH), which uses this model with a heuristic that estimates the relevance of possible arguments to the last argument given in order to propose possible arguments. Through extensive human studies with over 200 human subjects, we show that people’s satisfaction with the PRH agent is significantly higher than with other agents that propose arguments based on Argumentation Theory, predict arguments without the heuristic, or use only the heuristic. People also use the PRH agent's proposed arguments significantly more often than those proposed by the other agents.


A Hybrid Approach of Classifier and Clustering for Solving the Missing Node Problem

AAAI Conferences

An important area of social network research is identifying missing information which is not explicitly represented in the network or is not visible to all. In this paper, we propose a novel Hybrid Approach of Classifier and Clustering, which we refer to as HACC, to solve the missing node identification problem in social networks. HACC utilizes a classifier as a preprocessing step in order to integrate all known information into one similarity measure and then uses a clustering algorithm to identify missing nodes. Specifically, we used the information on the network structure, attributes about known users (nodes) and pictorial information to evaluate HACC and found that it performs significantly better than other missing node algorithms. We also argue that HACC is a general, domain-independent approach that can be easily applied to other domains. We support this claim by evaluating HACC on a second domain, authorship identification, as well.


Adaptive Advice in Automobile Climate Control Systems

AAAI Conferences

Reducing an automobile's energy consumption will lower its dependency on fossil fuel and extend the travel range of electric vehicles. Automobile Climate Control Systems (CCS) are known to be heavy energy consumers. To help reduce CCS energy consumption, this paper presents an adaptive automated agent, MACS (MDP Agent for Climate control Systems), which provides drivers with advice as to how to set their CCS. First, we present a model which has 78% accuracy in predicting drivers' reactions to different advice in different situations. Using the prediction model, we designed a Markov Decision Process whose solution provided the advising policy for MACS. Through empirical evaluation using an electric car, with 83 human subjects, we show that MACS successfully reduced the energy consumption of the subjects by 33% compared to subjects who were not equipped with MACS. MACS also outperformed the state-of-the-art Social agent for Advice Provision (SAP).
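
To make the step from a Markov Decision Process to an advising policy concrete, the sketch below solves a tiny MDP with value iteration and reads off the greedy policy. It is only an illustration under stated assumptions: the two driver states, the two advice actions, and every probability and reward are invented for this example and are not taken from MACS or its prediction model.

```python
# Hypothetical toy advising MDP; every name and number below is an assumption.
STATES = ("receptive", "resistant")
ACTIONS = ("strong_advice", "mild_advice")

# TRANSITIONS[(state, action)] -> list of (probability, next_state)
TRANSITIONS = {
    ("receptive", "strong_advice"): [(0.5, "receptive"), (0.5, "resistant")],
    ("receptive", "mild_advice"): [(0.9, "receptive"), (0.1, "resistant")],
    ("resistant", "strong_advice"): [(0.1, "receptive"), (0.9, "resistant")],
    ("resistant", "mild_advice"): [(0.6, "receptive"), (0.4, "resistant")],
}

# REWARDS[(state, action)] -> expected energy saved (arbitrary units)
REWARDS = {
    ("receptive", "strong_advice"): 3.0,
    ("receptive", "mild_advice"): 1.0,
    ("resistant", "strong_advice"): 0.0,
    ("resistant", "mild_advice"): 0.5,
}


def q_value(state, action, values, gamma):
    # Immediate reward plus discounted expected value of the next state.
    expected_next = sum(p * values[nxt] for p, nxt in TRANSITIONS[(state, action)])
    return REWARDS[(state, action)] + gamma * expected_next


def value_iteration(gamma=0.95, tol=1e-10):
    values = {s: 0.0 for s in STATES}
    while True:
        # Bellman optimality backup for every state.
        new_values = {
            s: max(q_value(s, a, values, gamma) for a in ACTIONS) for s in STATES
        }
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < tol:
            break
    # The advising policy picks the action with the highest Q-value per state.
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(s, a, values, gamma)) for s in STATES
    }
    return values, policy
```

Calling `value_iteration()` returns the state values and a dictionary mapping each driver state to the advice the agent would give in it. In a real system the transition probabilities would come from a learned model of driver reactions rather than from hand-picked constants.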