 preference and constraint



Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints

Neural Information Processing Systems

Inverse reinforcement learning (IRL) enables an agent to learn complex behavior by observing demonstrations from a (near-)optimal policy. The typical assumption is that the learner's goal is to match the teacher's demonstrated behavior. In this paper, we consider the setting where the learner has its own preferences that it additionally takes into consideration. These preferences can for example capture behavioral biases, mismatched worldviews, or physical constraints. We study two teaching approaches: learner-agnostic teaching, where the teacher provides demonstrations from an optimal policy ignoring the learner's preferences, and learner-aware teaching, where the teacher accounts for the learner's preferences. We design learner-aware teaching algorithms and show that significant performance improvements can be achieved over learner-agnostic teaching.



Reviews: Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints

Neural Information Processing Systems

This paper formalizes the problem of inverse reinforcement learning in which the learner's goal is not only to imitate the teacher's demonstrations, but also to satisfy her own preferences and constraints. It analyzes the suboptimality of learner-agnostic teaching, where the teacher gives demonstrations without considering the learner's preferences. It then proposes a learner-aware teaching algorithm, where the teacher selects demonstrations while accounting for the learner's preferences. It considers different types of learner models, with hard or soft preference constraints. It also develops learner-aware teaching methods for both the case where the teacher has full knowledge of the learner's constraints and the case where the teacher does not know them.


Reviews: Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints

Neural Information Processing Systems

The paper proposes a really interesting and novel variant of inverse RL with a nice formalization. The proposed algorithms are suitable. While the reviewers felt that the empirical results were weak (lack of scalability and linear reward function limitation), they thought that this was outweighed by the novelty of the problem and the significance of the contribution.


Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints

arXiv.org Artificial Intelligence

Inverse reinforcement learning (IRL) enables an agent to learn complex behavior by observing demonstrations from a (near-)optimal policy. The typical assumption is that the learner's goal is to match the teacher's demonstrated behavior. In this paper, we consider the setting where the learner has her own preferences that she additionally takes into consideration. These preferences can for example capture behavioral biases, mismatched worldviews, or physical constraints. We study two teaching approaches: learner-agnostic teaching, where the teacher provides demonstrations from an optimal policy ignoring the learner's preferences, and learner-aware teaching, where the teacher accounts for the learner's preferences. We design learner-aware teaching algorithms and show that significant performance improvements can be achieved over learner-agnostic teaching.


Attendee-Sourcing: Exploring The Design Space of Community-Informed Conference Scheduling

AAAI Conferences

Constructing a good conference schedule for a large multi-track conference requires taking into account the preferences and constraints of organizers, authors, and attendees. Creating a schedule that has fewer conflicts for authors and attendees, along with thematically coherent sessions, is a challenging task. Cobi introduced an alternative approach to conference scheduling by engaging the community to play an active role in the planning process. The current Cobi pipeline consists of committee-sourcing and author-sourcing to plan a conference schedule. We further explore the design space of community-sourcing by introducing attendee-sourcing -- a process that collects input from conference attendees and encodes it as preferences and constraints for creating sessions and the schedule. For CHI 2014, a large multi-track conference in human-computer interaction with more than 3,000 attendees and 1,000 authors, we collected attendees’ preferences by making all the accepted papers available on Confer, a paper recommendation tool we built, for a period of 45 days before announcing the conference program (sessions and schedule). We compare the preferences marked on Confer with the preferences collected from Cobi’s author-sourcing approach. We show that attendee-sourcing can provide insights beyond what can be discovered by author-sourcing. For CHI 2014, the results show value in the method and in attendees’ participation. Attendee-sourcing produces data that provides more alternatives in scheduling and complements data collected from other methods for creating coherent sessions and reducing conflicts.


Towards Grammars for Cradle-to-Cradle Design

AAAI Conferences

Cradle-to-cradle (C2C) design (McDonough & Braungart, 2002) recognizes that nothing short of full recycling of materials with no degradation in material quality is necessary for long-term planet sustainability. C2C advocates looking to the natural world as an ideal model of recycling, where organic materials are continually recycled through processes of decay and growth. They propose design methodology that separates biological cycles and synthetic material cycles, enabling biological material to be reclaimed ...

Figure 1a first illustrates by the oval that a critical problem in traditional design is that a product is designed in isolation. In contrast, the products shown in the square box of Figure 1b illustrate the concept of a product family, where multiple products are designed within a system of material use and reuse, which flows between product lines. While there may still be materials that come from outside the family and there are materials that are byproducts of the family production, a family design would seek to minimize these and to exploit them in a still larger context.