Klenk, Matthew
Att-Adapter: A Robust and Precise Domain-Specific Multi-Attributes T2I Diffusion Adapter via Conditional Variational Autoencoder
Cho, Wonwoong, Chen, Yan-Ying, Klenk, Matthew, Inouye, David I., Zhang, Yanxia
Text-to-Image (T2I) Diffusion Models have achieved remarkable performance in generating high-quality images. However, enabling precise control of continuous attributes, especially multiple attributes simultaneously, in a new domain (e.g., numeric values like eye openness or car width) with text-only guidance remains a significant challenge. To address this, we introduce the Attribute (Att) Adapter, a novel plug-and-play module designed to enable fine-grained, multi-attribute control in pretrained diffusion models. Our approach learns a single control adapter from a set of sample images that can be unpaired and contain multiple visual attributes. The Att-Adapter leverages the decoupled cross-attention module to naturally harmonize the multiple domain attributes with text conditioning. We further introduce a Conditional Variational Autoencoder (CVAE) into the Att-Adapter to mitigate overfitting, matching the diverse nature of the visual world. Evaluations on two public datasets show that Att-Adapter outperforms all LoRA-based baselines in controlling continuous attributes. Additionally, our method enables a broader control range and improves disentanglement across multiple attributes, surpassing StyleGAN-based techniques. Notably, Att-Adapter is flexible, requiring no paired synthetic data for training, and scales easily to multiple attributes within a single model.
Learning to Operate in Open Worlds by Adapting Planning Models
Piotrowski, Wiktor, Stern, Roni, Sher, Yoni, Le, Jacob, Klenk, Matthew, de Kleer, Johan, Mohan, Shiwali
Planning agents are ill-equipped to act in novel situations in which their domain model no longer accurately represents the world. We introduce an approach for such agents operating in open worlds that detects the presence of novelties and effectively adapts their domain models and consequent action selection. It observes action execution and measures how far the observations diverge from what the environment model predicts in order to infer the existence of a novelty. Then, it revises the model through a heuristics-guided search over model changes. We report empirical evaluations on the CartPole problem, a standard Reinforcement Learning (RL) benchmark. The results show that our approach can deal with a class of novelties very quickly and in an interpretable fashion.
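The detect-then-repair loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy point-mass model, the parameter names `force` and `mass`, the max-absolute-error divergence, the threshold, and the greedy search over fixed parameter deltas are all assumptions chosen for brevity.

```python
from itertools import product


def simulate(params, steps=20, dt=0.1):
    """Hypothetical domain model: a point mass pushed by a constant force."""
    x, v = 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        a = params["force"] / params["mass"]
        v += a * dt
        x += v * dt
        trajectory.append(x)
    return trajectory


def divergence(expected, observed):
    """Largest absolute gap between predicted and observed positions."""
    return max(abs(e - o) for e, o in zip(expected, observed))


def detect_novelty(model, observed, threshold=0.05):
    """Flag a novelty when observations drift too far from the model's prediction."""
    return divergence(simulate(model), observed) > threshold


def repair_model(model, observed, deltas=(-0.5, -0.25, 0.25, 0.5),
                 threshold=0.05, max_iters=50):
    """Greedy search over single-parameter changes, guided by divergence."""
    current = dict(model)
    best = divergence(simulate(current), observed)
    for _ in range(max_iters):
        if best <= threshold:
            break
        candidates = []
        for name, delta in product(current, deltas):
            candidate = dict(current)
            candidate[name] = current[name] + delta
            if candidate[name] <= 0:  # keep parameters physically meaningful
                continue
            candidates.append((divergence(simulate(candidate), observed), candidate))
        score, candidate = min(candidates, key=lambda c: c[0])
        if score >= best:  # no single change helps; stop
            break
        best, current = score, candidate
    return current, best


# Usage: the world's dynamics changed, but the agent's model did not.
model = {"force": 1.0, "mass": 1.0}
observed = simulate({"force": 2.0, "mass": 1.0})
if detect_novelty(model, observed):
    model, residual = repair_model(model, observed)
```

Because each repair step changes one named parameter, the result is easy to inspect, which is one way to read the abstract's claim of interpretability. In this toy model only the ratio of force to mass is observable, so the search may explain the change through either parameter.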
Playing Angry Birds with a Domain-Independent PDDL+ Planner
Piotrowski, Wiktor, Stern, Roni, Klenk, Matthew, Perez, Alexandre, Mohan, Shiwali, de Kleer, Johan, Le, Jacob
This demo paper presents the first system for playing the popular Angry Birds game using a domain-independent planner. Our system models Angry Birds levels using PDDL+, a planning language for mixed discrete/continuous domains. It uses a domain-independent PDDL+ planner to generate plans and executes them. In this demo paper, we present the system's PDDL+ model for this domain, identify key design decisions that reduce the problem complexity, and compare the performance of our system to model-specific methods for this domain. The results show that our system's performance is on par with other domain-specific systems for Angry Birds, suggesting the applicability of domain-independent planning to this benchmark AI challenge.
An Extensible and Personalizable Multi-Modal Trip Planner
Liu, Xudong, Fritz, Christian, Klenk, Matthew
Despite a tremendous amount of work in the literature and in the commercial sectors, current approaches to multi-modal trip planning still fail to consistently generate plans that users deem optimal in practice. We believe that this is due to the fact that current planners fail to capture the true preferences of users, e.g., their preferences depend on aspects that are not modeled. An example of this could be a preference not to walk through an unsafe area at night. We present a novel multi-modal trip planner that allows users to upload auxiliary geographic data (e.g., crime rates) and to specify temporal constraints and preferences over these data in combination with typical metrics such as time and cost. Concretely, our planner supports the modes walking, biking, driving, public transit, and taxi, uses linear temporal logic to capture temporal constraints, and preferential cost functions to represent preferences. We show by examples that this allows the expression of very interesting preferences and constraints that, naturally, lead to quite diverse optimal plans.
Acceptable Planning: Influencing Individual Behavior to Reduce Transportation Energy Expenditure of a City
Mohan, Shiwali, Rakha, Hesham, Klenk, Matthew
Our research aims to develop intelligent systems that reduce the transportation-related energy expenditure of a large city by influencing individual behavior. We introduce Copter, an intelligent travel assistant that evaluates multi-modal travel alternatives to find a plan that is acceptable to a person given their context and preferences. We propose a formulation for acceptable planning that brings together ideas from AI, machine learning, and economics. This formulation has been incorporated into Copter, which produces acceptable plans in real time. We adopt a novel empirical evaluation framework that combines human decision data with a high-fidelity multi-modal transportation simulation to demonstrate a 4% energy reduction and 20% delay reduction in a realistic deployment scenario in Los Angeles, California, USA.
Towards a Cognitive System that Can Recognize Spatial Regions Based on Context
Hawes, Nick (University of Birmingham) | Klenk, Matthew (Palo Alto Research Center) | Lockwood, Kate (California State University, Monterey Bay) | Horn, Graham S. (University of Birmingham) | Kelleher, John D (Dublin Institute of Technology)
In order to collaborate with people in the real world, cognitive systems must be able to represent and reason about spatial regions in human environments. Consider the command "go to the front of the classroom". The spatial region mentioned (the front of the classroom) is not perceivable using geometry alone. Instead it is defined by its functional use, implied by nearby objects and their configuration. In this paper, we define such areas as context-dependent spatial regions and present a cognitive system able to learn them by combining qualitative spatial representations, semantic labels, and analogy. The system is capable of generating a collection of qualitative spatial representations describing the configuration of the entities it perceives in the world. It can then be taught context-dependent spatial regions using anchor points defined on these representations. From this we then demonstrate how an existing computational model of analogy can be used to detect context-dependent spatial regions in previously unseen rooms. To evaluate this process we compare detected regions to annotations made on maps of real rooms by human volunteers.
Representing and Reasoning About Spatial Regions Defined by Context
Klenk, Matthew (Palo Alto Research Center) | Hawes, Nick (University of Birmingham) | Lockwood, Kate (California State University, Monterey Bay)
In order to collaborate with people in the real world, cognitive systems must be able to represent and reason about spatial regions in human environments. Consider the command "go to the front of the classroom". The spatial region mentioned (the front of the classroom) is not perceivable using geometry alone. Instead it is defined by its functional use, implied by nearby objects and their configuration. In this paper, we define such areas as context-dependent spatial regions and propose a method for a cognitive system to learn them incrementally by combining qualitative spatial representations, semantic labels, and analogy. Using data from a mobile robot, we generate a relational representation of semantically labeled objects and their configuration. Next, we show how the boundary of a context-dependent spatial region can be defined using anchor points. Finally, we demonstrate how an existing computational model of analogy can be used to transfer this region to a new situation.
The Case for Case-Based Transfer Learning
Klenk, Matthew (Navy Center for Applied Research in Artificial Intelligence) | Aha, David W. (Navy Center for Applied Research in Artificial Intelligence) | Molineaux, Matt (Knexus Research Corporation)
Case-based reasoning (CBR) is a problem-solving process in which a new problem is solved by retrieving a similar situation and reusing its solution. Transfer learning occurs when, after gaining experience from learning how to solve source problems, the same learner exploits this experience to improve performance and/or learning on target problems. In transfer learning, the differences between the source and target problems characterize the transfer distance. CBR can support transfer learning methods in multiple ways. We illustrate how CBR and transfer learning interact and characterize three approaches for using CBR in transfer learning: (1) as a transfer learning method, (2) for problem learning, and (3) to transfer knowledge between sets of problems. We describe examples of these approaches from our own and related work and discuss applicable transfer distances for each. We close with conclusions and directions for future research applying CBR to transfer learning.
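The retrieve-and-reuse cycle that opens this abstract can be illustrated with a minimal sketch. The case structure (a `problem` feature tuple and a `solution` label), the Euclidean similarity measure, and the example cases are all hypothetical; real CBR systems typically also adapt the retrieved solution and retain new cases.

```python
import math


def retrieve(case_base, problem):
    """Return the stored case whose problem description is closest to the new one."""
    return min(case_base, key=lambda case: math.dist(case["problem"], problem))


def solve(case_base, problem):
    """Reuse the solution of the most similar stored case, without adaptation."""
    return retrieve(case_base, problem)["solution"]


# Usage: cases gathered on source problems are reused on a target problem.
source_cases = [
    {"problem": (0.0, 0.0), "solution": "plan-a"},
    {"problem": (10.0, 10.0), "solution": "plan-b"},
]
answer = solve(source_cases, (1.0, 2.0))
```

In transfer-learning terms, the case base plays the role of source experience, and the distance between a target problem and its nearest case gives a rough feel for how far the knowledge is being transferred.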
Planning in Dynamic Environments: Extending HTNs with Nonlinear Continuous Effects
Molineaux, Matthew (Knexus Research Corporation) | Klenk, Matthew (Naval Research Laboratory) | Aha, David (Naval Research Laboratory)
Planning in dynamic continuous environments requires reasoning about nonlinear continuous effects, which previous Hierarchical Task Network (HTN) planners do not support. In this paper, we extend an existing HTN planner with a new state projection algorithm. To our knowledge, this is the first HTN planner that can reason about nonlinear continuous effects. We use a wait action to instruct this planner to consider continuous effects in a given state. We also introduce a new planning domain to demonstrate the benefits of planning with nonlinear continuous effects. We compare our approach with a linear continuous effects planner and a discrete effects HTN planner on a benchmark domain, which reveals that its additional costs are largely mitigated by domain knowledge. Finally, we present an initial application of this algorithm in a practical domain, a Navy training simulation, illustrating the utility of this approach for planning in dynamic continuous environments.