Goto

Collaborating Authors

 wcd




CAKD: A Correlation-Aware Knowledge Distillation Framework Based on Decoupling Kullback-Leibler Divergence

Zhang, Zao, Chen, Huaming, Ning, Pei, Yang, Nan, Yuan, Dong

arXiv.org Machine Learning

In knowledge distillation, a primary focus has been on transforming and balancing multiple distillation components. In this work, we emphasize the importance of thoroughly examining each distillation component, as we observe that not all elements are equally crucial. From this perspective,we decouple the Kullback-Leibler (KL) divergence into three unique elements: Binary Classification Divergence (BCD), Strong Correlation Divergence (SCD), and Weak Correlation Divergence (WCD). Each of these elements presents varying degrees of influence. Leveraging these insights, we present the Correlation-Aware Knowledge Distillation (CAKD) framework. CAKD is designed to prioritize the facets of the distillation components that have the most substantial influence on predictions, thereby optimizing knowledge transfer from teacher to student models. Our experiments demonstrate that adjusting the effect of each element enhances the effectiveness of knowledge transformation. Furthermore, evidence shows that our novel CAKD framework consistently outperforms the baseline across diverse models and datasets. Our work further highlights the importance and effectiveness of closely examining the impact of different parts of distillation process.


Data-Driven Goal Recognition Design for General Behavioral Agents

Kasumba, Robert, Yu, Guanghui, Ho, Chien-Ju, Keren, Sarah, Yeoh, William

arXiv.org Artificial Intelligence

Goal recognition design aims to make limited modifications to decision-making environments with the goal of making it easier to infer the goals of agents acting within those environments. Although various research efforts have been made in goal recognition design, existing approaches are computationally demanding and often assume that agents are (near-)optimal in their decision-making. To address these limitations, we introduce a data-driven approach to goal recognition design that can account for agents with general behavioral models. Following existing literature, we use worst-case distinctiveness($\textit{wcd}$) as a measure of the difficulty in inferring the goal of an agent in a decision-making environment. Our approach begins by training a machine learning model to predict the $\textit{wcd}$ for a given environment and the agent behavior model. We then propose a gradient-based optimization framework that accommodates various constraints to optimize decision-making environments for enhanced goal recognition. Through extensive simulations, we demonstrate that our approach outperforms existing methods in reducing $\textit{wcd}$ and enhancing runtime efficiency in conventional setup. Moreover, our approach also adapts to settings in which existing approaches do not apply, such as those involving flexible budget constraints, more complex environments, and suboptimal agent behavior. Finally, we have conducted human-subject experiments which confirm that our method can create environments that facilitate efficient goal recognition from real-world human decision-makers.


Generalising Planning Environment Redesign

Pozanco, Alberto, Pereira, Ramon Fraga, Borrajo, Daniel

arXiv.org Artificial Intelligence

In Environment Design, one interested party seeks to affect another agent's decisions by applying changes to the environment. Most research on planning environment (re)design assumes the interested party's objective is to facilitate the recognition of goals and plans, and search over the space of environment modifications to find the minimal set of changes that simplify those tasks and optimise a particular metric. This search space is usually intractable, so existing approaches devise metric-dependent pruning techniques for performing search more efficiently. This results in approaches that are not able to generalise across different objectives and/or metrics. In this paper, we argue that the interested party could have objectives and metrics that are not necessarily related to recognising agents' goals or plans. Thus, to generalise the task of Planning Environment Redesign, we develop a general environment redesign approach that is metric-agnostic and leverages recent research on top-quality planning to efficiently redesign planning environments according to any interested party's objective and metric. Experiments over a set of environment redesign benchmarks show that our general approach outperforms existing approaches when using well-known metrics, such as facilitating the recognition of goals, as well as its effectiveness when solving environment redesign tasks that optimise a novel set of different metrics.


Information Theoretical Importance Sampling Clustering

Zhang, Jiangshe, Ji, Lizhen, Wang, Meng

arXiv.org Artificial Intelligence

A current assumption of most clustering methods is that the training data and future data are taken from the same distribution. However, this assumption may not hold in most real-world scenarios. In this paper, we propose an information theoretical importance sampling based approach for clustering problems (ITISC) which minimizes the worst case of expected distortions under the constraint of distribution deviation. The distribution deviation constraint can be converted to the constraint over a set of weight distributions centered on the uniform distribution derived from importance sampling. The objective of the proposed approach is to minimize the loss under maximum degradation hence the resulting problem is a constrained minimax optimization problem which can be reformulated to an unconstrained problem using the Lagrange method. The optimization problem can be solved by both an alternative optimization algorithm or a general optimization routine by commercially available software. Experiment results on synthetic datasets and a real-world load forecasting problem validate the effectiveness of the proposed model. Furthermore, we show that fuzzy c-means is a special case of ITISC with the logarithmic distortion, and this observation provides an interesting physical interpretation for fuzzy exponent $m$.


How does Weight Correlation Affect the Generalisation Ability of Deep Neural Networks

Jin, Gaojie, Yi, Xinping, Zhang, Liang, Zhang, Lijun, Schewe, Sven, Huang, Xiaowei

arXiv.org Artificial Intelligence

This paper studies the novel concept of weight correlation in deep neural networks and discusses its impact on the networks' generalisation ability. For fully-connected layers, the weight correlation is defined as the average cosine similarity between weight vectors of neurons, and for convolutional layers, the weight correlation is defined as the cosine similarity between filter matrices. Theoretically, we show that, weight correlation can, and should, be incorporated into the PAC Bayesian framework for the generalisation of neural networks, and the resulting generalisation bound is monotonic with respect to the weight correlation. We formulate a new complexity measure, which lifts the PAC Bayes measure with weight correlation, and experimentally confirm that it is able to rank the generalisation errors of a set of networks more precisely than existing measures. More importantly, we develop a new regulariser for training, and provide extensive experiments that show that the generalisation error can be greatly reduced with our novel approach.


Goal Recognition Design in Deterministic Environments

Keren, Sarah, Gal, Avigdor, Karpas, Erez

Journal of Artificial Intelligence Research

Goal recognition design (GRD) facilitates understanding the goals of acting agents through the analysis and redesign of goal recognition models, thus offering a solution for assessing and minimizing the maximal progress of any agent in the model before goal recognition is guaranteed. In a nutshell, given a model of a domain and a set of possible goals, a solution to a GRD problem determines (1) the extent to which actions performed by an agent within the model reveal the agent’s objective; and (2) how best to modify the model so that the objective of an agent can be detected as early as possible. This approach is relevant to any domain in which rapid goal recognition is essential and the model design can be controlled. Applications include intrusion detection, assisted cognition, computer games, and human-robot collaboration. A GRD problem has two components: the analyzed goal recognition setting, and a design model specifying the possible ways the environment in which agents act can be modified so as to facilitate recognition. This work formulates a general framework for GRD in deterministic and partially observable environments, and offers a toolbox of solutions for evaluating and optimizing model quality for various settings. For the purpose of evaluation we suggest the worst case distinctiveness (WCD) measure, which represents the maximal cost of a path an agent may follow before its goal can be inferred by a goal recognition system. We offer novel compilations to classical planning for calculating WCD in settings where agents are bounded-suboptimal. We then suggest methods for minimizing WCD by searching for an optimal redesign strategy within the space of possible modifications, and using pruning to increase efficiency. We support our approach with an empirical evaluation that measures WCD in a variety of GRD settings and tests the efficiency of our compilation-based methods for computing it. We also examine the effectiveness of reducing WCD via redesign and the performance gain brought about by our proposed pruning strategy.


Plan Recognition Design

Mirsky, Reuth (Ben-Gurion University of the Negev) | Stern, Roni (Ben-Gurion University of the Negev) | Gal, Ya' (Ben-Gurion University of the Negev) | akov (Ben-Gurion University of the Negev) | Kalech, Meir

AAAI Conferences

Goal Recognition Design (GRD) is the problem of designing a domain in a way that will allow easy identification of agents' goals. This work extends the original GRD problem to the Plan Recognition Design (PRD) problem which is the task of designing a domain using plan libraries in order to facilitate fast identification of an agent's plan. While GRD can help to explain faster which goal the agent is trying to achieve, PRD can help in faster understanding of how the agent is going to achieve its goal. we define a new measure that quantifies the worst-case distinctiveness of a given planning domain, propose a method to reduce it in a given domain and show the reduction of this new measure in three domains from the literature.


Solving Goal Recognition Design Using ASP

Son, Tran Cao (New Mexico State University) | Sabuncu, Orkunt (University of Potsdam) | Schulz-Hanke, Christian (University of Potsdam) | Schaub, Torsten (University of Potsdam) | Yeoh, William (New Mexico State University)

AAAI Conferences

Goal Recognition Design involves identifying the best ways to modify an underlying environment that agents operate in, typically by making asubset of feasible actions infeasible, so that agents are forced to reveal their goals as early as possible. Thus far, existing work has focused exclusively on imperative classical planning. In this paper, we address the same problem with a different paradigm, namely, declarative approaches based on Answer Set Programming (ASP). Our experimental results show that one of our ASP encodings is more scalable and is significantly faster by up to three orders of magnitude than thecurrent state of the art.