Search
Classification Performance Metric Elicitation and its Applications
Given a learning problem with real-world tradeoffs, which cost function should the model be trained to optimize? This is the metric selection problem in machine learning. Despite its practical interest, there is limited formal guidance on how to select metrics for machine learning applications. This thesis outlines metric elicitation as a principled framework for selecting the performance metric that best reflects implicit user preferences. Once specified, the evaluation metric can be used to compare and train models. In this manuscript, we formalize the problem of Metric Elicitation and devise novel strategies for eliciting classification performance metrics using pairwise preference feedback over classifiers. Specifically, we provide novel strategies for eliciting linear and linear-fractional metrics for binary and multiclass classification problems, which are then extended to a framework that elicits group-fair performance metrics in the presence of multiple sensitive groups. All the elicitation strategies that we discuss are robust to both finite sample and feedback noise, thus are useful in practice for real-world applications. Using the tools and the geometric characterizations of the feasible confusion statistics sets from the binary, multiclass, and multiclass-multigroup classification setups, we further provide strategies to elicit from a wider range of complex, modern multiclass metrics defined by quadratic functions of confusion statistics by exploiting their local linear structure. From application perspective, we also propose to use the metric elicitation framework in optimizing complex black box metrics that is amenable to deep network training. Lastly, to bring theory closer to practice, we conduct a preliminary real-user study that shows the efficacy of the metric elicitation framework in recovering the users' preferred performance metric in a binary classification setup.
A Route Network Planning Method for Urban Air Delivery
He, Xinyu, He, Fang, Li, Lishuai, Zhang, Lei, Xiao, Gang
High-tech giants and start-ups are investing in drone technologies to provide urban air delivery service, which is expected to solve the last-mile problem and mitigate road traffic congestion. However, air delivery service will not scale up without proper traffic management for drones in dense urban environment. Currently, a range of Concepts of Operations (ConOps) for unmanned aircraft system traffic management (UTM) are being proposed and evaluated by researchers, operators, and regulators. Among these, the tube-based (or corridor-based) ConOps has emerged in operations in some regions of the world for drone deliveries and is expected to continue serving certain scenarios that with dense and complex airspace and requires centralized control in the future. Towards the tube-based ConOps, we develop a route network planning method to design routes (tubes) in a complex urban environment in this paper. In this method, we propose a priority structure to decouple the network planning problem, which is NP-hard, into single-path planning problems. We also introduce a novel space cost function to enable the design of dense and aligned routes in a network. The proposed method is tested on various scenarios and compared with other state-of-the-art methods. Results show that our method can generate near-optimal route networks with significant computational time-savings.
Random Search Hyper-Parameter Tuning: Expected Improvement Estimation and the Corresponding Lower Bound
Navon, Dan, Bronstein, Alex M.
Hyperparameter tuning is a common technique for improving the performance of neural networks. Most techniques for hyperparameter search involve an iterated process where the model is retrained at every iteration. However, the expected accuracy improvement from every additional search iteration, is still unknown. Calculating the expected improvement can help create stopping rules for hyperparameter tuning and allow for a wiser allocation of a project's computational budget. In this paper, we establish an empirical estimate for the expected accuracy improvement from an additional iteration of hyperparameter search. Our results hold for any hyperparameter tuning method which is based on random search \cite{bergstra2012random} and samples hyperparameters from a fixed distribution. We bound our estimate with an error of $O\left(\sqrt{\frac{\log k}{k}}\right)$ w.h.p. where $k$ is the current number of iterations. To the best of our knowledge this is the first bound on the expected gain from an additional iteration of hyperparameter search. Finally, we demonstrate that the optimal estimate for the expected accuracy will still have an error of $\frac{1}{k}$.
Domain Knowledge in A*-Based Causal Discovery
Kleinegesse, Steven, Lawrence, Andrew R., Chockler, Hana
Causal discovery has become a vital tool for scientists and practitioners wanting to discover causal relationships from observational data. While most previous approaches to causal discovery have implicitly assumed that no expert domain knowledge is available, practitioners can often provide such domain knowledge from prior experience. Recent work has incorporated domain knowledge into constraint-based causal discovery. The majority of such constraint-based methods, however, assume causal faithfulness, which has been shown to be frequently violated in practice. Consequently, there has been renewed attention towards exact-search score-based causal discovery methods, which do not assume causal faithfulness, such as A*-based methods. However, there has been no consideration of these methods in the context of domain knowledge. In this work, we focus on efficiently integrating several types of domain knowledge into A*-based causal discovery. In doing so, we discuss and explain how domain knowledge can reduce the graph search space and then provide an analysis of the potential computational gains. We support these findings with experiments on synthetic and real data, showing that even small amounts of domain knowledge can dramatically speed up A*-based causal discovery and improve its performance and practicality.
Hybrid Learning with New Value Function for the Maximum Common Subgraph Problem
Liu, Yanli, Zhao, Jiming, Li, Chu-Min, Jiang, Hua, He, Kun
Maximum Common induced Subgraph (MCS) is an important NP-hard problem with wide real-world applications. Branch-and-Bound (BnB) is the basis of a class of efficient algorithms for MCS, consisting in successively selecting vertices to match and pruning when it is discovered that a solution better than the best solution found so far does not exist. The method of selecting the vertices to match is essential for the performance of BnB. In this paper, we propose a new value function and a hybrid selection strategy used in reinforcement learning to define a new vertex selection method, and propose a new BnB algorithm, called McSplitDAL, for MCS. Extensive experiments show that McSplitDAL significantly improves the current best BnB algorithms, McSplit+LL and McSplit+RL. An empirical analysis is also performed to illustrate why the new value function and the hybrid selection strategy are effective.
A Context-Aware Approach for Textual Adversarial Attack through Probability Difference Guided Beam Search
Liu, Huijun, Yu, Jie, Li, Shasha, Ma, Jun, Ji, Bin
Textual adversarial attacks expose the vulnerabilities of text classifiers and can be used to improve their robustness. Existing context-aware methods solely consider the gold label probability and use the greedy search when searching an attack path, often limiting the attack efficiency. To tackle these issues, we propose PDBS, a context-aware textual adversarial attack model using Probability Difference guided Beam Search. The probability difference is an overall consideration of all class label probabilities, and PDBS uses it to guide the selection of attack paths. In addition, PDBS uses the beam search to find a successful attack path, thus avoiding suffering from limited search space. Extensive experiments and human evaluation demonstrate that PDBS outperforms previous best models in a series of evaluation metrics, especially bringing up to a +19.5% attack success rate. Ablation studies and qualitative analyses further confirm the efficiency of PDBS.
Online 3D Bin Packing Reinforcement Learning Solution with Buffer
Puche, Aaron Valero, Lee, Sukhan
The 3D Bin Packing Problem (3D-BPP) is one of the most demanded yet challenging problems in industry, where an agent must pack variable size items delivered in sequence into a finite bin with the aim to maximize the space utilization. It represents a strongly NP-Hard optimization problem such that no solution has been offered to date with high performance in space utilization. In this paper, we present a new reinforcement learning (RL) framework for a 3D-BPP solution for improving performance. First, a buffer is introduced to allow multi-item action selection. By increasing the degree of freedom in action selection, a more complex policy that results in better packing performance can be derived. Second, we propose an agnostic data augmentation strategy that exploits both bin item symmetries for improving sample efficiency. Third, we implement a model-based RL method adapted from the popular algorithm AlphaGo, which has shown superhuman performance in zero-sum games. Our adaptation is capable of working in single-player and score based environments. In spite of the fact that AlphaGo versions are known to be computationally heavy, we manage to train the proposed framework with a single thread and GPU, while obtaining a solution that outperforms the state-of-the-art results in space utilization.
Non-Blocking Batch A* (Technical Report)
Veerapaneni, Rishi, Likhachev, Maxim
Heuristic search has traditionally relied on hand-crafted or programmatically derived heuristics. Neural networks (NNs) are newer powerful tools which can be used to learn complex mappings from states to cost-to-go heuristics. However, their slow single inference time is a large overhead that can substantially slow down planning time in optimized heuristic search implementations. Several recent works have described ways to take advantage of NN's batch computations to decrease overhead in planning, while retaining bounds on (sub)optimality. However, all these methods have used the NN heuristic in a "blocking" manner while building up their batches, and have ignored possible fast-to-compute admissible heuristics (e.g. existing classically derived heuristics) that are usually available to use. We introduce Non-Blocking Batch A* (NBBA*), a bounded suboptimal method which lazily computes the NN heuristic in batches while allowing expansions informed by a non-NN heuristic. We show how this subtle but important change can lead to substantial reductions in expansions compared to the current blocking alternative, and see that the performance is related to the information difference between the batch computed NN and fast non-NN heuristic.
A Multi-objective Memetic Algorithm for Auto Adversarial Attack Optimization Design
Sun, Jialiang, Yao, Wen, Jiang, Tingsong, Chen, Xiaoqian
The phenomenon of adversarial examples has been revealed in variant scenarios. Recent studies show that well-designed adversarial defense strategies can improve the robustness of deep learning models against adversarial examples. However, with the rapid development of defense technologies, it also tends to be more difficult to evaluate the robustness of the defensed model due to the weak performance of existing manually designed adversarial attacks. To address the challenge, given the defensed model, the efficient adversarial attack with less computational burden and lower robust accuracy is needed to be further exploited. Therefore, we propose a multi-objective memetic algorithm for auto adversarial attack optimization design, which realizes the automatical search for the near-optimal adversarial attack towards defensed models. Firstly, the more general mathematical model of auto adversarial attack optimization design is constructed, where the search space includes not only the attacker operations, magnitude, iteration number, and loss functions but also the connection ways of multiple adversarial attacks. In addition, we develop a multi-objective memetic algorithm combining NSGA-II and local search to solve the optimization problem. Finally, to decrease the evaluation cost during the search, we propose a representative data selection strategy based on the sorting of cross entropy loss values of each images output by models. Experiments on CIFAR10, CIFAR100, and ImageNet datasets show the effectiveness of our proposed method.
Simply Logical -- Intelligent Reasoning by Example (Fully Interactive Online Edition)
"Simply Logical -- Intelligent Reasoning by Example" by Peter Flach was first published by John Wiley in 1994. It could be purchased as book-only or with a 3.5 inch diskette containing the SWI-Prolog programmes printed in the book (for various operating systems). In 2007 the copyright reverted back to the author at which point the book and programmes were made freely available online; the print version is no longer distributed through John Wiley publishers. In 2015, as a pilot, we ported most of the original book into an online, interactive website using SWI-Prolog's SWISH platform. Since then, we launched the Simply Logical open source organisation committed to maintaining a suite of freely available interactive online educational resources about Artificial Intelligence and Logic Programming with Prolog. With the advent of new educational technologies we were inspired to rebuild the book from the ground up using the Jupyter Book platform enhanced with a collection of bespoke plugins that implement, among other things, interactive SWI-Prolog code blocks that can be executed directly in a web browser. This new version is more modular, easier to maintain, and can be split into custom teaching modules, in addition to being modern-looking, visually appealing, and compatible with a range of (mobile) devices of varying screen sizes.