Goto

Collaborating Authors

 Search


Online Learning of Network Bottlenecks via Minimax Paths

arXiv.org Machine Learning

The path-specific Another commonly used method for these problems is bottleneck on a path between a source and a target node in a Upper Confidence Bound (UCB) (Auer 2002), which utilizes network is defined as the edge with a maximal cost or weight optimism to balance exploration and exploitation. UCB according to some criterion such as transfer time, load, commute has been adapted to combinatorial settings (Chen, Wang, and time, distance, etc. Then, the goal of bottleneck identification Yuan 2013), and also exists in Bayesian variants (Kaufmann, and avoidance is to find a path whose bottleneck is Cappé, and Garivier 2012). Recently, a variant of UCB has minimal. Thus, one may model bottleneck identification as been studied for bottleneck avoidance problems in a combinatorial the problem of computing the minimax edge over the given pure exploration setting (Du, Kuroki, and Chen network/graph, to obtain an edge with a minimal largest gap 2021). They consider a different problem setting and method between the source and target nodes. Equivalently, it can be than ours, though their bottleneck reward function is similar formulated as a widest path problem or maximum capacity to the one we use in our approximation method.


Information-theoretic Classification Accuracy: A Criterion that Guides Data-driven Combination of Ambiguous Outcome Labels in Multi-class Classification

arXiv.org Machine Learning

Outcome labeling ambiguity and subjectivity are ubiquitous in real-world datasets. While practitioners commonly combine ambiguous outcome labels in an ad hoc way to improve the accuracy of multi-class classification, there lacks a principled approach to guide label combination by any optimality criterion. To address this problem, we propose the information-theoretic classification accuracy (ITCA), a criterion of outcome "information" conditional on outcome prediction, to guide practitioners on how to combine ambiguous outcome labels. ITCA indicates a balance in the trade-off between prediction accuracy (how well do predicted labels agree with actual labels) and prediction resolution (how many labels are predictable). To find the optimal label combination indicated by ITCA, we develop two search strategies: greedy search and breadth-first search. Notably, ITCA and the two search strategies are adaptive to all machine-learning classification algorithms. Coupled with a classification algorithm and a search strategy, ITCA has two uses: to improve prediction accuracy and to identify ambiguous labels. We first verify that ITCA achieves high accuracy with both search strategies in finding the correct label combinations on synthetic and real data. Then we demonstrate the effectiveness of ITCA in diverse applications including medical prognosis, cancer survival prediction, user demographics prediction, and cell type classification.


How to Split a Logic Program

arXiv.org Artificial Intelligence

Answer Set Programming (ASP) is a successful method for solving a range of real-world applications. Despite the availability of fast ASP solvers, computing answer sets demands a very large computational power, since the problem tackled is in the second level of the polynomial hierarchy. A speed-up in answer set computation may be attained, if the program can be split into two disjoint parts, bottom and top. Thus, the bottom part is evaluated independently of the top part, and the results of the bottom part evaluation are used to simplify the top part. Lifschitz and Turner have introduced the concept of a splitting set, i.e., a set of atoms that defines the splitting. In this paper, We show that the problem of computing a splitting set with some desirable properties can be reduced to a classic Search Problem and solved in polynomial time. This allows us to conduct experiments on the size of the splitting set in various programs and lead to an interesting discovery of a source of complication in stable model computation. We also show that for Head-Cycle-Free programs, the definition of splitting sets can be adjusted to allow splitting of a broader class of programs.


CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research

arXiv.org Artificial Intelligence

Interest in applying Artificial Intelligence (AI) techniques to compiler optimizations is increasing rapidly, but compiler research has a high entry barrier. Unlike in other domains, compiler and AI researchers do not have access to the datasets and frameworks that enable fast iteration and development of ideas, and getting started requires a significant engineering investment. What is needed is an easy, reusable experimental infrastructure for real world compiler optimization tasks that can serve as a common benchmark for comparing techniques, and as a platform to accelerate progress in the field. We introduce CompilerGym, a set of environments for real world compiler optimization tasks, and a toolkit for exposing new optimization tasks to compiler researchers. CompilerGym enables anyone to experiment on production compiler optimization problems through an easy-to-use package, regardless of their experience with compilers. We build upon the popular OpenAI Gym interface enabling researchers to interact with compilers using Python and a familiar API. We describe the CompilerGym architecture and implementation, characterize the optimization spaces and computational efficiencies of three included compiler environments, and provide extensive empirical evaluations. Compared to prior works, CompilerGym offers larger datasets and optimization spaces, is 27x more computationally efficient, is fault-tolerant, and capable of detecting reproducibility bugs in the underlying compilers. In making it easy for anyone to experiment with compilers - irrespective of their background - we aim to accelerate progress in the AI and compiler research domains.


Efficient and Effective Generation of Test Cases for Pedestrian Detection -- Search-based Software Testing of Baidu Apollo in SVL

arXiv.org Artificial Intelligence

With the growing capabilities of autonomous vehicles, there is a higher demand for sophisticated and pragmatic quality assurance approaches for machine learning-enabled systems in the automotive AI context. The use of simulation-based prototyping platforms provides the possibility for early-stage testing, enabling inexpensive testing and the ability to capture critical corner-case test scenarios. Simulation-based testing properly complements conventional on-road testing. However, due to the large space of test input parameters in these systems, the efficient generation of effective test scenarios leading to the unveiling of failures is a challenge. This paper presents a study on testing pedestrian detection and emergency braking system of the Baidu Apollo autonomous driving platform within the SVL simulator. We propose an evolutionary automated test generation technique that generates failure-revealing scenarios for Apollo in the SVL environment. Our approach models the input space using a generic and flexible data structure and benefits a multi-criteria safety-based heuristic for the objective function targeted for optimization. This paper presents the results of our proposed test generation technique in the 2021 IEEE Autonomous Driving AI Test Challenge. In order to demonstrate the efficiency and effectiveness of our approach, we also report the results from a baseline random generation technique. Our evaluation shows that the proposed evolutionary test case generator is more effective at generating failure-revealing test cases and provides higher diversity between the generated failures than the random baseline.


A Semi-exact Algorithm for Quickly Computing A Maximum Weight Clique in Large Sparse Graphs

Journal of Artificial Intelligence Research

This paper explores techniques to quickly solve the maximum weight clique problem (MWCP) in very large scale sparse graphs. Due to their size, and the hardness of MWCP, it is infeasible to solve many of these graphs with exact algorithms. Although recent heuristic algorithms make progress in solving MWCP in large graphs, they still need considerable time to get a high-quality solution. In this work, we focus on solving MWCP for large sparse graphs within a short time limit. We propose a new method for MWCP which interleaves clique finding with data reduction rules. We propose novel ideas to make this process efficient, and develop an algorithm called FastWClq. Experiments on a broad range of large sparse graphs show that FastWClq finds better solutions than state-of-the-art algorithms while the running time of FastWClq is much shorter than the competitors for most instances. Further, FastWClq proves the optimality of its solutions for roughly half of the graphs, all with at least 105 vertices, with an average time of 21 seconds.


Barzilai and Borwein conjugate gradient method equipped with a non-monotone line search technique and its application on non-negative matrix factorization

arXiv.org Artificial Intelligence

In this paper, we propose a new non-monotone conjugate gradient method for solving unconstrained nonlinear optimization problems. We first modify the non-monotone line search method by introducing a new trigonometric function to calculate the non-monotone parameter, which plays an essential role in the algorithm's efficiency. Then, we apply a convex combination of the Barzilai-Borwein method for calculating the value of step size in each iteration. Under some suitable assumptions, we prove that the new algorithm has the global convergence property. The efficiency and effectiveness of the proposed method are determined in practice by applying the algorithm to some standard test problems and non-negative matrix factorization problems.


Completeness of Unbounded Best-First Game Algorithms

arXiv.org Artificial Intelligence

In this article, we prove the completeness of the following game search algorithms: unbounded best-first minimax with completion and descent with completion, i.e. we show that, with enough time, they find the best game strategy. We then generalize these two algorithms in the context of perfect information multiplayer games. We show that these generalizations are also complete: they find one of the equilibrium points.


Towards Improving Adversarial Training of NLP Models

arXiv.org Artificial Intelligence

Adversarial training, a method for learning robust deep neural networks, constructs adversarial examples during training. However, recent methods for generating NLP adversarial examples involve combinatorial search and expensive sentence encoders for constraining the generated instances. As a result, it remains challenging to use vanilla adversarial training to improve NLP models' performance, and the benefits are mainly uninvestigated. This paper proposes a simple and improved vanilla adversarial training process for NLP models, which we name Attacking to Training (A2T). The core part of A2T is a new and cheaper word substitution attack optimized for vanilla adversarial training. We use A2T to train BERT and RoBERTa models on IMDB, Rotten Tomatoes, Yelp, and SNLI datasets. Our results empirically show that it is possible to train robust NLP models using a much cheaper adversary. We demonstrate that vanilla adversarial training with A2T can improve an NLP model's robustness to the attack it was originally trained with and also defend the model against other types of word substitution attacks. Furthermore, we show that A2T can improve NLP models' standard accuracy, cross-domain generalization, and interpretability. Code is available at https://github.com/QData/Textattack-A2T .


Estimation of Local Average Treatment Effect by Data Combination

arXiv.org Machine Learning

It is important to estimate the local average treatment effect (LATE) when compliance with a treatment assignment is incomplete. The previously proposed methods for LATE estimation required all relevant variables to be jointly observed in a single dataset; however, it is sometimes difficult or even impossible to collect such data in many real-world problems for technical or privacy reasons. We consider a novel problem setting in which LATE, as a function of covariates, is nonparametrically identified from the combination of separately observed datasets. For estimation, we show that the direct least squares method, which was originally developed for estimating the average treatment effect under complete compliance, is applicable to our setting. However, model selection and hyperparameter tuning for the direct least squares estimator can be unstable in practice since it is defined as a solution to the minimax problem. We then propose a weighted least squares estimator that enables simpler model selection by avoiding the minimax objective formulation. Unlike the inverse probability weighted (IPW) estimator, the proposed estimator directly uses the pre-estimated weight without inversion, avoiding the problems caused by the IPW methods. We demonstrate the effectiveness of our method through experiments using synthetic and real-world datasets.