Goto

Collaborating Authors

 Search


Programming by Rewards

arXiv.org Artificial Intelligence

We formalize and study ``programming by rewards'' (PBR), a new approach for specifying and synthesizing subroutines for optimizing some quantitative metric such as performance, resource utilization, or correctness over a benchmark. A PBR specification consists of (1) input features $x$, and (2) a reward function $r$, modeled as a black-box component (which we can only run), that assigns a reward for each execution. The goal of the synthesizer is to synthesize a "decision function" $f$ which transforms the features to a decision value for the black-box component so as to maximize the expected reward $E[r \circ f (x)]$ for executing decisions $f(x)$ for various values of $x$. We consider a space of decision functions in a DSL of loop-free if-then-else programs, which can branch on linear functions of the input features in a tree-structure and compute a linear function of the inputs in the leaves of the tree. We find that this DSL captures decision functions that are manually written in practice by programmers. Our technical contribution is the use of continuous-optimization techniques to perform synthesis of such decision functions as if-then-else programs. We also show that the framework is theoretically-founded ---in cases when the rewards satisfy nice properties, the synthesized code is optimal in a precise sense. We have leveraged PBR to synthesize non-trivial decision functions related to search and ranking heuristics in the PROSE codebase (an industrial strength program synthesis framework) and achieve competitive results to manually written procedures over multiple man years of tuning. We present empirical evaluation against other baseline techniques over real-world case studies (including PROSE) as well on simple synthetic benchmarks.


Learning Combinatorial Optimization on Graphs: A Survey with Applications to Networking

arXiv.org Artificial Intelligence

Existing approaches to solving combinatorial optimization problems on graphs suffer from the need to engineer each problem algorithmically, with practical problems recurring in many instances. The practical side of theoretical computer science, such as computational complexity, then needs to be addressed. Relevant developments in machine learning research on graphs are surveyed for this purpose. We organize and compare the structures involved with learning to solve combinatorial optimization problems, with a special eye on the telecommunications domain and its continuous development of live and research networks.


Relaxing the I.I.D. Assumption: Adaptive Minimax Optimal Sequential Prediction with Expert Advice

arXiv.org Machine Learning

We consider sequential prediction with expert advice when the data are generated stochastically, but the distributions generating the data may vary arbitrarily among some constraint set. We quantify relaxations of the classical I.I.D. assumption in terms of possible constraint sets, with I.I.D. at one extreme, and an adversarial mechanism at the other. The Hedge algorithm, long known to be minimax optimal in the adversarial regime, has recently been shown to also be minimax optimal in the I.I.D. setting. We show that Hedge is suboptimal between these extremes, and present a new algorithm that is adaptively minimax optimal with respect to our relaxations of the I.I.D. assumption, without knowledge of which setting prevails.


Bottom-up mechanism and improved contract net protocol for the dynamic task planning of heterogeneous Earth observation resources

arXiv.org Artificial Intelligence

Earth observation resources are becoming increasingly indispensable in disaster relief, damage assessment and related domains. Many unpredicted factors, such as the change of observation task requirements, to the occurring of bad weather and resource failures, may cause the scheduled observation scheme to become infeasible. Therefore, it is crucial to be able to promptly and maybe frequently develop high-quality replanned observation schemes that minimize the effects on the scheduled tasks. A bottom-up distributed coordinated framework together with an improved contract net are proposed to facilitate the dynamic task replanning for heterogeneous Earth observation resources. This hierarchical framework consists of three levels, namely, neighboring resource coordination, single planning center coordination, and multiple planning center coordination. Observation tasks affected by unpredicted factors are assigned and treated along with a bottom-up route from resources to planning centers. This bottom-up distributed coordinated framework transfers part of the computing load to various nodes of the observation systems to allocate tasks more efficiently and robustly. To support the prompt assignment of large-scale tasks to proper Earth observation resources in dynamic environments, we propose a multiround combinatorial allocation (MCA) method. Moreover, a new float interval-based local search algorithm is proposed to obtain the promising planning scheme more quickly. The experiments demonstrate that the MCA method can achieve a better task completion rate for large-scale tasks with satisfactory time efficiency. It also demonstrates that this method can help to efficiently obtain replanning schemes based on original scheme in dynamic environments.


Probability Learning based Tabu Search for the Budgeted Maximum Coverage Problem

arXiv.org Artificial Intelligence

Knapsack problems are classic models that can formulate a wide range of applications. In this work, we deal with the Budgeted Maximum Coverage Problem (BMCP), which is a generalized 0-1 knapsack problem. Given a set of items with nonnegative weights and a set of elements with nonnegative profits, where each item is composed of a subset of elements, BMCP aims to pack a subset of items in a capacity-constrained knapsack such that the total weight of the selected items does not exceed the knapsack capacity, and the total profit of the associated elements is maximized. Note that each element is counted once even if it is covered multiple times. BMCP is closely related to the Set-Union Knapsack Problem (SUKP) that is well studied in recent years. As the counterpart problem of SUKP, however, BMCP was introduced early in 1999 but since then it has been rarely studied, especially there is no practical algorithm proposed. By combining the reinforcement learning technique to the local search procedure, we propose a probability learning based tabu search (PLTS) algorithm for addressing this NP-hard problem. The proposed algorithm iterates through two distinct phases, namely a tabu search phase and a probability learning based perturbation phase. As there is no benchmark instances proposed in the literature, we generate 30 benchmark instances with varied properties. Experimental results demonstrate that our PLTS algorithm significantly outperforms the general CPLEX solver for solving the challenging BMCP in terms of the solution quality.


A Survey of Algorithms for Black-Box Safety Validation

arXiv.org Artificial Intelligence

Autonomous and semi-autonomous systems for safety-critical applications require rigorous testing before deployment. Due to the complexity of these systems, formal verification may be impossible and real-world testing may be dangerous during development. Therefore, simulation-based techniques have been developed that treat the system under test as a black box during testing. Safety validation tasks include finding disturbances to the system that cause it to fail (falsification), finding the most-likely failure, and estimating the probability that the system fails. Motivated by the prevalence of safety-critical artificial intelligence, this work provides a survey of state-of-the-art safety validation techniques with a focus on applied algorithms and their modifications for the safety validation problem. We present and discuss algorithms in the domains of optimization, path planning, reinforcement learning, and importance sampling. Problem decomposition techniques are presented to help scale algorithms to large state spaces, and a brief overview of safety-critical applications is given, including autonomous vehicles and aircraft collision avoidance systems. Finally, we present a survey of existing academic and commercially available safety validation tools.


Polestar: An Intelligent, Efficient and National-Wide Public Transportation Routing Engine

arXiv.org Artificial Intelligence

Public transportation plays a critical role in people's daily life. It has been proven that public transportation is more environmentally sustainable, efficient, and economical than any other forms of travel. However, due to the increasing expansion of transportation networks and more complex travel situations, people are having difficulties in efficiently finding the most preferred route from one place to another through public transportation systems. To this end, in this paper, we present Polestar, a data-driven engine for intelligent and efficient public transportation routing. Specifically, we first propose a novel Public Transportation Graph (PTG) to model public transportation system in terms of various travel costs, such as time or distance. Then, we introduce a general route search algorithm coupled with an efficient station binding method for efficient route candidate generation. After that, we propose a two-pass route candidate ranking module to capture user preferences under dynamic travel situations. Finally, experiments on two real-world data sets demonstrate the advantages of Polestar in terms of both efficiency and effectiveness. Indeed, in early 2019, Polestar has been deployed on Baidu Maps, one of the world's largest map services. To date, Polestar is servicing over 330 cities, answers over a hundred millions of queries each day, and achieves substantial improvement of user click ratio.


Half-checking propagators

arXiv.org Artificial Intelligence

Propagators are central to the success of constraint programming, that is contracting functions removing values proven not to be in any solution of a given constraint. The literature contains numerous propagation algorithms, for many different constraints, and common to all these propagation algorithms is the notion of correctness: only values that appear in no solution to the respective constraint may be removed. In this paper half-checking propagators are introduced, for which the only requirements are that identified solutions (by the propagators) are actual solutions (to the corresponding constraints), and that the propagators are contracting. In particular, a half-checking propagator may remove solutions resulting in an incomplete solving process, but with the upside that (good) solutions may be found faster. Overall completeness can be obtained by running half-checking propagators as one component in a portfolio solving process. Half-checking propagators opens up a wider variety of techniques to be used when designing propagation algorithms, compared to what is currently available. A formal model for half-checking propagators is introduced, together with a detailed description of how to support such propagators in a constraint programming system. Three general directions for creating half-checking propagation algorithms are introduced, and used for designing new half-checking propagators for the cost-circuit constraint as examples. The new propagators are implemented in the Gecode system.


Learning Generalized Relational Heuristic Networks for Model-Agnostic Planning

arXiv.org Artificial Intelligence

Computing goal-directed behavior (sequential decision-making, or planning) is essential to designing efficient AI systems. Due to the computational complexity of planning, current approaches rely primarily upon hand-coded symbolic domain models and hand-coded heuristic-function generators for efficiency. Learned heuristics for such problems have been of limited utility as they are difficult to apply to problems with objects and object quantities that are significantly different from those in the training data. This paper develops a new approach for learning generalized heuristics in the absence of symbolic domain models using deep neural networks that utilize an input predicate vocabulary but are agnostic to object names and quantities. It uses an abstract state representation to facilitate data efficient, generalizable learning. Empirical evaluation on a range of benchmark domains show that in contrast to prior approaches, generalized heuristics computed by this method can be transferred easily to problems with different objects and with object quantities much larger than those in the training data.


Solving the Clustered Traveling Salesman Problem via TSP methods

arXiv.org Artificial Intelligence

The Clustered Traveling Salesman Problem (CTSP) is a variant of the popular Traveling Salesman Problem (TSP) arising from a number of real-life applications. In this work, we explore an uncharted solution approach that solves the CTSP by transforming it to the well-studied TSP. For this purpose, we first investigate a technique to convert a CTSP instance to a TSP and then apply popular TSP solvers (including exact and heuristic solvers) to solve the resulting TSP instance. We want to answer the following questions: How do state-of-the-art TSP solvers perform on clustered instances converted from the CTSP? Do state-of-the-art TSP solvers compete well with the best performing methods specifically designed for the CTSP? For this purpose, we present intensive computational experiments on various CTSP benchmark instances to draw conclusions.