Integrating Deep Reinforcement Learning with Model-based Path Planners for Automated Driving
Yurtsever, Ekim, Capito, Linda, Redmill, Keith, Ozguner, Umit
Automated driving in urban settings is challenging chiefly due to the nondeterministic behavior of human traffic participants. These behaviors are difficult to model, and conventional, rule-based Automated Driving Systems (ADSs) tend to fail when they face unmodeled dynamics. On the other hand, the more recent, end-to-end Deep Reinforcement Learning (DRL) based ADSs have shown promising results. However, pure learning-based approaches lack the hard-coded safety measures of model-based methods. Here we propose a hybrid approach that integrates a model-based path planner into a vision-based DRL framework to alleviate the shortcomings of both approaches. In summary, the DRL agent learns to overrule the model-based planner's decisions if it predicts that better future rewards can be obtained by doing so, e.g., by avoiding an accident. Otherwise, the DRL agent tends to follow the model-based planner as closely as possible. This logic is learned, i.e., no switching model is designed here. The agent learns this by considering two penalties: the penalty of straying away from the model-based path planner and the penalty of having a collision. The latter takes precedence over the former, i.e., the collision penalty is greater. Therefore, after training, the agent learns to follow the model-based planner when it is safe to do so; otherwise, it gets penalized. However, it also learns to sacrifice positive rewards for following the model-based planner in order to avoid a potentially large penalty for a future collision. Experimental results show that the proposed method can plan its path and navigate while avoiding obstacles between randomly chosen origin-destination points in CARLA, a dynamic urban simulation environment. Our code is open-source and available online.
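The two-penalty idea described above can be illustrated with a minimal sketch. This is not the authors' reward function: the function name `shaped_reward`, the linear deviation term, and all constants are hypothetical and chosen only to show a collision penalty that dominates the penalty for leaving the planned path.

```python
def shaped_reward(deviation_m, collided,
                  follow_bonus=1.0, deviation_weight=1.0,
                  collision_penalty=100.0):
    """Hypothetical per-step reward (all constants are illustrative assumptions).

    deviation_m: distance in meters between the agent and the planner's path.
    collided:    True if the agent collided during this step.
    """
    if deviation_m < 0:
        raise ValueError("deviation_m must be non-negative")
    # Positive reward for staying on the planned path, shrinking with deviation.
    reward = follow_bonus - deviation_weight * deviation_m
    # The collision penalty is much larger than any plausible deviation penalty,
    # so avoiding a collision outweighs following the planner.
    if collided:
        reward -= collision_penalty
    return reward


on_path = shaped_reward(0.0, collided=False)        # follows the planner safely
swerved = shaped_reward(2.0, collided=False)        # leaves the path to stay safe
crashed = shaped_reward(0.0, collided=True)         # follows the planner into a crash
```

With these assumed constants, swerving two meters off the path is still far better than colliding while exactly on the path, which is the ordering the abstract describes.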
An Experimental Study of Formula Embeddings for Automated Theorem Proving in First-Order Logic
Abdelaziz, Ibrahim, Thost, Veronika, Crouse, Maxwell, Fokoue, Achille
Automated theorem proving in first-order logic is an active research area which is successfully supported by machine learning. While there have been various proposals for encoding logical formulas into numerical vectors (from simple strings to much more involved graph-based embeddings), little is known about how these different encodings compare. In this paper, we study and experimentally compare pattern-based embeddings that are applied in current systems with popular graph-based encodings, most of which have not been considered in the theorem proving context before. Our experiments show that some graph-based encodings help find much shorter proofs and may yield better performance in terms of the number of completed proofs. However, as expected, a detailed analysis shows the trade-offs in terms of runtime.
A Survey on Knowledge Graphs: Representation, Acquisition and Applications
Ji, Shaoxiong, Pan, Shirui, Cambria, Erik, Marttinen, Pekka, Yu, Philip S.
Human knowledge provides a formal understanding of the world. Knowledge graphs that represent structural relations between entities have become an increasingly popular research direction towards cognition and human-level intelligence. In this survey, we provide a comprehensive review of knowledge graphs covering research topics on 1) knowledge graph representation learning, 2) knowledge acquisition and completion, 3) temporal knowledge graphs, and 4) knowledge-aware applications, and summarize recent breakthroughs and prospective directions to facilitate future research. We propose a full-view categorization and new taxonomies on these topics. Knowledge graph embedding is organized from four aspects: representation space, scoring function, encoding models, and auxiliary information. For knowledge acquisition, especially knowledge graph completion, embedding methods, path inference, and logical rule reasoning are reviewed. We further explore several emerging topics including meta relational learning, commonsense reasoning, and temporal knowledge graphs. To facilitate future research on knowledge graphs, we also provide a curated collection of datasets and open-source libraries on different tasks. Finally, we give a thorough outlook on several promising research directions.
Solving Billion-Scale Knapsack Problems
Zhang, Xingwen, Qi, Feng, Hua, Zhigang, Yang, Shuang
Knapsack problems (KPs) are common in industry, but solving KPs is known to be NP-hard and has been tractable only at a relatively small scale. This paper examines KPs in a slightly generalized form and shows that they can be solved nearly optimally at scale via distributed algorithms. The proposed approach can be implemented fairly easily with off-the-shelf distributed computing frameworks (e.g., MPI, Hadoop, Spark). As an example, our implementation leads to one of the most efficient KP solvers known to date, capable of solving KPs at an unprecedented scale (e.g., KPs with 1 billion decision variables and 1 billion constraints can be solved within 1 hour). The system has been deployed to production and is called on a daily basis, yielding significant business impact at Ant Financial.
Bertrand-DR: Improving Text-to-SQL using a Discriminative Re-ranker
Kelkar, Amol, Relan, Rohan, Bhardwaj, Vaishali, Vaichal, Saurabh, Relan, Peter
To access data stored in relational databases, users need to understand the database schema and write a query using a query language such as SQL. To simplify this task, text-to-SQL models attempt to translate a user's natural language question into the corresponding SQL query. Recently, several generative text-to-SQL models have been developed. We propose a novel discriminative re-ranker to improve the performance of generative text-to-SQL models by extracting the best SQL query from the beam output predicted by the text-to-SQL generator, resulting in improved performance in the cases where the best query was in the candidate list, but not at the top of the list. We build the re-ranker as a schema-agnostic BERT fine-tuned classifier. We analyze the relative strengths of the text-to-SQL and re-ranker models across different query hardness levels, and suggest how to combine the two models for optimal performance. We demonstrate the effectiveness of the re-ranker by applying it to two state-of-the-art text-to-SQL models, and achieve a top-4 score on the Spider leaderboard at the time of writing this article.
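The re-ranking step described above can be sketched in a few lines. This is only an illustration under stated assumptions: `rerank_beam` and `toy_score` are hypothetical names, and the toy scorer stands in for the fine-tuned BERT classifier, which is not reproduced here.

```python
def rerank_beam(question, beam, score_fn):
    """Reorder beam candidates by a discriminative score, highest first.

    beam:     SQL strings in the generator's original order (best first).
    score_fn: hypothetical callable (question, sql) -> float, standing in
              for a trained classifier's confidence that sql is correct.
    Python's sort is stable, so ties keep the generator's original order.
    """
    return sorted(beam, key=lambda sql: score_fn(question, sql), reverse=True)


def toy_score(question, sql):
    """Toy stand-in scorer: fraction of question words that appear in the SQL."""
    words = question.lower().split()
    return sum(word in sql.lower() for word in words) / len(words)


question = "count singers"
beam = ["SELECT name FROM singers", "SELECT count(*) FROM singers"]
best = rerank_beam(question, beam, toy_score)[0]
```

Here the candidate the generator ranked second is promoted to the top, which is the case the abstract targets: the best query is in the candidate list but not in first position.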
Robust saliency maps with decoy-enhanced saliency score
Lu, Yang, Guo, Wenbo, Xing, Xinyu, Noble, William Stafford
Saliency methods help to make deep neural network predictions more interpretable by identifying particular features, such as pixels in an image, that contribute most strongly to the network's prediction. Unfortunately, recent evidence suggests that many saliency methods perform poorly when gradients are saturated or in the presence of strong inter-feature dependence or noise injected by an adversarial attack. In this work, we propose to infer robust saliency scores by integrating the saliency scores of a set of decoys with a novel decoy-enhanced saliency score, in which the decoys are generated by either solving an optimization problem or blurring the original input. We show theoretically that our method compensates for gradient saturation and considers joint activation patterns of pixels. We also apply our method to three different CNNs (VGGNet, AlexNet, and ResNet) trained on the ImageNet data set. The empirical results show both qualitatively and quantitatively that our method outperforms raw scores produced by three existing saliency methods, even in the presence of adversarial attacks.
DYNOTEARS: Structure Learning from Time-Series Data
Pamfil, Roxana, Sriwattanaworachai, Nisara, Desai, Shaan, Pilgerstorfer, Philip, Beaumont, Paul, Georgatzis, Konstantinos, Aragam, Bryon
In this paper, we revisit the structure learning problem for dynamic Bayesian networks and propose a method that simultaneously estimates contemporaneous (intra-slice) and time-lagged (inter-slice) relationships between variables in a time-series. Our approach is score-based, and revolves around minimizing a penalized loss subject to an acyclicity constraint. To solve this problem, we leverage a recent algebraic result characterizing the acyclicity constraint as a smooth equality constraint. The resulting algorithm, which we call DYNOTEARS, outperforms other methods on simulated data, especially in high dimensions, as the number of variables increases. We also apply this algorithm to real datasets from two different domains, finance and molecular biology, and analyze the resulting output. Compared to state-of-the-art methods for learning dynamic Bayesian networks, our method is both scalable and accurate on real data. The simple formulation and competitive performance of our method make it suitable for a variety of problems where one seeks to learn connections between variables across time.
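The abstract does not state the smooth acyclicity constraint itself. The sketch below assumes it is the trace-of-matrix-exponential characterization known from NOTEARS, h(W) = tr(exp(W ∘ W)) − d, which is zero exactly when the weighted graph W is acyclic. The function names and the truncated power series are illustrative choices, not the authors' implementation.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity(w, terms=20):
    """Approximate h(W) = tr(exp(W o W)) - d with a truncated power series.

    w:     d x d weight matrix; w[i][j] != 0 means an edge i -> j.
    terms: number of series terms (an assumed, illustrative cutoff).
    """
    d = len(w)
    squared = [[x * x for x in row] for row in w]  # elementwise square, W o W
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    trace_sum, factorial = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            power = matmul(power, squared)  # now holds (W o W)^k
            factorial *= k                  # now holds k!
        trace_sum += sum(power[i][i] for i in range(d)) / factorial
    return trace_sum - d


dag = [[0.0, 1.5], [0.0, 0.0]]     # single edge 0 -> 1, no cycle
cyclic = [[0.0, 1.0], [1.0, 0.0]]  # edges 0 -> 1 and 1 -> 0 form a cycle
h_dag, h_cyclic = acyclicity(dag), acyclicity(cyclic)
```

For the acyclic example the value is zero, while the two-node cycle gives a strictly positive value, so an optimizer can drive h toward zero as an equality constraint.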
Accelerating Cooperative Planning for Automated Vehicles with Learned Heuristics and Monte Carlo Tree Search
Kurzer, Karl, Fechner, Marcus, Zöllner, J. Marius
Efficient driving in urban traffic scenarios requires foresight. The observation of other traffic participants, and the inference of their possible next actions depending on one's own action, is considered cooperative prediction and planning. Humans are well equipped with the capability to predict the actions of multiple interacting traffic participants and plan accordingly, without the need to directly communicate with others. Prior work has shown that it is possible to achieve effective cooperative planning without the need for explicit communication. However, the search space for cooperative plans is so large that the vast majority of the computational budget is spent on exploring the search space in unpromising regions that are far away from the solution. To accelerate the planning process, we combined learned heuristics with a cooperative planning method in order to guide the search towards regions with promising actions, yielding better results at lower computational costs. Cooperative planning methods consider the mutual dependence of actions in multi-agent environments, as opposed to methods that reduce multi-agent environments to single-agent environments, with other agents' actions being independent of one another.
Active Learning for Identification of Linear Dynamical Systems
Wagenmaker, Andrew, Jamieson, Kevin
We propose an algorithm to actively estimate the parameters of a linear dynamical system. Given complete control over the system's input, our algorithm adaptively chooses the inputs to accelerate estimation. We show a finite time bound quantifying the estimation rate our algorithm attains and prove matching upper and lower bounds which guarantee its asymptotic optimality, up to constants. In addition, we show that this optimal rate is unattainable when using Gaussian noise to excite the system, even with optimally tuned covariance, and analyze several examples where our algorithm provably improves over rates obtained by playing noise. Our analysis critically relies on a novel result quantifying the error in estimating the parameters of a dynamical system when arbitrary periodic inputs are being played. We conclude with numerical examples that illustrate the effectiveness of our algorithm in practice.
WeatherBench: A benchmark dataset for data-driven weather forecasting
Rasp, Stephan, Dueben, Peter D., Scher, Sebastian, Weyn, Jonathan A., Mouatadid, Soukayna, Thuerey, Nils
Data-driven approaches, most prominently deep learning, have become powerful tools for prediction in many domains. A natural question to ask is whether data-driven methods could also be used for numerical weather prediction. First studies show promise, but the lack of a common dataset and evaluation metrics makes inter-comparison between studies difficult. Here we present a benchmark dataset for data-driven medium-range weather forecasting, a topic of high scientific interest for atmospheric and computer scientists alike. We provide data derived from the ERA5 archive that has been processed to facilitate its use in machine learning models. We propose a simple and clear evaluation metric which will enable a direct comparison between different methods. Further, we provide baseline scores from simple linear regression techniques, deep learning models, as well as purely physical forecasting models. All data is publicly available and the companion code is reproducible with tutorials for getting started. We hope that this dataset will accelerate research in data-driven weather forecasting.