Building a Tree-Structured Parzen Estimator from Scratch (Kind Of)

#artificialintelligence

The way a machine learning model fits itself to data is governed by a set of initial conditions called hyperparameters. Hyperparameters constrain the learning behavior of a model so that it will (hopefully) fit the data well and within a reasonable amount of time. Finding the best set of hyperparameters (often called "tuning") is one of the most important and time-consuming parts of the modeling task. Historical approaches to hyperparameter tuning involve either a brute-force or a random search over a grid of hyperparameter combinations, called Grid Search and Random Search, respectively. Although popular, Grid and Random Search methods do not use the results of earlier trials to guide later ones, so they have no way of converging to a decent set of hyperparameters; that is, they are purely trial and error.
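To make the two baseline strategies concrete, here is a minimal sketch of Grid Search and Random Search. The `objective` function, the hyperparameter names (`learning_rate`, `depth`), and the search ranges are all hypothetical stand-ins for a real validation loss; they are not taken from the article.

```python
import itertools
import random


def objective(learning_rate, depth):
    """Hypothetical validation loss (lower is better); a stand-in for model training."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def grid_search(lr_values, depth_values):
    """Evaluate every combination on the grid; return (best_loss, best_params)."""
    best = None
    for lr, depth in itertools.product(lr_values, depth_values):
        loss = objective(lr, depth)
        if best is None or loss < best[0]:
            best = (loss, {"learning_rate": lr, "depth": depth})
    return best


def random_search(n_trials, seed=0):
    """Sample combinations independently at random; return (best_loss, best_params)."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-3, 0)   # log-uniform over an assumed range [1e-3, 1]
        depth = rng.randint(2, 10)      # assumed integer range, inclusive
        loss = objective(lr, depth)
        if best is None or loss < best[0]:
            best = (loss, {"learning_rate": lr, "depth": depth})
    return best


if __name__ == "__main__":
    print(grid_search([0.01, 0.1, 1.0], [2, 6, 10]))
    print(random_search(50))
```

Neither function looks at earlier losses when choosing the next point to try, which is exactly the limitation described above.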


GUTS: Generalized Uncertainty-Aware Thompson Sampling for Multi-Agent Active Search

arXiv.org Artificial Intelligence

Robotic solutions for quick disaster response are essential to ensure minimal loss of life, especially when the search area is too dangerous or too vast for human rescuers. We model this problem as an asynchronous multi-agent active-search task where each robot aims to efficiently seek objects of interest (OOIs) in an unknown environment. This formulation addresses the requirement that search missions should focus on quick recovery of OOIs rather than full coverage of the search region. Previous approaches fail to accurately model sensing uncertainty, account for occlusions due to foliage or terrain, or consider the requirement for heterogeneous search teams and robustness to hardware and communication failures. We present the Generalized Uncertainty-aware Thompson Sampling (GUTS) algorithm, which addresses these issues and is suitable for deployment on heterogeneous multi-robot systems for active search in large unstructured environments. We show through simulation experiments that GUTS consistently outperforms existing methods such as parallelized Thompson Sampling and exhaustive search, recovering all OOIs in 80% of all runs. In contrast, existing approaches recover all OOIs in less than 40% of all runs. We conduct field tests using our multi-robot system in an unstructured environment with a search area of approximately 75,000 sq. m. Our system demonstrates robustness to various failure modes, achieving full recovery of OOIs (where feasible) in every field run, and significantly outperforming our baseline.


Online augmentation of learned grasp sequence policies for more adaptable and data-efficient in-hand manipulation

arXiv.org Artificial Intelligence

When using a tool, the grasps used for picking it up, reposing, and holding it in a suitable pose for the desired task could be distinct. Therefore, a key challenge for autonomous in-hand tool manipulation is finding a sequence of grasps that facilitates every step of the tool use process while continuously maintaining force closure and stability. Due to the complexity of modeling the contact dynamics, reinforcement learning (RL) techniques can provide a solution in this continuous space subject to highly parameterized physical models. However, these techniques impose a trade-off in adaptability and data efficiency. At test time the tool properties, desired trajectory, and desired application forces could differ substantially from training scenarios. Adapting to this necessitates more data or computationally expensive online policy updates. In this work, we apply the principles of discrete dynamic programming (DP) to augment RL performance with domain knowledge. Specifically, we first design a computationally simple approximation of our environment. We then demonstrate in physical simulation that performing tree searches (i.e., lookaheads) and policy rollouts with this approximation can improve an RL-derived grasp sequence policy with minimal additional online computation. Additionally, we show that pretraining a deep RL network with the DP-derived solution to the discretized problem can speed up policy training.


Minimizing Running Buffers for Tabletop Object Rearrangement: Complexity, Fast Algorithms, and Applications

arXiv.org Artificial Intelligence

For rearranging objects on tabletops with overhand grasps, temporarily relocating objects to some buffer space may be necessary. This raises the natural question of how many simultaneous storage spaces, or "running buffers", are required so that certain classes of tabletop rearrangement problems are feasible. In this work, we examine the problem for both labeled and unlabeled settings. On the structural side, we observe that finding the minimum number of running buffers (MRB) can be carried out on a dependency graph abstracted from a problem instance, and show that computing MRB is NP-hard. We then prove that under both labeled and unlabeled settings, even for uniform cylindrical objects, the number of required running buffers may grow unbounded as the number of objects to be rearranged increases. We further show that the bound for the unlabeled case is tight. On the algorithmic side, we develop effective exact algorithms for finding MRB for both labeled and unlabeled tabletop rearrangement problems, scalable to over a hundred objects under very high object density. More importantly, our algorithms also compute a sequence witnessing the computed MRB that can be used for solving object rearrangement tasks. Employing these algorithms, empirical evaluations reveal that random labeled and unlabeled instances, which more closely mimic real-world setups, generally have fairly small MRBs. Using real robot experiments, we demonstrate that the running buffer abstraction leads to state-of-the-art solutions for in-place rearrangement of many objects in a tight, bounded workspace.


Learned Tree Search for Long-Horizon Social Robot Navigation in Shared Airspace

arXiv.org Artificial Intelligence

The fast-growing demand for fully autonomous aerial operations in shared spaces necessitates developing trustworthy agents that can safely and seamlessly navigate in crowded, dynamic spaces. In this work, we propose Social Robot Tree Search (SoRTS), an algorithm for the safe navigation of mobile robots in social domains. SoRTS aims to augment existing socially-aware trajectory prediction policies with a Monte Carlo Tree Search planner for improved downstream navigation of mobile robots. To evaluate the performance of our method, we choose the use case of social navigation for general aviation. To aid this evaluation, within this work, we also introduce X-PlaneROS, a high-fidelity aerial simulator, to enable more research in full-scale aerial autonomy. By conducting a user study based on the assessments of 26 FAA certified pilots, we show that SoRTS performs comparably to a competent human pilot, significantly outperforming our baseline algorithm. We further complement these results with self-play experiments in scenarios with increasing complexity.


Combinatorial Optimization enriched Machine Learning to solve the Dynamic Vehicle Routing Problem with Time Windows

arXiv.org Machine Learning

With the rise of e-commerce and increasing customer requirements, logistics service providers face a new complexity in their daily planning, mainly due to the need to handle same-day deliveries efficiently. Existing multi-stage stochastic optimization approaches that can solve the underlying dynamic vehicle routing problem are either computationally too expensive for an application in online settings, or, in the case of reinforcement learning, struggle to perform well on high-dimensional combinatorial problems. To mitigate these drawbacks, we propose a novel machine learning pipeline that incorporates a combinatorial optimization layer. We apply this general pipeline to a dynamic vehicle routing problem with dispatching waves, which was recently promoted in the EURO Meets NeurIPS Vehicle Routing Competition at NeurIPS 2022. Our methodology ranked first in this competition, outperforming all other approaches in solving the proposed dynamic vehicle routing problem. With this work, we provide a comprehensive numerical study that further highlights the efficacy and benefits of the proposed pipeline beyond the results achieved in the competition, e.g., by showcasing the robustness of the encoded policy against unseen instances and scenarios.


Significance of Minimax Optimization, Part 1 (Machine Learning)

#artificialintelligence

Abstract: In the paper, we study a class of nonconvex nonconcave minimax optimization problems (i.e., $\min_x \max_y f(x,y)$), where $f(x,y)$ is possibly nonconvex in $x$, and it is nonconcave and satisfies the Polyak-Lojasiewicz (PL) condition in $y$. Moreover, we propose a class of enhanced momentum-based gradient descent ascent methods (i.e., MSGDA and AdaMSGDA) to solve these stochastic Nonconvex-PL minimax problems. In particular, our AdaMSGDA algorithm can use various adaptive learning rates in updating the variables $x$ and $y$ without relying on any global and coordinate-wise adaptive learning rates. Theoretically, we present an effective convergence analysis framework for our methods. Specifically, we prove that our MSGDA and AdaMSGDA methods have the best known sample (gradient) complexity of $O(\epsilon^{-3})$, requiring only one sample at each loop, in finding an $\epsilon$-stationary solution (i.e., $\mathbb{E}\|\nabla F(x)\| \leq \epsilon$, where $F(x) = \max_y f(x,y)$).
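As a rough illustration of the gradient descent ascent idea that the abstract builds on, the sketch below runs plain deterministic GDA on a toy problem. It is not the paper's MSGDA or AdaMSGDA (it has no momentum, no stochastic sampling, and no adaptive learning rates), and the objective $f(x,y) = \tfrac{1}{2}x^2 + xy - \tfrac{1}{2}y^2$ is an assumption chosen for simplicity: it is strongly concave in $y$, a special case that satisfies the PL condition, and gives $F(x) = \max_y f(x,y) = x^2$.

```python
def grad_x(x, y):
    """Partial derivative of the toy objective f(x, y) = 0.5*x**2 + x*y - 0.5*y**2 in x."""
    return x + y


def grad_y(x, y):
    """Partial derivative of the same toy objective in y."""
    return x - y


def gradient_descent_ascent(x, y, lr=0.1, steps=200):
    """Plain simultaneous GDA: descend in x, ascend in y, using gradients at the current point."""
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - lr * gx, y + lr * gy
    return x, y


if __name__ == "__main__":
    # For this toy objective the iterates approach the saddle point (0, 0).
    print(gradient_descent_ascent(1.0, 1.0))
```

For this toy objective the stationarity measure in the abstract reduces to $|\nabla F(x)| = |2x|$, so driving $x$ toward zero corresponds to reaching an $\epsilon$-stationary solution.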


Partial Optimality in Cubic Correlation Clustering

arXiv.org Artificial Intelligence

The higher-order correlation clustering problem is an expressive model, and recently, local search heuristics have been proposed for several applications. Certifying optimality, however, is NP-hard and practically hampered already by the complexity of the problem statement. Here, we focus on establishing partial optimality conditions for the special case of complete graphs and cubic objective functions. In addition, we define and implement algorithms for testing these conditions and examine their effect numerically, on two datasets.


Sublinear Convergence Rates of Extragradient-Type Methods: A Survey on Classical and Recent Developments

arXiv.org Machine Learning

The generalized equation (also called the [non]linear inclusion) provides a unified template to model various problems in computational mathematics and related fields such as the optimality condition of optimization problems (in both unconstrained and constrained settings), minimax optimization, variational inequality, complementarity, two-person game, and fixed-point problems, see, e.g., [11, 24, 50, 112, 116, 118, 120]. Theory and numerical methods for this equation and its special cases have been extensively studied for many decades, see, e.g., the following monographs and the references quoted therein [11, 50, 94, 119]. At the same time, several applications of this mathematical tool in operations research, economics, uncertainty quantification, and transportation have been investigated [14, 52, 61, 50, 72]. In the last few years, there has been a surge of research in minimax problems due to new applications in machine learning and robust optimization, especially in generative adversarial networks (GANs), adversarial training, and distributionally robust optimization, see, e.g., [4, 14, 55, 76, 84, 114] as a few examples. Minimax problems have also found new applications in online learning and reinforcement learning, among many others, see, e.g., [4, 9, 15, 55, 67, 76, 78, 84, 114, 139]. Such prominent applications have motivated the research in minimax optimization and variational inequality problems (VIPs). On the one hand, classical algorithms such as gradient descent-ascent, extragradient, and primal-dual methods have been revisited, improved, and extended. On the other hand, new variants such as accelerated extragradient and accelerated operator splitting schemes have also been developed and equipped with rigorous convergence guarantees and practical performance evaluation. This new development motivates us to write this survey paper, with the focus on sublinear convergence rate analysis.
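The difference between two of the classical methods named above, gradient descent-ascent and extragradient, can be seen on a small example. The sketch below is an illustration only, not code from the survey: the bilinear objective $f(x,y) = xy$, the step size, and the function names are assumptions. On this problem the saddle point is $(0, 0)$; simultaneous gradient descent-ascent spirals away from it, while the extragradient step (evaluate the operator at a look-ahead point, then update from the original point) contracts toward it.

```python
def operator(x, y):
    """Saddle operator of the toy objective f(x, y) = x*y: (df/dx, -df/dy) = (y, -x)."""
    return y, -x


def gda(x, y, eta=0.1, steps=2000):
    """Simultaneous gradient descent-ascent: one operator evaluation per step."""
    for _ in range(steps):
        fx, fy = operator(x, y)
        x, y = x - eta * fx, y - eta * fy
    return x, y


def extragradient(x, y, eta=0.1, steps=2000):
    """Extragradient: look ahead half a step, then update using the look-ahead operator."""
    for _ in range(steps):
        fx, fy = operator(x, y)
        x_half, y_half = x - eta * fx, y - eta * fy      # extrapolation step
        fx_half, fy_half = operator(x_half, y_half)
        x, y = x - eta * fx_half, y - eta * fy_half      # update from the original point
    return x, y


if __name__ == "__main__":
    print("GDA:          ", gda(1.0, 1.0))
    print("Extragradient:", extragradient(1.0, 1.0))
```

With step size $\eta$, each GDA step scales the distance to the saddle point by $\sqrt{1 + \eta^2} > 1$, whereas each extragradient step scales it by $\sqrt{1 - \eta^2 + \eta^4} < 1$ for $0 < \eta < 1$, which is why only the latter converges here.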


Heuristic Search For Physics-Based Problems: Angry Birds in PDDL+

arXiv.org Artificial Intelligence

This paper studies how a domain-independent planner and combinatorial search can be employed to play Angry Birds, a well-established AI challenge problem. To model the game, we use PDDL+, a planning language for mixed discrete/continuous domains that supports durative processes and exogenous events. The paper describes the model and identifies key design decisions that reduce the problem complexity. In addition, we propose several domain-specific enhancements including heuristics and a search technique similar to preferred operators. Together, they alleviate the complexity of combinatorial search. We evaluate our approach by comparing its performance with dedicated domain-specific solvers on a range of Angry Birds levels. The results show that our performance is on par with these domain-specific approaches in most levels, even without using our domain-specific search enhancements.