Optimization
A Hybrid Learning and Optimization Framework to Achieve Physically Interactive Tasks with Mobile Manipulators
Zhao, Jianzhuang, Giammarino, Alberto, Lamon, Edoardo, Gandarias, Juan M., De Momi, Elena, Ajoudani, Arash
This paper proposes a hybrid learning and optimization framework for mobile manipulators for complex and physically interactive tasks. The framework exploits an admittance-type physical interface to obtain intuitive and simplified human demonstrations and Gaussian Mixture Model (GMM)/Gaussian Mixture Regression (GMR) to encode and generate the learned task requirements in terms of position, velocity, and force profiles. Next, using the desired trajectories and force profiles generated by GMM/GMR, the impedance parameters of a Cartesian impedance controller are optimized online through a Quadratic Program augmented with an energy tank to ensure the passivity of the controlled system. Two experiments are conducted to validate the framework, comparing our method with two approaches with constant stiffness (high and low). The results showed that the proposed method outperforms the other two cases in terms of trajectory tracking and generated interaction forces, even in the presence of disturbances such as unexpected end-effector collisions.
Intelligent decision-making method of TBM operating parameters based on multiple constraints and objective optimization
Liu, Bin, Wang, Jiwen, Wang, Ruirui, Wang, Yaxu, Zhao, Guangzu
The decision-making of TBM operating parameters has an important guiding significance for TBM safe and efficient construction, and it has been one of the research hotpots in the field of TBM tunneling. For this purpose, this paper introduces rock-breaking rules into machine learning method, and a rock-machine mapping dual-driven by physical-rule and data-mining is established with high accuracy. This dual-driven mappings are subsequently used as objective function and constraints to build a decision-making method for TBM operating parameters. By searching the revolution per minute and penetration corresponding to the extremum of the objective function subject to the constraints, the optimal operating parameters can be obtained. This method is verified in the field of the Second Water Source Channel of Hangzhou, China, resulting in the average penetration rate increased by 11.3%, and the total cost decreased by 10.0%, which proves the practicability and effectiveness of the developed decision-making model.
Sampling, Communication, and Prediction Co-Design for Synchronizing the Real-World Device and Digital Model in Metaverse
Meng, Zhen, She, Changyang, Zhao, Guodong, De Martini, Daniele
The metaverse has the potential to revolutionize the next generation of the Internet by supporting highly interactive services with the help of Mixed Reality (MR) technologies; still, to provide a satisfactory experience for users, the synchronization between the physical world and its digital models is crucial. This work proposes a sampling, communication and prediction co-design framework to minimize the communication load subject to a constraint on tracking the Mean Squared Error (MSE) between a real-world device and its digital model in the metaverse. To optimize the sampling rate and the prediction horizon, we exploit expert knowledge and develop a constrained Deep Reinforcement Learning (DRL) algorithm, named Knowledge-assisted Constrained Twin-Delayed Deep Deterministic (KC-TD3) policy gradient algorithm. We validate our framework on a prototype composed of a real-world robotic arm and its digital model. Compared with existing approaches: (1) When the tracking error constraint is stringent (MSE=0.002 degrees), our policy degenerates into the policy in the sampling-communication co-design framework. (2) When the tracking error constraint is mild (MSE=0.007 degrees), our policy degenerates into the policy in the prediction-communication co-design framework. (3) Our framework achieves a better trade-off between the average MSE and the average communication load compared with a communication system without sampling and prediction. For example, the average communication load can be reduced up to 87% when the track error constraint is 0.002 degrees. (4) Our policy outperforms the benchmark with the static sampling rate and prediction horizon optimized by exhaustive search, in terms of the tail probability of the tracking error. Furthermore, with the assistance of expert knowledge, the proposed algorithm KC-TD3 achieves better convergence time, stability, and final policy performance.
Multi-Modal Multi-Agent Optimization for LIMMS, A Modular Robotics Approach to Delivery Automation
Lin, Xuan, Fernandez, Gabriel, Liu, Yeting, Zhu, Taoyuanmin, Shirai, Yuki, Hong, Dennis
Abstract-- In this paper we present a motion planner for LIMMS, a modular multi-agent, multi-modal package delivery platform. A single LIMMS unit is a robot that can operate as an arm or leg depending on how and what it is attached to, e.g., a manipulator when it is anchored to walls within a delivery vehicle or a quadruped robot when 4 are attached to a box. Coordinating amongst multiple LIMMS, when each one can take on vastly different roles, can quickly become complex. The formulation is then solved for skill exploration and can be implemented on hardware after refinement. To solve this optimization problem we use alternating direction method of multipliers (ADMM). The proposed planner is experimented under various scenarios which shows the capability of LIMMS to enter into different modes or combinations of them to achieve their goal of moving shipping boxes.
Formal guarantees for heuristic optimization algorithms used in machine learning
Recently, Stochastic Gradient Descent (SGD) and its variants have become the dominant methods in the large-scale optimization of machine learning (ML) problems. A variety of strategies have been proposed for tuning the step sizes, ranging from adaptive step sizes to heuristic methods to change the step size in each iteration. Also, momentum has been widely employed in ML tasks to accelerate the training process. Yet, there is a gap in our theoretical understanding of them. In this work, we start to close this gap by providing formal guarantees to a few heuristic optimization methods and proposing improved algorithms. First, we analyze a generalized version of the AdaGrad (Delayed AdaGrad) step sizes in both convex and non-convex settings, showing that these step sizes allow the algorithms to automatically adapt to the level of noise of the stochastic gradients. We show for the first time sufficient conditions for Delayed AdaGrad to achieve almost sure convergence of the gradients to zero. Moreover, we present a high probability analysis for Delayed AdaGrad and its momentum variant in the non-convex setting. Second, we analyze SGD with exponential and cosine step sizes, which are empirically successful but lack theoretical support. We provide the very first convergence guarantees for them in the smooth and non-convex setting, with and without the Polyak-{\L}ojasiewicz (PL) condition. We also show their good property of adaptivity to noise under the PL condition. Third, we study the last iterate of momentum methods. We prove the first lower bound in the convex setting for the last iterate of SGD with constant momentum. Moreover, we investigate a class of Follow-The-Regularized-Leader-based momentum algorithms with increasing momentum and shrinking updates. We show that their last iterate has optimal convergence for unconstrained convex stochastic optimization problems.
Path-Tree Optimization in Discrete Partially Observable Environments using Rapidly-Exploring Belief-Space Graphs
Phiquepal, Camille, Orthey, Andreas, Viennot, Nicolas, Toussaint, Marc
Robots often need to solve path planning problems where essential and discrete aspects of the environment are partially observable. This introduces a multi-modality, where the robot must be able to observe and infer the state of its environment. To tackle this problem, we introduce the Path-Tree Optimization (PTO) algorithm which plans a path-tree in belief-space. A path-tree is a tree-like motion with branching points where the robot receives an observation leading to a belief-state update. The robot takes different branches depending on the observation received. The algorithm has three main steps. First, a rapidly-exploring random graph (RRG) on the state space is grown. Second, the RRG is expanded to a belief-space graph by querying the observation model. In a third step, dynamic programming is performed on the belief-space graph to extract a path-tree. The resulting path-tree combines exploration with exploitation i.e. it balances the need for gaining knowledge about the environment with the need for reaching the goal. We demonstrate the algorithm capabilities on navigation and mobile manipulation tasks, and show its advantage over a baseline using a task and motion planning approach (TAMP) both in terms of optimality and runtime.
Bipedal Locomotion Optimization by Exploitation of the Full Dynamics in DCM Trajectory Planning
Vedadi, Amirhosein, Sinaei, Kasra, Abdolahnezhad, Pezhman, Aboumasoudi, Shahriar Sheikh, Yousefi-Koma, Aghil
Walking motion planning based on Divergent Component of Motion (DCM) and Linear Inverted Pendulum Model (LIPM) is one of the alternatives that could be implemented to generate online humanoid robot gait trajectories. This algorithm requires different parameters to be adjusted. Herein, we developed a framework to attain optimal parameters to achieve a stable and energy-efficient trajectory for real robot's gait. To find the optimal trajectory, four cost functions representing energy consumption, the sum of joints velocity and applied torque at each lower limb joint of the robot, and a cost function based on the Zero Moment Point (ZMP) stability criterion were considered. Genetic algorithm was employed in the framework to optimize each of these cost functions. Although the trajectory planning was done with the help of the simplified model, the values of each cost function were obtained by considering the full dynamics model and foot-ground contact model in Bullet physics engine simulator. The results of this optimization yield that walking with the most stability and walking in the most efficient way are in contrast with each other. Therefore, in another attempt, multi-objective optimization for ZMP and energy cost functions at three different speeds was performed. Finally, we compared the designed trajectory, which was generated using optimal parameters, with the simulation results in Choreonoid simulator.
Learning to Use Chopsticks in Diverse Gripping Styles
Yang, Zeshi, Yin, KangKang, Liu, Libin
Learning dexterous manipulation skills is a long-standing challenge in computer graphics and robotics, especially when the task involves complex and delicate interactions between the hands, tools and objects. In this paper, we focus on chopsticks-based object relocation tasks, which are common yet demanding. The key to successful chopsticks skills is steady gripping of the sticks that also supports delicate maneuvers. We automatically discover physically valid chopsticks holding poses by Bayesian Optimization (BO) and Deep Reinforcement Learning (DRL), which works for multiple gripping styles and hand morphologies without the need of example data. Given as input the discovered gripping poses and desired objects to be moved, we build physics-based hand controllers to accomplish relocation tasks in two stages. First, kinematic trajectories are synthesized for the chopsticks and hand in a motion planning stage. The key components of our motion planner include a grasping model to select suitable chopsticks configurations for grasping the object, and a trajectory optimization module to generate collision-free chopsticks trajectories. Then we train physics-based hand controllers through DRL again to track the desired kinematic trajectories produced by the motion planner. We demonstrate the capabilities of our framework by relocating objects of various shapes and sizes, in diverse gripping styles and holding positions for multiple hand morphologies. Our system achieves faster learning speed and better control robustness, when compared to vanilla systems that attempt to learn chopstick-based skills without a gripping pose optimization module and/or without a kinematic motion planner.
PUSH: a primal heuristic based on Feasibility PUmp and SHifting
Grani, Giorgio, Coppola, Corrado, Agasucci, Valerio
Since MIP linear problems include both continuous and integer variables, they are proved to belong to the NP-hard class (see [38] for a more detailed analysis), meaning that they are not solvable in polynomial time. The complete exploration of the integer feasible set, whose cardinality grows exponentially with the number of variables, is yet possible to achieve the optimal solution, but for most of the practically significant instances, it would require unacceptable computational effort. In fact, the only way to solve to optimality any mixed-integer problem is to apply some of the well-known Branch and Bound techniques. However, despite combinatorial optimization community provided a great deal of these algorithms, for which the reader should refer to [31, 34, 16], MIP problems complexity is inherent with their belonging to NP-hard class. Therefore, when tackling MIP problems, one either seeks particular structures allowing to bring down the complexity, such as the availability, for a given class of problems, of the optimal formulation or exploits cutting plane generation to dramatically reduce the feasible region dimension. However, we often encounter MIP problems without having any prior knowledge of possible structures and, thus, pursuing the globally optimal solution could be in practice impossible or inefficient, since for our purpose a sub-optimal approximation is considered to be good enough. This makes heuristics one of the most widespread and feasible ways to achieve sub-optimal solutions of MIP problems within an affordable computational time. For the purpose of highlighting the perspective of our research, we can define two classes of MIP heuristics: improvement heuristics and start heuristics.
A Collection of Quality Diversity Optimization Problems Derived from Hyperparameter Optimization of Machine Learning Models
Schneider, Lennart, Pfisterer, Florian, Thomas, Janek, Bischl, Bernd
The goal of Quality Diversity Optimization is to generate a collection of diverse yet high-performing solutions to a given problem at hand. Typical benchmark problems are, for example, finding a repertoire of robot arm configurations or a collection of game playing strategies. In this paper, we propose a set of Quality Diversity Optimization problems that tackle hyperparameter optimization of machine learning models - a so far underexplored application of Quality Diversity Optimization. Our benchmark problems involve novel feature functions, such as interpretability or resource usage of models. To allow for fast and efficient benchmarking, we build upon YAHPO Gym, a recently proposed open source benchmarking suite for hyperparameter optimization that makes use of high performing surrogate models and returns these surrogate model predictions instead of evaluating the true expensive black box function. We present results of an initial experimental study comparing different Quality Diversity optimizers on our benchmark problems. Furthermore, we discuss future directions and challenges of Quality Diversity Optimization in the context of hyperparameter optimization.