Optimization
Biblio-Analysis of Cohort Intelligence (CI) Algorithm and its allied applications from Scopus and Web of Science Perspective
Kale, Ishaan, Joshi, Rahul, Kadam, Kalyani
Cohort Intelligence or CI is one of its kind of novel optimization algorithm. Since its inception, in a very short span it is applied successfully in various domains and its results are observed to be effectual in contrast to algorithm of its kind. Till date, there is no such type of bibliometric analysis carried out on CI and its related applications. So, this research paper in a way will be an ice breaker for those who want to take up CI to a new level. In this research papers, CI publications available in Scopus are analyzed through graphs, networked diagrams about authors, source titles, keywords over the years, journals over the time. In a way this bibliometric paper showcase CI, its applications and detail outs systematic review in terms its bibliometric details.
A Survey of Neural Trees
Li, Haoling, Song, Jie, Xue, Mengqi, Zhang, Haofei, Ye, Jingwen, Cheng, Lechao, Song, Mingli
Neural networks (NNs) and decision trees (DTs) are both popular models of machine learning, yet coming with mutually exclusive advantages and limitations. To bring the best of the two worlds, a variety of approaches are proposed to integrate NNs and DTs explicitly or implicitly. In this survey, these approaches are organized in a school which we term as neural trees (NTs). This survey aims to present a comprehensive review of NTs and attempts to identify how they enhance the model interpretability. We first propose a thorough taxonomy of NTs that expresses the gradual integration and co-evolution of NNs and DTs. Afterward, we analyze NTs in terms of their interpretability and performance, and suggest possible solutions to the remaining challenges. Finally, this survey concludes with a discussion about other considerations like conditional computation and promising directions towards this field. A list of papers reviewed in this survey, along with their corresponding codes, is available at: https://github.com/zju-vipa/awesome-neural-trees
How the Adam Optimization technique works(Artificial Intelligence)
Abstract: A common way to train neural networks is the Backpropagation. This algorithm includes a gradient descent method, which needs an adaptive step size. In the area of neural networks, the ADAM-Optimizer is one of the most popular adaptive step size methods. The 5865 citations in only three years shows additionally the importance of the given paper. We discovered that the given convergence proof of the optimizer contains some mistakes, so that the proof will be wrong.
Model-Based Policy Search Using Monte Carlo Gradient Estimation with Real Systems Application
Amadio, Fabio, Libera, Alberto Dalla, Antonello, Riccardo, Nikovski, Daniel, Carli, Ruggero, Romeres, Diego
In this paper, we present a Model-Based Reinforcement Learning (MBRL) algorithm named \emph{Monte Carlo Probabilistic Inference for Learning COntrol} (MC-PILCO). The algorithm relies on Gaussian Processes (GPs) to model the system dynamics and on a Monte Carlo approach to estimate the policy gradient. This defines a framework in which we ablate the choice of the following components: (i) the selection of the cost function, (ii) the optimization of policies using dropout, (iii) an improved data efficiency through the use of structured kernels in the GP models. The combination of the aforementioned aspects affects dramatically the performance of MC-PILCO. Numerical comparisons in a simulated cart-pole environment show that MC-PILCO exhibits better data efficiency and control performance w.r.t. state-of-the-art GP-based MBRL algorithms. Finally, we apply MC-PILCO to real systems, considering in particular systems with partially measurable states. We discuss the importance of modeling both the measurement system and the state estimators during policy optimization. The effectiveness of the proposed solutions has been tested in simulation and on two real systems, a Furuta pendulum and a ball-and-plate rig.
$1D$ to $nD$: A Meta Algorithm for Multivariate Global Optimization via Univariate Optimizers
In this work, we propose a meta algorithm that can solve a multivariate global optimization problem using univariate global optimizers. Although the univariate global optimization does not receive much attention compared to the multivariate case, which is more emphasized in academia and industry; we show that it is still relevant and can be directly used to solve problems of multivariate optimization. We also provide the corresponding regret bounds in terms of the time horizon $T$ and the average regret of the univariate optimizer, when it is robust against nonnegative noises with robust regret guarantees.
A Combined Inverse Kinematics Algorithm Using FABRIK with Optimization
Xu, Zichun, Li, Yuntao, Yang, Xiaohang, Zhao, Zhiyuan, Zhao, Jingdong, Liu, Hong
Forward and backward reaching inverse kinematics (FABRIK) is a heuristic inverse kinematics solver that is gradually applied to manipulators with the advantages of fast convergence and generating more realistic configurations. However, under the high error constraint, FABRIK exhibits unstable convergence behavior, which is unsatisfactory for the real-time motion planning of manipulators. In this paper, a novel inverse kinematics algorithm that combines FABRIK and the sequential quadratic programming (SQP) algorithm is presented, in which the joint angles deduced by FABRIK will be taken as the initial seed of the SQP algorithm to avoid getting stuck in local minima. The combined algorithm is evaluated with experiments, in which our algorithm can achieve higher success rates and faster solution times than FABRIK under the high error constraint. Furthermore, the combined algorithm can generate continuous trajectories for the UR5 and KUKA LBR IIWA 14 R820 manipulators in path tracking with no pose error and permitted position error of the end-effector.
Achieving Model Fairness in Vertical Federated Learning
Liu, Changxin, Fan, Zhenan, Zhou, Zirui, Shi, Yang, Pei, Jian, Chu, Lingyang, Zhang, Yong
Vertical federated learning (VFL) has attracted greater and greater interest since it enables multiple parties possessing non-overlapping features to strengthen their machine learning models without disclosing their private data and model parameters. Similar to other machine learning algorithms, VFL faces demands and challenges of fairness, i.e., the learned model may be unfairly discriminatory over some groups with sensitive attributes. To tackle this problem, we propose a fair VFL framework in this work. First, we systematically formulate the problem of training fair models in VFL, where the learning task is modelled as a constrained optimization problem. To solve it in a federated and privacy-preserving manner, we consider the equivalent dual form of the problem and develop an asynchronous gradient coordinate-descent ascent algorithm, where some active data parties perform multiple parallelized local updates per communication round to effectively reduce the number of communication rounds. The messages that the server sends to passive parties are deliberately designed such that the information necessary for local updates is released without intruding on the privacy of data and sensitive attributes. We rigorously study the convergence of the algorithm when applied to general nonconvex-concave min-max problems. We prove that the algorithm finds a $\delta$-stationary point of the dual objective in $\mathcal{O}(\delta^{-4})$ communication rounds under mild conditions. Finally, the extensive experiments on three benchmark datasets demonstrate the superior performance of our method in training fair models.
How Hill Climbing Algorithm Works(Artificial Intelligence)
Abstract: Neural networks have now long been used for solving complex problems of image domain, yet designing the same needs manual expertise. Furthermore, techniques for automatically generating a suitable deep learning architecture for a given dataset have frequently made use of reinforcement learning and evolutionary methods which take extensive computational resources and time. We propose a new framework for neural architecture search based on a hill-climbing procedure using morphism operators that makes use of a novel gradient update scheme. The update is based on the aging of neural network layers and results in the reduction in the overall training time. This technique can search in a broader search space which subsequently yields competitive results.
The Proxy Step-size Technique for Regularized Optimization on the Sphere Manifold
We give an effective solution to the regularized optimization problem $g (\boldsymbol{x}) + h (\boldsymbol{x})$, where $\boldsymbol{x}$ is constrained on the unit sphere $\Vert \boldsymbol{x} \Vert_2 = 1$. Here $g (\cdot)$ is a smooth cost with Lipschitz continuous gradient within the unit ball $\{\boldsymbol{x} : \Vert \boldsymbol{x} \Vert_2 \le 1 \}$ whereas $h (\cdot)$ is typically non-smooth but convex and absolutely homogeneous, \textit{e.g.,}~norm regularizers and their combinations. Our solution is based on the Riemannian proximal gradient, using an idea we call \textit{proxy step-size} -- a scalar variable which we prove is monotone with respect to the actual step-size within an interval. The proxy step-size exists ubiquitously for convex and absolutely homogeneous $h(\cdot)$, and decides the actual step-size and the tangent update in closed-form, thus the complete proximal gradient iteration. Based on these insights, we design a Riemannian proximal gradient method using the proxy step-size. We prove that our method converges to a critical point, guided by a line-search technique based on the $g(\cdot)$ cost only. The proposed method can be implemented in a couple of lines of code. We show its usefulness by applying nuclear norm, $\ell_1$ norm, and nuclear-spectral norm regularization to three classical computer vision problems. The improvements are consistent and backed by numerical experiments.
Bayesian Calibration for Activity Based Models
Schultz, Laura, Auld, Joshua, Sokolov, Vadim
Transportation activity-based simulators (ABMs) represent an individual traveler's activity patterns and trips throughout the day by using nested choice models. The generated trips are then simulated in a traffic flow simulator to learn system-level patterns. These behaviorally-realistic models require a high-resolution representation of network flows and, thus, are computationally expensive. The very same flexibility which makes these simulation models appealing, also makes their calibration problems intractable, with the number of simulations required to find an optimal solution growing exponentially as the input dimension increases [90, 70]. As a result, the use of these simulators is currently limited to what-if analysis. This paper focuses on calibrating the static choice model parameters used in activity-based simulators. The goal of calibration is to find values of the simulator's input parameters θ that minimizes the deviance between observed data and simulator's outputs.