AITopics

2209.03009

Country:

Asia > Singapore (0.05)
Asia > India (0.05)
North America > Canada (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Sep-7-2022

A Survey of Neural Trees

Li, Haoling, Song, Jie, Xue, Mengqi, Zhang, Haofei, Ye, Jingwen, Cheng, Lechao, Song, Mingli

Neural networks (NNs) and decision trees (DTs) are both popular machine learning models, yet they come with mutually exclusive advantages and limitations. To get the best of both worlds, a variety of approaches have been proposed to integrate NNs and DTs explicitly or implicitly. In this survey, these approaches are organized into a school of methods that we term neural trees (NTs). The survey aims to present a comprehensive review of NTs and to identify how they enhance model interpretability. We first propose a thorough taxonomy of NTs that expresses the gradual integration and co-evolution of NNs and DTs. Afterward, we analyze NTs in terms of their interpretability and performance, and suggest possible solutions to the remaining challenges. Finally, the survey concludes with a discussion of other considerations, such as conditional computation, and of promising directions for this field. A list of the papers reviewed in this survey, along with their corresponding code, is available at: https://github.com/zju-vipa/awesome-neural-trees

class hierarchy, ndt, neural network, (15 more...)

2209.03415

Country:

Asia > China (0.05)
Asia > Singapore > Central Region > Singapore (0.04)
North America > United States (0.04)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

#artificialintelligence, Sep-6-2022, 03:10:06 GMT

How the Adam Optimization technique works (Artificial Intelligence)

Abstract: A common way to train neural networks is backpropagation. This algorithm includes a gradient descent method, which needs an adaptive step size. In the area of neural networks, the ADAM optimizer is one of the most popular adaptive step-size methods. The 5865 citations in only three years additionally show the importance of the given paper. We discovered that the given convergence proof of the optimizer contains some mistakes, so the proof is incorrect.
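For readers unfamiliar with the adaptive step size mentioned in the abstract, the commonly stated ADAM update can be sketched as follows. This is a minimal one-dimensional illustration, not code from the paper: the function name `adam_minimize`, the hyperparameter values, and the quadratic test objective are all assumptions chosen for the example.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a function of one variable given its gradient `grad` (hypothetical helper)."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        # Exponential moving averages of the gradient and the squared gradient.
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction compensates for initializing m and v at zero.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # The effective step size adapts to the recent gradient magnitude.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Illustrative objective (an assumption): f(x) = (x - 3)^2, with gradient 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

The sketch only shows the update rule; it says nothing about the convergence proof that the abstract criticizes.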

adam optimization technique work, algorithm, neural network, (11 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.37)

Amadio, Fabio, Libera, Alberto Dalla, Antonello, Riccardo, Nikovski, Daniel, Carli, Ruggero, Romeres, Diego

Model-Based Policy Search Using Monte Carlo Gradient Estimation with Real Systems Application

In this paper, we present a Model-Based Reinforcement Learning (MBRL) algorithm named Monte Carlo Probabilistic Inference for Learning COntrol (MC-PILCO). The algorithm relies on Gaussian Processes (GPs) to model the system dynamics and on a Monte Carlo approach to estimate the policy gradient. This defines a framework in which we ablate the choice of the following components: (i) the selection of the cost function, (ii) the optimization of policies using dropout, (iii) an improved data efficiency through the use of structured kernels in the GP models. The combination of the aforementioned aspects dramatically affects the performance of MC-PILCO. Numerical comparisons in a simulated cart-pole environment show that MC-PILCO exhibits better data efficiency and control performance than state-of-the-art GP-based MBRL algorithms. Finally, we apply MC-PILCO to real systems, considering in particular systems with partially measurable states. We discuss the importance of modeling both the measurement system and the state estimators during policy optimization. The effectiveness of the proposed solutions has been tested in simulation and on two real systems, a Furuta pendulum and a ball-and-plate rig.

artificial intelligence, machine learning, mc-pilco, (18 more...)

doi: 10.1109/TRO.2022.3184837

2101.12115

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
North America > United States > California (0.04)

Genre:

Research Report > Experimental Study (0.47)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)

Gokcesu, Kaan, Gokcesu, Hakan

$1D$ to $nD$: A Meta Algorithm for Multivariate Global Optimization via Univariate Optimizers

In this work, we propose a meta algorithm that can solve a multivariate global optimization problem using univariate global optimizers. Although univariate global optimization receives much less attention than the multivariate case, which is more emphasized in academia and industry, we show that it is still relevant and can be directly used to solve problems of multivariate optimization. We also provide the corresponding regret bounds in terms of the time horizon $T$ and the average regret of the univariate optimizer, when it is robust against nonnegative noise with robust regret guarantees.

algorithm, evaluation, optimization, (13 more...)

2209.03246

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre:

Instructional Material (0.46)
Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Combined Inverse Kinematics Algorithm Using FABRIK with Optimization

Xu, Zichun, Li, Yuntao, Yang, Xiaohang, Zhao, Zhiyuan, Zhao, Jingdong, Liu, Hong

Forward and backward reaching inverse kinematics (FABRIK) is a heuristic inverse kinematics solver that is increasingly applied to manipulators because it converges quickly and generates more realistic configurations. However, under the high error constraint, FABRIK exhibits unstable convergence behavior, which is unsatisfactory for the real-time motion planning of manipulators. In this paper, a novel inverse kinematics algorithm that combines FABRIK and the sequential quadratic programming (SQP) algorithm is presented, in which the joint angles deduced by FABRIK are taken as the initial seed of the SQP algorithm to avoid getting stuck in local minima. The combined algorithm is evaluated experimentally and achieves higher success rates and faster solution times than FABRIK under the high error constraint. Furthermore, the combined algorithm can generate continuous trajectories for the UR5 and KUKA LBR IIWA 14 R820 manipulators in path tracking with no pose error and permitted position error of the end-effector.
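The forward and backward reaching passes that give FABRIK its name can be sketched for a planar chain as below. This is a minimal illustration of plain FABRIK only, assuming a 2D chain with no joint limits; it does not include the SQP refinement proposed in the paper, and the names `fabrik` and `_place` as well as the example chain and target are hypothetical.

```python
import math


def _place(anchor, toward, length):
    """Return the point at distance `length` from `anchor`, in the direction of `toward`."""
    d = math.dist(anchor, toward)
    if d == 0:
        # Degenerate case: the direction is undefined, so pick one arbitrarily.
        return (anchor[0] + length, anchor[1])
    r = length / d
    return (anchor[0] + r * (toward[0] - anchor[0]),
            anchor[1] + r * (toward[1] - anchor[1]))


def fabrik(joints, target, tol=1e-4, max_iter=100):
    """Move the last joint of a 2D chain toward `target` while preserving link lengths."""
    joints = [tuple(p) for p in joints]
    lengths = [math.dist(joints[i], joints[i + 1]) for i in range(len(joints) - 1)]
    base = joints[0]
    for _ in range(max_iter):
        if math.dist(joints[-1], target) <= tol:
            break
        # Backward reaching: pin the end effector to the target, then walk to the base.
        joints[-1] = tuple(target)
        for i in range(len(joints) - 2, -1, -1):
            joints[i] = _place(joints[i + 1], joints[i], lengths[i])
        # Forward reaching: pin the base back in place, then walk to the end effector.
        joints[0] = base
        for i in range(len(joints) - 1):
            joints[i + 1] = _place(joints[i], joints[i + 1], lengths[i])
    return joints


# Illustrative three-link chain with unit links and a reachable target (assumptions).
solved = fabrik([(0, 0), (1, 0), (2, 0), (3, 0)], target=(1.5, 1.5))
```

If the target is out of reach, this sketch simply stops after `max_iter` iterations with the chain stretched toward the target.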

algorithm, fabrik, manipulator, (15 more...)

2209.02532

Country:

Asia > China > Heilongjiang Province > Harbin (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.68)

Achieving Model Fairness in Vertical Federated Learning

Liu, Changxin, Fan, Zhenan, Zhou, Zirui, Shi, Yang, Pei, Jian, Chu, Lingyang, Zhang, Yong

Vertical federated learning (VFL) has attracted growing interest since it enables multiple parties possessing non-overlapping features to strengthen their machine learning models without disclosing their private data and model parameters. Similar to other machine learning algorithms, VFL faces demands and challenges of fairness, i.e., the learned model may be unfairly discriminatory over some groups with sensitive attributes. To tackle this problem, we propose a fair VFL framework in this work. First, we systematically formulate the problem of training fair models in VFL, where the learning task is modelled as a constrained optimization problem. To solve it in a federated and privacy-preserving manner, we consider the equivalent dual form of the problem and develop an asynchronous gradient coordinate-descent ascent algorithm, where some active data parties perform multiple parallelized local updates per communication round to effectively reduce the number of communication rounds. The messages that the server sends to passive parties are deliberately designed such that the information necessary for local updates is released without intruding on the privacy of data and sensitive attributes. We rigorously study the convergence of the algorithm when applied to general nonconvex-concave min-max problems. We prove that the algorithm finds a $\delta$-stationary point of the dual objective in $\mathcal{O}(\delta^{-4})$ communication rounds under mild conditions. Finally, extensive experiments on three benchmark datasets demonstrate the superior performance of our method in training fair models.

algorithm, fairness, fairvfl, (14 more...)

2109.08344

Country:

North America > United States (0.04)
North America > Canada > Ontario > Hamilton (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

#artificialintelligence, Sep-5-2022, 02:55:18 GMT

How Hill Climbing Algorithm Works (Artificial Intelligence)

Abstract: Neural networks have long been used for solving complex problems in the image domain, yet designing them requires manual expertise. Furthermore, techniques for automatically generating a suitable deep learning architecture for a given dataset have frequently relied on reinforcement learning and evolutionary methods, which take extensive computational resources and time. We propose a new framework for neural architecture search based on a hill-climbing procedure using morphism operators that makes use of a novel gradient update scheme. The update is based on the aging of neural network layers and reduces the overall training time. This technique can search in a broader search space, which subsequently yields competitive results.
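The generic hill-climbing loop that the abstract builds on can be sketched as follows. This is only a steepest-ascent illustration on a toy integer problem; it does not implement the morphism operators or the layer-aging gradient update described in the abstract, and the names `hill_climb`, `score`, and `neighbors` are hypothetical.

```python
def hill_climb(score, start, neighbors, max_steps=1000):
    """Steepest-ascent hill climbing: move to the best neighbor until none improves."""
    current = start
    for _ in range(max_steps):
        best = max(neighbors(current), key=score)
        if score(best) <= score(current):
            # No neighbor is strictly better: a local optimum has been reached.
            break
        current = best
    return current


# Toy problem (an assumption): maximize -(x - 7)^2 over the integers,
# where the neighbors of x are x - 1 and x + 1.
result = hill_climb(lambda x: -(x - 7) ** 2, 0, lambda x: [x - 1, x + 1])
```

In an architecture-search setting, `neighbors` would correspond to candidate networks derived from the current one and `score` to a validation metric, both of which are far more expensive to evaluate than in this toy case.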

algorithm, artificial intelligence, hill climbing algorithm work, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.76)

Bai, Fang, Bartoli, Adrien

The Proxy Step-size Technique for Regularized Optimization on the Sphere Manifold

arXiv.org Artificial Intelligence, Sep-5-2022

We give an effective solution to the regularized optimization problem $g (\boldsymbol{x}) + h (\boldsymbol{x})$, where $\boldsymbol{x}$ is constrained on the unit sphere $\Vert \boldsymbol{x} \Vert_2 = 1$. Here $g (\cdot)$ is a smooth cost with Lipschitz continuous gradient within the unit ball $\{\boldsymbol{x} : \Vert \boldsymbol{x} \Vert_2 \le 1 \}$ whereas $h (\cdot)$ is typically non-smooth but convex and absolutely homogeneous, e.g., norm regularizers and their combinations. Our solution is based on the Riemannian proximal gradient, using an idea we call proxy step-size, a scalar variable which we prove is monotone with respect to the actual step-size within an interval. The proxy step-size exists ubiquitously for convex and absolutely homogeneous $h(\cdot)$, and decides the actual step-size and the tangent update in closed-form, thus the complete proximal gradient iteration. Based on these insights, we design a Riemannian proximal gradient method using the proxy step-size. We prove that our method converges to a critical point, guided by a line-search technique based on the $g(\cdot)$ cost only. The proposed method can be implemented in a couple of lines of code. We show its usefulness by applying nuclear norm, $\ell_1$ norm, and nuclear-spectral norm regularization to three classical computer vision problems. The improvements are consistent and backed by numerical experiments.

artificial intelligence, iteration, machine learning, (18 more...)

doi: 10.1109/TPAMI.2022.3215914

2209.01812

Country:

Europe > France > Auvergne-Rhône-Alpes > Puy-de-Dôme > Clermont-Ferrand (0.04)
Asia > China (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.81)

Industry: Health & Medicine > Therapeutic Area (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Schultz, Laura, Auld, Joshua, Sokolov, Vadim

Bayesian Calibration for Activity Based Models

arXiv.org Machine Learning, Sep-5-2022

Transportation activity-based simulators (ABMs) represent an individual traveler's activity patterns and trips throughout the day by using nested choice models. The generated trips are then simulated in a traffic flow simulator to learn system-level patterns. These behaviorally realistic models require a high-resolution representation of network flows and, thus, are computationally expensive. The very same flexibility that makes these simulation models appealing also makes their calibration problems intractable, with the number of simulations required to find an optimal solution growing exponentially as the input dimension increases [90, 70]. As a result, the use of these simulators is currently limited to what-if analysis. This paper focuses on calibrating the static choice model parameters used in activity-based simulators. The goal of calibration is to find values of the simulator's input parameters θ that minimize the deviance between the observed data and the simulator's outputs.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

2203.04414

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Illinois (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)

Genre: Research Report (0.82)

Industry:

Transportation > Infrastructure & Services (0.93)
Transportation > Ground > Road (0.93)
Consumer Products & Services > Travel (0.88)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)