Goto

Collaborating Authors

 Optimization


Evolutionary Algorithm and Multifactorial Evolutionary Algorithm on Clustered Shortest-Path Tree problem

arXiv.org Artificial Intelligence

In literature, Clustered Shortest-Path Tree Problem (CluSPT) is an NP-hard problem. Previous studies often search for an optimal solution in relatively large space. To enhance the performance of the search process, two approaches are proposed: the first approach seeks for solutions as a set of edges. From the original graph, we generate a new graph whose vertex set's cardinality is much smaller than that of the original one. Consequently, an effective Evolutionary Algorithm (EA) is proposed for solving CluSPT. The second approach looks for vertex-based solutions. The search space of the CluSPT is transformed into 2 nested search spaces (NSS). With every candidate in the high-level optimization, the search engine in the lower level will find a corresponding candidate to combine with it to create the best solution for CluSPT. Accordingly, Nested Local Search EA (N-LSEA) is introduced to search for the optimal solution on the NSS. When solving this model in lower level by N-LSEA, variety of similar tasks are handled. Thus, Multifactorial Evolutionary Algorithm applied in order to enhance the implicit genetic transfer across these optimizations. Proposed algorithms are conducted on a series of datasets and the obtained results demonstrate superior efficiency in comparison to previous scientific works.


Concentration of solutions to random equations with concentration of measure hypotheses

arXiv.org Machine Learning

We propose here to study the concentration of random objects that are implicitly formulated as fixed points to equations $Y = f(X)$ where $f$ is a random mapping. Starting from an hypothesis taken from the concentration of the measure theory, we are able to express precisely the concentration of such solutions, under some contractivity hypothesis on $f$. This statement has important implication to random matrix theory, and is at the basis of the study of some optimization procedures like the logistic regression for instance. In those last cases, we give precise estimations to the first statistics of the solution $Y$ which allows us predict the performances of the algorithm.


Learning to solve TV regularized problems with unrolled algorithms

arXiv.org Machine Learning

Total Variation (TV) is a popular regularization strategy that promotes piece-wise constant signals by constraining the $\ell_1$-norm of the first order derivative of the estimated signal. The resulting optimization problem is usually solved using iterative algorithms such as proximal gradient descent, primal-dual algorithms or ADMM. However, such methods can require a very large number of iterations to converge to a suitable solution. In this paper, we accelerate such iterative algorithms by unfolding proximal gradient descent solvers in order to learn their parameters for 1D TV regularized problems. While this could be done using the synthesis formulation, we demonstrate that this leads to slower performances. The main difficulty in applying such methods in the analysis formulation lies in proposing a way to compute the derivatives through the proximal operator. As our main contribution, we develop and characterize two approaches to do so, describe their benefits and limitations, and discuss the regime where they can actually improve over iterative procedures. We validate those findings with experiments on synthetic and real data.


Factorization Machines with Regularization for Sparse Feature Interactions

arXiv.org Machine Learning

Factorization machines (FMs) are machine learning predictive models based on second-order feature interactions and FMs with sparse regularization are called sparse FMs. Such regularizations enable feature selection, which selects the most relevant features for accurate prediction, and therefore they can contribute to the improvement of the model accuracy and interpretability. However, because FMs use second-order feature interactions, the selection of features often cause the loss of many relevant feature interactions in the resultant models. In such cases, FMs with regularization specially designed for feature interaction selection trying to achieve interaction-level sparsity may be preferred instead of those just for feature selection trying to achieve feature-level sparsity. In this paper, we present a new regularization scheme for feature interaction selection in FMs. The proposed regularizer is an upper bound of the $\ell_1$ regularizer for the feature interaction matrix, which is computed from the parameter matrix of FMs. For feature interaction selection, our proposed regularizer makes the feature interaction matrix sparse without a restriction on sparsity patterns imposed by the existing methods. We also describe efficient proximal algorithms for the proposed FMs and present theoretical analyses of both existing and the new regularize. In addition, we will discuss how our ideas can be applied or extended to more accurate feature selection and other related models such as higher-order FMs and the all-subsets model. The analysis and experimental results on synthetic and real-world datasets show the effectiveness of the proposed methods.


Robot Design With Neural Networks, MILP Solvers and Active Learning

arXiv.org Artificial Intelligence

Central to the design of many robot systems and their controllers is solving a constrained blackbox optimization problem. This paper presents CNMA, a new method of solving this problem that is conservative in the number of potentially expensive blackbox function evaluations; allows specifying complex, even recursive constraints directly rather than as hard-to-design penalty or barrier functions; and is resilient to the non-termination of function evaluations. CNMA leverages the ability of neural networks to approximate any continuous function, their transformation into equivalent mixed integer linear programs (MILPs) and their optimization subject to constraints with industrial strength MILP solvers. A new learning-from-failure step guides the learning to be relevant to solving the constrained optimization problem. Thus, the amount of learning is orders of magnitude smaller than that needed to learn functions over their entire domains. CNMA is illustrated with the design of several robotic systems: wave-energy propelled boat, lunar lander, hexapod, cartpole, acrobot and parallel parking. These range from 6 real-valued dimensions to 36. We show that CNMA surpasses the Nelder-Mead, Gaussian and Random Search optimization methods against the metric of number of function evaluations.


5G-and-Beyond Networks with UAVs: From Communications to Sensing and Intelligence

arXiv.org Artificial Intelligence

Due to the advancements in cellular technologies and the dense deployment of cellular infrastructure, integrating unmanned aerial vehicles (UAVs) into the fifth-generation (5G) and beyond cellular networks is a promising solution to achieve safe UAV operation as well as enabling diversified applications with mission-specific payload data delivery. In particular, 5G networks need to support three typical usage scenarios, namely, enhanced mobile broadband (eMBB), ultra-reliable low-latency communications (URLLC), and massive machine-type communications (mMTC). On the one hand, UAVs can be leveraged as cost-effective aerial platforms to provide ground users with enhanced communication services by exploiting their high cruising altitude and controllable maneuverability in three-dimensional (3D) space. On the other hand, providing such communication services simultaneously for both UAV and ground users poses new challenges due to the need for ubiquitous 3D signal coverage as well as the strong air-ground network interference. Besides the requirement of high-performance wireless communications, the ability to support effective and efficient sensing as well as network intelligence is also essential for 5G-and-beyond 3D heterogeneous wireless networks with coexisting aerial and ground users. In this paper, we provide a comprehensive overview of the latest research efforts on integrating UAVs into cellular networks, with an emphasis on how to exploit advanced techniques (e.g., intelligent reflecting surface, short packet transmission, energy harvesting, joint communication and radar sensing, and edge intelligence) to meet the diversified service requirements of next-generation wireless systems. Moreover, we highlight important directions for further investigation in future work.


Improved Algorithms for Convex-Concave Minimax Optimization

arXiv.org Machine Learning

This paper studies minimax optimization problems $\min_x \max_y f(x,y)$, where $f(x,y)$ is $m_x$-strongly convex with respect to $x$, $m_y$-strongly concave with respect to $y$ and $(L_x,L_{xy},L_y)$-smooth. Zhang et al. provided the following lower bound of the gradient complexity for any first-order method: $\Omega\Bigl(\sqrt{\frac{L_x}{m_x}+\frac{L_{xy}^2}{m_x m_y}+\frac{L_y}{m_y}}\ln(1/\epsilon)\Bigr).$ This paper proposes a new algorithm with gradient complexity upper bound $\tilde{O}\Bigl(\sqrt{\frac{L_x}{m_x}+\frac{L\cdot L_{xy}}{m_x m_y}+\frac{L_y}{m_y}}\ln\left(1/\epsilon\right)\Bigr),$ where $L=\max\{L_x,L_{xy},L_y\}$. This improves over the best known upper bound $\tilde{O}\left(\sqrt{\frac{L^2}{m_x m_y}} \ln^3\left(1/\epsilon\right)\right)$ by Lin et al. Our bound achieves linear convergence rate and tighter dependency on condition numbers, especially when $L_{xy}\ll L$ (i.e., when the interaction between $x$ and $y$ is weak). Via reduction, our new bound also implies improved bounds for strongly convex-concave and convex-concave minimax optimization problems. When $f$ is quadratic, we can further improve the upper bound, which matches the lower bound up to a small sub-polynomial factor.


DAGs with No Fears: A Closer Look at Continuous Optimization for Learning Bayesian Networks

arXiv.org Machine Learning

This paper re-examines a continuous optimization framework dubbed NOTEARS for learning Bayesian networks. We first generalize existing algebraic characterizations of acyclicity to a class of matrix polynomials. Next, focusing on a one-parameter-per-edge setting, it is shown that the Karush-Kuhn-Tucker (KKT) optimality conditions for the NOTEARS formulation cannot be satisfied except in a trivial case, which explains a behavior of the associated algorithm. We then derive the KKT conditions for an equivalent reformulation, show that they are indeed necessary, and relate them to explicit constraints that certain edges be absent from the graph. If the score function is convex, these KKT conditions are also sufficient for local minimality despite the non-convexity of the constraint. Informed by the KKT conditions, a local search post-processing algorithm is proposed and shown to substantially and universally improve the structural Hamming distance of all tested algorithms, typically by a factor of 2 or more. Some combinations with local search are both more accurate and more efficient than the original NOTEARS.


Low-Rank and Sparse Enhanced Tucker Decomposition for Tensor Completion

arXiv.org Machine Learning

Tensor completion refers to the task of estimating the missing data from an incomplete measurement or observation, which is a core problem frequently arising from the areas of big data analysis, computer vision, and network engineering. Due to the multidimensional nature of high-order tensors, the matrix approaches, e.g., matrix factorization and direct matricization of tensors, are often not ideal for tensor completion and recovery. Exploiting the potential periodicity and inherent correlation properties appeared in real-world tensor data, in this paper, we shall incorporate the low-rank and sparse regularization technique to enhance Tucker decomposition for tensor completion. A series of computational experiments on real-world datasets, including internet traffic data, color images, and face recognition, show that our model performs better than many existing state-of-the-art matricization and tensorization approaches in terms of achieving higher recovery accuracy. Naturally, these data would be stored as where rank(F) represents the rank of the underlying matrix higher-order tensor (a.k.a., multidimensional array), which F and M R Following the spirit of matrix completion model [1], [2], [3], [4], to name just a few.


The bi-objective multimodal car-sharing problem

arXiv.org Artificial Intelligence

The aim of the bi-objective multimodal car-sharing problem (BiO-MMCP) is to determine the optimal mode of transport assignment for trips and to schedule the routes of available cars and users whilst minimizing cost and maximizing user satisfaction. We investigate the BiO-MMCP from a user-centred point of view. As user satisfaction is a crucial aspect in shared mobility systems, we consider user preferences in a second objective. Users may choose and rank their preferred modes of transport for different times of the day. In this way we account for, e.g., different traffic conditions throughout the planning horizon. We study different variants of the problem. In the base problem, the sequence of tasks a user has to fulfill is fixed in advance and travel times as well as preferences are constant over the planning horizon. In variant 2, time-dependent travel times and preferences are introduced. In variant 3, we examine the challenges when allowing additional routing decisions. Variant 4 integrates variants 2 and 3. For this last variant, we develop a branch-and-cut algorithm which is embedded in two bi-objective frameworks, namely the $\epsilon$-constraint method and a weighting binary search method. Computational experiments show that the branch-and cut algorithm outperforms the MIP formulation and we discuss changing solutions along the Pareto frontier.