Goto

Collaborating Authors

 efficient optimization


Efficient Optimization of a Permanent Magnet Array for a Stable 2D Trap

Müller, Ann-Sophia, Jeong, Moonkwang, Tian, Jiyuan, Zhang, Meng, Qiu, Tian

arXiv.org Artificial Intelligence

Untethered magnetic manipulation of biomedical millirobots has a high potential for minimally invasive surgical applications. However, it is still challenging to exert high actuation forces on the small robots over a large distance. Permanent magnets offer stronger magnetic torques and forces than electromagnetic coils, however, feedback control is more difficult. As proven by Earnshaw's theorem, it is not possible to achieve a stable magnetic trap in 3D by static permanent magnets. Here, we report a stable 2D magnetic force trap by an array of permanent magnets to control a millirobot. The trap is located in an open space with a tunable distance to the magnet array in the range of 20 - 120mm, which is relevant to human anatomical scales. The design is achieved by a novel GPU-accelerated optimization algorithm that uses mean squared error (MSE) and Adam optimizer to efficiently compute the optimal angles for any number of magnets in the array. The algorithm is verified using numerical simulation and physical experiments with an array of two magnets. A millirobot is successfully trapped and controlled to follow a complex trajectory. The algorithm demonstrates high scalability by optimizing the angles for 100 magnets in under three seconds. Moreover, the optimization workflow can be adapted to optimize a permanent magnet array to achieve the desired force vector fields.


Efficient Optimization for Linear Dynamical Systems with Applications to Clustering and Sparse Coding

Neural Information Processing Systems

Linear Dynamical Systems (LDSs) are fundamental tools for modeling spatio-temporal data in various disciplines. Though rich in modeling, analyzing LDSs is not free of difficulty, mainly because LDSs do not comply with Euclidean geometry and hence conventional learning techniques can not be applied directly. In this paper, we propose an efficient projected gradient descent method to minimize a general form of a loss function and demonstrate how clustering and sparse coding with LDSs can be solved by the proposed method efficiently. To this end, we first derive a novel canonical form for representing the parameters of an LDS, and then show how gradient-descent updates through the projection on the space of LDSs can be achieved dexterously. In contrast to previous studies, our solution avoids any approximation in LDS modeling or during the optimization process. Extensive experiments reveal the superior performance of the proposed method in terms of the convergence and classification accuracy over state-of-the-art techniques.



Efficient Optimization for Sparse Gaussian Process Regression

Neural Information Processing Systems

We propose an efficient discrete optimization algorithm for selecting a subset of training data to induce sparsity for Gaussian process regression. The algorithm estimates this inducing set and the hyperparameters using a single objective, either the marginal likelihood or a variational free energy. The space and time complexity are linear in the training set size, and the algorithm can be applied to large regression problems on discrete or continuous domains. Empirical evaluation shows state-of-art performance in the discrete case and competitive results in the continuous case.


Efficient Optimization for Average Precision SVM

Neural Information Processing Systems

The accuracy of information retrieval systems is often measured using average precision (AP). Given a set of positive (relevant) and negative (non-relevant) samples, the parameters of a retrieval system can be estimated using the AP-SVM framework, which minimizes a regularized convex upper bound on the empirical AP loss. However, the high computational complexity of loss-augmented inference, which is required for learning an AP-SVM, prohibits its use with large training datasets. To alleviate this deficiency, we propose three complementary approaches.


Efficient Optimization with Orthogonality Constraint: a Randomized Riemannian Submanifold Method

Han, Andi, Poirion, Pierre-Louis, Takeda, Akiko

arXiv.org Machine Learning

Optimization with orthogonality constraints frequently arises in various fields such as machine learning. Riemannian optimization offers a powerful framework for solving these problems by equipping the constraint set with a Riemannian manifold structure and performing optimization intrinsically on the manifold. This approach typically involves computing a search direction in the tangent space and updating variables via a retraction operation. However, as the size of the variables increases, the computational cost of the retraction can become prohibitively high, limiting the applicability of Riemannian optimization to large-scale problems. To address this challenge and enhance scalability, we propose a novel approach that restricts each update on a random submanifold, thereby significantly reducing the per-iteration complexity. We introduce two sampling strategies for selecting the random submanifolds and theoretically analyze the convergence of the proposed methods. We provide convergence results for general nonconvex functions and functions that satisfy Riemannian Polyak-Lojasiewicz condition as well as for stochastic optimization settings. Additionally, we demonstrate how our approach can be generalized to quotient manifolds derived from the orthogonal manifold. Extensive experiments verify the benefits of the proposed method, across a wide variety of problems.



GNN-DT: Graph Neural Network Enhanced Decision Transformer for Efficient Optimization in Dynamic Environments

Orfanoudakis, Stavros, Panda, Nanda Kishor, Palensky, Peter, Vergara, Pedro P.

arXiv.org Artificial Intelligence

Reinforcement Learning (RL) methods used for solving real-world optimization problems often involve dynamic state-action spaces, larger scale, and sparse rewards, leading to significant challenges in convergence, scalability, and efficient exploration of the solution space. This study introduces GNN-DT, a novel Decision Transformer (DT) architecture that integrates Graph Neural Network (GNN) embedders with a novel residual connection between input and output tokens crucial for handling dynamic environments. By learning from previously collected trajectories, GNN-DT reduces dependence on accurate simulators and tackles the sparse rewards limitations of online RL algorithms. We evaluate GNN-DT on the complex electric vehicle (EV) charging optimization problem and prove that its performance is superior and requires significantly fewer training trajectories, thus improving sample efficiency compared to existing DT baselines. Furthermore, GNN-DT exhibits robust generalization to unseen environments and larger action spaces, addressing a critical gap in prior DT-based approaches


Reviews: Globally Optimal Learning for Structured Elliptical Losses

Neural Information Processing Systems

Lemma 1) is incrementally built on previous work. Based on related work, it's hard to contextualize their work in the existing literature. The readers may have the following important questions. A clear explanation is needed for this. This work is significant in that they provide optimality proof that leads to more efficient optimization method for a wide range of robust elliptical losses including Gaussian, Generalized Gaussian, Huber, etc.


Reviews: Efficient Optimization for Linear Dynamical Systems with Applications to Clustering and Sparse Coding

Neural Information Processing Systems

Summary: the paper proposes a clustering method for linear dynamical systems, based on minimizing the kernalized norm between extended observability spaces. Since the objective function contains terms involving discrete Lypanunov equations (DLEs), the paper derives how to pass the derivatives through, yielding a gradient descent algorithm. Experiments are presented on 2 datasets, for LDS clustering and sparse codebook learning. Originality: 1) novelty: the paper is moderately novel, but fairly straightforward. The difference with other related works [3,9,10,11,12] in terms of objective functions, assumptions/constraints, canonical forms should be discussed more.