
Collaborating Authors

 Rajawat, Ketan


UNet: A Generic and Reliable Multi-UAV Communication and Networking Architecture for Heterogeneous Applications

arXiv.org Artificial Intelligence

The rapid growth of UAV applications necessitates a robust communication and networking architecture capable of addressing the diverse requirements of various applications concurrently, rather than relying on application-specific solutions. This paper proposes a generic and reliable multi-UAV communication and networking architecture designed to support the varying demands of heterogeneous applications, including short-range and long-range communication, star and mesh topologies, different data rates, and multiple wireless standards. Our architecture accommodates both ad hoc and infrastructure networks, ensuring seamless connectivity throughout the network. Additionally, we present the design of a multi-protocol UAV gateway that enables interoperability among various communication protocols. Furthermore, we introduce a data processing and service layer framework, together with a graphical user interface for the ground control station, that facilitates remote control and monitoring from any location at any time. We implemented the proposed architecture in practice and evaluated its performance using several metrics, demonstrating its effectiveness.
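
To make the interoperability role of the gateway concrete, the following is a purely hypothetical Python sketch of a multi-protocol gateway that normalizes frames arriving over different link protocols into one common message format; the class, field, and protocol names are illustrative and are not taken from the paper.

```python
# Hypothetical sketch of the gateway idea: frames from heterogeneous link
# protocols are translated into one common message format so that UAVs using
# different standards can interoperate. Names are illustrative only.
from dataclasses import dataclass

@dataclass
class Message:
    uav_id: str
    payload: bytes
    protocol: str            # e.g. "wifi-mesh", "lora", "cellular" (assumed labels)

class Gateway:
    def __init__(self):
        self.parsers = {}     # protocol name -> frame parser callable

    def register(self, protocol, parser):
        """Attach a parser that maps a raw frame to (uav_id, payload)."""
        self.parsers[protocol] = parser

    def ingest(self, protocol, frame):
        """Normalize an incoming frame into the common Message format."""
        uav_id, payload = self.parsers[protocol](frame)
        return Message(uav_id=uav_id, payload=payload, protocol=protocol)
```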


Communication and Energy-Aware Multi-UAV Coverage Path Planning for Networked Operations

arXiv.org Artificial Intelligence

This paper presents a communication and energy-aware Multi-UAV Coverage Path Planning (mCPP) method for scenarios requiring continuous inter-UAV communication, such as cooperative search and rescue and surveillance missions. Unlike existing mCPP solutions that focus on energy, time, or coverage efficiency, our approach generates coverage paths that require a minimal communication range to maintain inter-UAV connectivity while also optimizing energy consumption. The mCPP problem is formulated as a multi-objective optimization task, aiming to minimize both the communication range requirement and the energy consumption. Our approach significantly reduces the communication range needed to maintain connectivity while ensuring energy efficiency, outperforming state-of-the-art methods. Its effectiveness is validated through simulations on complex and arbitrarily shaped regions of interest, including scenarios with no-fly zones. Additionally, a real-world experiment demonstrates its high accuracy, achieving 99% consistency between the estimated and actual communication range required during a multi-UAV coverage mission involving three UAVs.
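
As a rough illustration of the two objectives (not the authors' formulation), the sketch below evaluates a candidate set of time-synchronized UAV paths: the communication range requirement is taken as the largest inter-UAV distance over the mission, energy is proxied by total travelled distance, and a weighted sum is one simple scalarization of the multi-objective problem.

```python
# Hypothetical evaluation of the two objectives for candidate multi-UAV
# coverage paths. Paths are assumed time-synchronized, i.e. paths[k][t] is
# the (x, y) position of UAV k at step t.
import math

def pairwise_max_distance(positions):
    """Largest inter-UAV distance at a single time step."""
    return max(
        math.dist(p, q)
        for i, p in enumerate(positions)
        for q in positions[i + 1:]
    )

def comm_range_requirement(paths):
    """Smallest radio range keeping every pair of UAVs in direct contact at
    every step (multi-hop relaying would relax this requirement)."""
    horizon = min(len(p) for p in paths)
    return max(pairwise_max_distance([p[t] for p in paths]) for t in range(horizon))

def energy_proxy(paths):
    """Total travelled distance as a crude stand-in for energy consumption."""
    return sum(math.dist(p[t], p[t + 1]) for p in paths for t in range(len(p) - 1))

def scalarized_cost(paths, weight=0.5):
    """Weighted-sum scalarization of the two objectives (one of many options)."""
    return weight * comm_range_requirement(paths) + (1 - weight) * energy_proxy(paths)
```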


Low-complexity subspace-descent over symmetric positive definite manifold

arXiv.org Machine Learning

This work puts forth low-complexity Riemannian subspace descent algorithms for the minimization of functions over the symmetric positive definite (SPD) manifold. Different from existing Riemannian gradient descent variants, the proposed approach utilizes carefully chosen subspaces that allow the update to be written as a product of the Cholesky factor of the iterate and a sparse matrix. The resulting updates avoid costly matrix operations such as matrix exponentiation and dense matrix multiplication, which are required by almost all other Riemannian optimization algorithms on the SPD manifold. We further identify a broad class of functions, arising in diverse applications such as kernel matrix learning, covariance estimation of Gaussian distributions, maximum likelihood parameter estimation of elliptically contoured distributions, and parameter estimation in Gaussian mixture models, over which the Riemannian gradients can be calculated efficiently. The proposed uni-directional and multi-directional Riemannian subspace descent variants incur per-iteration complexities of $O(n)$ and $O(n^2)$ respectively, as compared to the $O(n^3)$ or higher complexity incurred by all existing Riemannian gradient descent variants. The superior runtime and low per-iteration complexity of the proposed algorithms are also demonstrated via numerical tests on large-scale covariance estimation and matrix square root problems. A MATLAB implementation is publicly available on GitHub: https://github.com/yogeshd-iitk/subspace_descent_over_SPD_manifold
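
The toy sketch below (in Python rather than the authors' MATLAB, and with an arbitrary sparse direction rather than the paper's subspace selection rule) only illustrates why updates of the form "Cholesky factor times a sparse matrix" are attractive: a single column of the factor changes, the cost is $O(n)$, and the iterate remains SPD by construction.

```python
# Hypothetical sketch: an SPD iterate kept as X = L @ L.T through its factor L.
# Right-multiplying L by a sparse elementary matrix M = I + alpha * e_i e_j^T
# touches only one column of L, so the step costs O(n) and X stays SPD.
# The actual subspace selection and step-size rules are in the paper/code.
import numpy as np

def elementary_update(L, i, j, alpha):
    """Return the factor of X_new = L M M^T L^T with M = I + alpha * e_i e_j^T."""
    L_new = L.copy()
    L_new[:, j] += alpha * L[:, i]   # only column j changes: O(n) work
    return L_new

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
L = np.linalg.cholesky(A @ A.T + 5 * np.eye(5))   # factor of an SPD matrix
L = elementary_update(L, i=1, j=3, alpha=-0.2)
X = L @ L.T                                       # still symmetric positive definite
print(np.linalg.eigvalsh(X).min() > 0)            # True
```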


Sharpened Lazy Incremental Quasi-Newton Method

arXiv.org Artificial Intelligence

We consider the finite sum minimization of $n$ strongly convex and smooth functions with Lipschitz continuous Hessians in $d$ dimensions. In many applications where such problems arise, including maximum likelihood estimation, empirical risk minimization, and unsupervised learning, the number of observations $n$ is large, and it becomes necessary to use incremental or stochastic algorithms whose per-iteration complexity is independent of $n$. Of these, the incremental/stochastic variants of the Newton method exhibit superlinear convergence, but incur a per-iteration complexity of $O(d^3)$, which may be prohibitive in large-scale settings. On the other hand, the incremental Quasi-Newton method incurs a per-iteration complexity of $O(d^2)$, but its superlinear convergence rate has only been characterized asymptotically. This work puts forth the Sharpened Lazy Incremental Quasi-Newton (SLIQN) method that achieves the best of both worlds: an explicit superlinear convergence rate with a per-iteration complexity of $O(d^2)$. Building upon the recently proposed Sharpened Quasi-Newton method, the incremental variant employs a hybrid strategy that combines classic and greedy BFGS updates. The proposed lazy update rule distributes the computational effort across iterations so as to keep the per-iteration complexity at $O(d^2)$. Numerical tests demonstrate the superiority of SLIQN over all other incremental and stochastic Quasi-Newton variants.
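
For reference, the sketch below shows the two building blocks mentioned in the abstract: a classic (secant-pair) BFGS update and one common form of the greedy BFGS update along a coordinate direction. SLIQN's hybrid scheduling and lazy distribution of these updates are not reproduced here.

```python
# Hedged sketch of the BFGS building blocks (not SLIQN itself).
import numpy as np

def bfgs_update(B, u, Au):
    """BFGS update of the Hessian approximation B along direction u,
    where Au is the true Hessian (or secant) action on u."""
    Bu = B @ u
    return B - np.outer(Bu, Bu) / (u @ Bu) + np.outer(Au, Au) / (u @ Au)

def classic_bfgs(B, s, y):
    """Classic BFGS: u = iterate difference s, Au = gradient difference y."""
    return bfgs_update(B, s, y)

def greedy_bfgs(B, A):
    """Greedy BFGS (one common form): pick the coordinate direction that
    maximizes e_i^T B e_i / e_i^T A e_i, then update along it."""
    i = int(np.argmax(np.diag(B) / np.diag(A)))
    e = np.zeros(B.shape[0])
    e[i] = 1.0
    return bfgs_update(B, e, A[:, i])
```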


FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning

arXiv.org Machine Learning

While first-order methods for FL problems are extensively studied in the literature, a handful of second-order methods have recently been proposed for FL problems. The main advantage of second-order methods is their faster convergence rate, although they suffer from high communication costs. Beyond that, sharing first- and second-order information (which encodes the clients' data) may create a privacy issue. This is because the Hessian matrix contains valuable information about the properties of the local function/data, and when it is shared with the parameter server (PS), privacy may be violated by eavesdroppers or an honest-but-curious PS. For instance, the authors in [Yin et al., 2014] show that the eigenvalues of the Hessian matrix can be used to extract important information about the input images. In this work, we are interested in developing communication-efficient second-order methods while still preserving the privacy of the individual clients. To this end, we propose a novel framework that hides the gradients as well as the Hessians of the local functions, yet uses second-order information to solve the FL problem. In particular, we divide the standard Newton step into outer and inner levels. The objective of the inner level is to learn what we call the Hessian inverse-gradient product, or the Zeroth Hessian inverse-gradient product for the computation-efficient version.
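
As a heavily simplified sketch of the two-level idea (not FedNew's actual ADMM-based inner solver or its privacy mechanism), each client could approximately solve $H d = g$ locally and share only the product $d$, which the server then aggregates into a Newton-type step:

```python
# Toy sketch of the high-level idea: each client shares only an (approximate)
# Hessian inverse-gradient product d ~ H^{-1} g rather than H and g themselves.
# FedNew's actual inner solver and privacy mechanism differ from this sketch.
import numpy as np

def inner_level(H, g, steps=50, lr=0.1):
    """Approximately solve H d = g by gradient descent on 0.5 d'Hd - g'd."""
    d = np.zeros_like(g)
    for _ in range(steps):
        d -= lr * (H @ d - g)     # gradient of the quadratic surrogate
    return d

def outer_level(x, client_products, step=1.0):
    """Server aggregates the clients' products and takes a Newton-type step."""
    return x - step * np.mean(client_products, axis=0)
```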


Projection-Free Algorithm for Stochastic Bi-level Optimization

arXiv.org Artificial Intelligence

This work presents the first projection-free algorithm to solve stochastic bi-level optimization problems, where the objective function depends on the solution of another stochastic optimization problem. The proposed $\textbf{S}$tochastic $\textbf{Bi}$-level $\textbf{F}$rank-$\textbf{W}$olfe ($\textbf{SBFW}$) algorithm can be applied to streaming settings and does not make use of large batches or checkpoints. The sample complexity of SBFW is shown to be $\mathcal{O}(\epsilon^{-3})$ for convex objectives and $\mathcal{O}(\epsilon^{-4})$ for non-convex objectives. Improved rates are derived for the stochastic compositional problem, a special case of the bi-level problem that entails minimizing the composition of two expected-value functions. The proposed $\textbf{S}$tochastic $\textbf{C}$ompositional $\textbf{F}$rank-$\textbf{W}$olfe ($\textbf{SCFW}$) algorithm is shown to achieve a sample complexity of $\mathcal{O}(\epsilon^{-2})$ for convex objectives and $\mathcal{O}(\epsilon^{-3})$ for non-convex objectives, on par with the state-of-the-art sample complexities for projection-free algorithms solving single-level problems. We demonstrate the advantage of the proposed methods by solving the problem of matrix completion with denoising and the problem of policy value evaluation in reinforcement learning.
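
To illustrate what "projection-free" means here, the sketch below shows a single Frank-Wolfe step over a nuclear-norm ball (the relevant set for matrix completion), where the projection is replaced by a linear minimization oracle built from a top singular pair of the gradient. The bi-level and compositional gradient tracking that distinguish SBFW and SCFW are omitted, and in practice the full SVD would be replaced by a truncated or power-iteration variant.

```python
# Minimal sketch of a projection-free (Frank-Wolfe) step: a linear
# minimization oracle (LMO) over the constraint set replaces the projection.
import numpy as np

def lmo_nuclear_ball(G, radius):
    """LMO for the nuclear-norm ball: a scaled rank-one matrix from the top
    singular pair of the negated gradient G."""
    U, _, Vt = np.linalg.svd(-G)
    return radius * np.outer(U[:, 0], Vt[0, :])

def fw_step(X, grad_estimate, radius, eta):
    """One Frank-Wolfe step: move toward the LMO vertex with step size eta."""
    S = lmo_nuclear_ball(grad_estimate, radius)
    return X + eta * (S - X)
```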


Consistent Online Gaussian Process Regression Without the Sample Complexity Bottleneck

arXiv.org Machine Learning

Gaussian processes provide a framework for nonlinear, nonparametric Bayesian inference that is widely applicable across science and engineering. Unfortunately, their computational burden scales cubically with the training sample size, which, in the case that samples arrive in perpetuity, approaches infinity. This issue necessitates approximations for use with streaming data, which to date mostly lack convergence guarantees. We therefore develop the first online Gaussian process approximation that preserves convergence to the population posterior, i.e., asymptotic posterior consistency, while ameliorating its intractable complexity growth with the sample size. We propose an online compression scheme that, following each a posteriori update, fixes an error neighborhood with respect to the Hellinger metric centered at the current posterior, and greedily tosses out past kernel dictionary elements until its boundary is hit. We call the resulting method Parsimonious Online Gaussian Processes (POG). For a diminishing error radius, exact asymptotic consistency is preserved (Theorem 1(i)), at the cost of unbounded memory in the limit. On the other hand, for a constant error radius, POG converges to a neighborhood of the population posterior (Theorem 1(ii)), but with finite memory, at worst determined by the metric entropy of the feature space (Theorem 2). Experimental results are presented on several nonlinear regression problems, which illuminate the merits of this approach as compared with alternatives that fix the subspace dimension defining the history of past points.
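
Structurally, the compression step can be pictured as the greedy loop below: after each posterior update, dictionary points are discarded one at a time as long as the compressed posterior stays within a budget eps of the current one. The distance function (Hellinger in the paper) is left abstract in this sketch.

```python
# Structural sketch of a greedy dictionary compression loop (not the exact
# Hellinger-based criterion or stopping rule used by POG).
def compress(dictionary, distance_to, eps):
    """dictionary: list of kept samples; distance_to(subset) measures how far
    the posterior built on `subset` is from the current posterior."""
    kept = list(dictionary)
    while len(kept) > 1:
        # candidate removal that perturbs the posterior the least
        idx, dist = min(
            ((i, distance_to(kept[:i] + kept[i + 1:])) for i in range(len(kept))),
            key=lambda t: t[1],
        )
        if dist > eps:          # any removal would leave the eps-neighborhood
            break
        kept.pop(idx)           # safe to forget this dictionary element
    return kept
```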


Zeroth and First Order Stochastic Frank-Wolfe Algorithms for Constrained Optimization

arXiv.org Machine Learning

This paper considers stochastic convex optimization problems with two sets of constraints: (a) deterministic constraints on the domain of the optimization variable, which are difficult to project onto; and (b) deterministic or stochastic constraints that admit efficient projection. Problems of this form arise frequently in the context of semidefinite programming, as well as when various NP-hard problems are solved approximately via semidefinite relaxation. Since projection onto the first set of constraints is difficult, it becomes necessary to explore projection-free algorithms, such as the stochastic Frank-Wolfe (FW) algorithm. The second set of constraints, on the other hand, cannot be handled in the same way and must be incorporated as an indicator function within the objective, thereby complicating the application of FW methods. Similar problems have been studied before and solved using first-order stochastic FW algorithms by applying homotopy and Nesterov's smoothing techniques to the indicator function. This work improves upon these existing results and puts forth momentum-based first-order methods that yield improved convergence rates, on par with the best known rates for problems without the second set of constraints. Zeroth-order variants of the proposed algorithms are also developed and again improve upon the state-of-the-art rates. The efficacy of the proposed algorithms is tested on relevant applications: sparse matrix estimation, clustering via semidefinite relaxation, and the uniform sparsest cut problem.
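
Two ingredients referred to in the abstract are sketched below in a generic form: a recursive momentum average of stochastic gradients (which tames gradient noise without large batches) and a two-point random-direction zeroth-order gradient estimate. The homotopy/smoothing treatment of the indicator constraint is not shown.

```python
# Generic momentum and zeroth-order gradient estimators, for illustration only.
import numpy as np

def momentum_gradient(d_prev, stoch_grad, rho):
    """Recursive momentum averaging: d_t = (1 - rho) * d_{t-1} + rho * g_t."""
    return (1.0 - rho) * d_prev + rho * stoch_grad

def zeroth_order_gradient(f, x, mu=1e-4, num_dirs=10, rng=None):
    """Two-point random-direction estimate of grad f(x) using only f-values."""
    rng = rng or np.random.default_rng()
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / num_dirs
```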


STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning

arXiv.org Machine Learning

Federated Learning (FL) refers to the paradigm where multiple worker nodes (WNs) build a joint model using local data. Despite extensive research, for a generic non-convex FL problem it is not clear how to choose the WNs' and the server's update directions, the minibatch sizes, and the local update frequency so that the WNs use the minimum number of samples and communication rounds to reach the desired solution. This work addresses the above question and considers a class of stochastic algorithms where the WNs perform a few local updates before communication. We show that when both the WNs' and the server's directions are chosen based on a stochastic momentum estimator, the algorithm requires $\tilde{\mathcal{O}}(\epsilon^{-3/2})$ samples and $\tilde{\mathcal{O}}(\epsilon^{-1})$ communication rounds to compute an $\epsilon$-stationary solution. To the best of our knowledge, this is the first FL algorithm that achieves such {\it near-optimal} sample and communication complexities simultaneously. Further, we show that there is a trade-off curve between the local update frequency and the local minibatch size along which the above sample and communication complexities can be maintained. Finally, we show that for the classical FedAvg (a.k.a. Local SGD, a momentum-less special case of STEM), a similar trade-off curve exists, albeit with worse sample and communication complexities. Our insights on this trade-off provide guidelines for choosing the four important design elements of FL algorithms (the update frequency, the update directions, and the minibatch sizes) to achieve the best performance.
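
The "stochastic momentum estimator" the abstract refers to can be written recursively as below (a STORM-style update, shown here in illustrative notation rather than the paper's); STEM applies this form of estimator on both the worker and the server side.

```python
# Sketch of a recursive (STORM-style) momentum direction estimator.
def momentum_direction(grad_fn, x_new, x_old, d_old, a, sample):
    """d_new = g(x_new; xi) + (1 - a) * (d_old - g(x_old; xi)),
    with the SAME minibatch xi evaluated at both iterates."""
    return grad_fn(x_new, sample) + (1.0 - a) * (d_old - grad_fn(x_old, sample))
```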


Adaptive Kernel Learning in Heterogeneous Networks

arXiv.org Machine Learning

We consider the framework of learning over decentralized networks, where nodes observe unique, possibly correlated, observation streams. We focus on the case where agents learn a regression \emph{function} that belongs to a reproducing kernel Hilbert space (RKHS). In this setting, a decentralized network aims to learn nonlinear statistical models that are optimal in terms of a global stochastic convex functional that aggregates data across the network, while each node has access only to its local data stream. We incentivize coordination while respecting network heterogeneity through the introduction of nonlinear proximity constraints. To solve the resulting problem, we propose a functional variant of the stochastic primal-dual (Arrow-Hurwicz) method, which yields a decentralized algorithm. To handle the fact that the complexity of the RKHS parameterization grows with the iteration index, we project the primal iterates onto Hilbert subspaces that are greedily constructed from the observation sequence of each node. The resulting proximal stochastic variant of Arrow-Hurwicz, dubbed Heterogeneous Adaptive Learning with Kernels (HALK), is shown to converge in expectation, in terms of both primal sub-optimality and constraint violation, to a neighborhood whose size depends on the chosen constant step-size. Simulations on a correlated spatio-temporal random field estimation problem validate our theoretical results, which are borne out in practice for networked oceanic sensing buoys estimating temperature and salinity from depth measurements.
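
As background for the parameterization issue the abstract mentions, the sketch below shows a plain functional (RKHS) stochastic gradient step for regularized squared loss, in which the kernel dictionary grows by one point per sample; HALK's primal-dual constraint handling and greedy subspace projection (which curbs this growth) are not reproduced here.

```python
# Illustrative functional SGD in an RKHS: the regression function is a kernel
# expansion whose dictionary grows by one point per observed sample.
import numpy as np

def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def predict(dictionary, weights, x):
    """Evaluate the current kernel expansion at x."""
    return sum(w * rbf(d, x) for d, w in zip(dictionary, weights))

def sgd_step(dictionary, weights, x, y, eta=0.1, lam=1e-3):
    """Functional SGD for regularized squared loss: shrink old weights and
    append the new sample with weight proportional to the residual."""
    err = predict(dictionary, weights, x) - y
    weights = [(1.0 - eta * lam) * w for w in weights]
    dictionary.append(x)
    weights.append(-eta * err)
    return dictionary, weights
```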