Mathematical & Statistical Methods
The Single Robot Line Coverage Problem: Theory, Algorithms, and Experiments
Agarwal, Saurav, Akella, Srinivas
Line coverage is the task of servicing a given set of one-dimensional features in an environment. It is important for the inspection of linear infrastructure such as road networks, power lines, and oil and gas pipelines. This paper addresses the single robot line coverage problem for aerial and ground robots by modeling it as an optimization problem on a graph. The problem belongs to the broad class of arc routing problems and is closely related to the rural postman problem (RPP) on asymmetric graphs. The paper presents an integer linear programming formulation with proofs of correctness. Using the minimum cost flow problem, we develop approximation algorithms with guarantees on the solution quality. These guarantees also improve the existing results for the asymmetric RPP. The main algorithm partitions the problem into three cases based on the structure of the required graph, i.e., the graph induced by the features that require servicing. We evaluate our algorithms on road networks from the 50 most populous cities in the world, consisting of up to 730 road segments. The algorithms, augmented with improvement heuristics, run within 3s and generate solutions that are within 10% of the optimum. We experimentally demonstrate our algorithms with commercial UAVs.
Enjoy the Silence: Analysis of Stochastic Petri Nets with Silent Transitions
Leemans, Sander J. J., Maggi, Fabrizio M., Montali, Marco
Capturing stochastic behaviors in business and work processes is essential to quantitatively understand how nondeterminism is resolved when taking decisions within the process. This is of special interest in process mining, where event data tracking the actual execution of the process are related to process models, and can then provide insights on frequencies and probabilities. Variants of stochastic Petri nets provide a natural formal basis for this. However, when capturing processes, such nets need to be labelled with (possibly duplicated) activities, and equipped with silent transitions that model internal, non-logged steps related to the orchestration of the process. At the same time, they have to be analyzed in a finite-trace semantics, matching the fact that each process execution consists of finitely many steps. These two aspects impede the direct application of existing techniques for stochastic Petri nets, calling for a novel characterization that incorporates labels and silent transitions in a finite-trace semantics. In this article, we provide such a characterization starting from generalized stochastic Petri nets and obtaining the framework of labelled stochastic processes (LSPs). On top of this framework, we introduce different key analysis tasks on the traces of LSPs and their probabilities. We show that all such analysis tasks can be solved analytically, in particular reducing them to a single method that combines automata-based techniques to single out the behaviors of interest within a LSP, with techniques based on absorbing Markov chains to reason on their probabilities. Finally, we demonstrate the significance of how our approach in the context of stochastic conformance checking, illustrating practical feasibility through a proof-of-concept implementation and its application to different datasets.
Deterministic Random Walk Model in NetLogo and the Identification of Asymmetric Saturation Time in Random Graph
Chatterjee, Ayan, Cao, Qingtao, Sajadi, Amirhossein, Ravandi, Babak
Interactive programming environments are powerful tools for promoting innovative network thinking, teaching science of complexity, and exploring emergent phenomena. This paper reports on our recent development of the deterministic random walk model in NetLogo, a leading platform for computational thinking, eco-system thinking, and multi-agent cross-platform programming environment. The deterministic random walk is foundational to modeling dynamical processes on complex networks. Inspired by the temporal visualizations offered in NetLogo, we investigated the relationship between network topology and diffusion saturation time for the deterministic random walk model. Our analysis uncovers that in Erd\H{o}s-R\'{e}nyi graphs, the saturation time exhibits an asymmetric pattern with a considerable probability of occurrence. This behavior occurs when the hubs, defined as nodes with relatively higher number of connections, emerge in Erd\H{o}s-R\'{e}nyi graphs. Yet, our analysis yields that the hubs in Barab\'{a}si-Albert model stabilize the the convergence time of the deterministic random walk model. These findings strongly suggest that depending on the dynamical process running on complex networks, complementing characteristics other than the degree need to be taken into account for considering a node as a hub. We have made our development open-source, available to the public at no cost at https://github.com/bravandi/NetLogo-Dynamical-Processes.
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
Cha, Jaeyoung, Lee, Jaewook, Yun, Chulhee
We study convergence lower bounds of without-replacement stochastic gradient descent (SGD) for solving smooth (strongly-)convex finite-sum minimization problems. Unlike most existing results focusing on final iterate lower bounds in terms of the number of components $n$ and the number of epochs $K$, we seek bounds for arbitrary weighted average iterates that are tight in all factors including the condition number $\kappa$. For SGD with Random Reshuffling, we present lower bounds that have tighter $\kappa$ dependencies than existing bounds. Our results are the first to perfectly close the gap between lower and upper bounds for weighted average iterates in both strongly-convex and convex cases. We also prove weighted average iterate lower bounds for arbitrary permutation-based SGD, which apply to all variants that carefully choose the best permutation. Our bounds improve the existing bounds in factors of $n$ and $\kappa$ and thereby match the upper bounds shown for a recently proposed algorithm called GraB.
Approximate Newton policy gradient algorithms
Li, Haoya, Gupta, Samarth, Yu, Hsiangfu, Ying, Lexing, Dhillon, Inderjit
Policy gradient algorithms have been widely applied to Markov decision processes and reinforcement learning problems in recent years. Regularization with various entropy functions is often used to encourage exploration and improve stability. This paper proposes an approximate Newton method for the policy gradient algorithm with entropy regularization. In the case of Shannon entropy, the resulting algorithm reproduces the natural policy gradient algorithm. For other entropy functions, this method results in brand-new policy gradient algorithms. We prove that all these algorithms enjoy Newton-type quadratic convergence and that the corresponding gradient flow converges globally to the optimal solution. We use synthetic and industrial-scale examples to demonstrate that the proposed approximate Newton method typically converges in single-digit iterations, often orders of magnitude faster than other state-of-the-art algorithms.
Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
Song, Zhao, Wang, Yitan, Yu, Zheng, Zhang, Lichen
Sketching is one of the most fundamental tools in large-scale machine learning. It enables runtime and memory saving via randomly compressing the original large problem into lower dimensions. In this paper, we propose a novel sketching scheme for the first order method in large-scale distributed learning setting, such that the communication costs between distributed agents are saved while the convergence of the algorithms is still guaranteed. Given gradient information in a high dimension $d$, the agent passes the compressed information processed by a sketching matrix $R\in \mathbb{R}^{s\times d}$ with $s\ll d$, and the receiver de-compressed via the de-sketching matrix $R^\top$ to ``recover'' the information in original dimension. Using such a framework, we develop algorithms for federated learning with lower communication costs. However, such random sketching does not protect the privacy of local data directly. We show that the gradient leakage problem still exists after applying the sketching technique by presenting a specific gradient attack method. As a remedy, we prove rigorously that the algorithm will be differentially private by adding additional random noises in gradient information, which results in a both communication-efficient and differentially private first order approach for federated learning tasks. Our sketching scheme can be further generalized to other learning settings and might be of independent interest itself.
K-SpecPart: Supervised embedding algorithms and cut overlay for improved hypergraph partitioning
Bustany, Ismail, Kahng, Andrew B., Koutis, Ioannis, Pramanik, Bodhisatta, Wang, Zhiang
State-of-the-art hypergraph partitioners follow the multilevel paradigm that constructs multiple levels of progressively coarser hypergraphs that are used to drive cut refinement on each level of the hierarchy. Multilevel partitioners are subject to two limitations: (i) hypergraph coarsening processes rely on local neighborhood structure without fully considering the global structure of the hypergraph; and (ii) refinement heuristics risk entrapment in local minima. In this paper, we describe K-SpecPart, a supervised spectral framework for multi-way partitioning that directly tackles these two limitations. K-SpecPart relies on the computation of generalized eigenvectors and supervised dimensionality reduction techniques to generate vertex embeddings. These are computational primitives that are fast and capture global structural properties of the hypergraph that are not explicitly considered by existing partitioners. K-SpecPart then converts the vertex embeddings into multiple partitioning solutions. K-SpecPart introduces the idea of ''ensembling'' multiple solutions via a cut-overlay clustering technique that often enables the use of computationally demanding partitioning methods such as ILP (integer linear programming). Using the output of a standard partitioner as a supervision hint, K-SpecPart effectively combines the strengths of established multilevel partitioning techniques with the benefits of spectral graph theory and other combinatorial algorithms. K-SpecPart significantly extends ideas and algorithms that first appeared in our previous work on the bipartitioner SpecPart. Our experiments demonstrate the effectiveness of K-SpecPart. For bipartitioning, K-SpecPart produces solutions with up to 15% cutsize improvement over SpecPart. For multi-way partitioning, K-SpecPart produces solutions with up to 20% cutsize improvement over leading partitioners hMETIS and KaHyPar.
Convergence of the Inexact Langevin Algorithm and Score-based Generative Models in KL Divergence
Yang, Kaylee Yingxi, Wibisono, Andre
We study the Inexact Langevin Dynamics (ILD), Inexact Langevin Algorithm (ILA), and Score-based Generative Modeling (SGM) when utilizing estimated score functions for sampling. Our focus lies in establishing stable biased convergence guarantees in terms of the Kullback-Leibler (KL) divergence. To achieve these guarantees, we impose two key assumptions: 1) the target distribution satisfies the log-Sobolev inequality (LSI), and 2) the score estimator exhibits a bounded Moment Generating Function (MGF) error. Notably, the MGF error assumption we adopt is more lenient compared to the $L^\infty$ error assumption used in existing literature. However, it is stronger than the $L^2$ error assumption utilized in recent works, which often leads to unstable bounds. We explore the question of how to obtain a provably accurate score estimator that satisfies the MGF error assumption. Specifically, we demonstrate that a simple estimator based on kernel density estimation fulfills the MGF error assumption for sub-Gaussian target distribution, at the population level.
Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization
Velingker, Ameya, Vötsch, Maximilian, Woodruff, David P., Zhou, Samson
We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem, where the inputs are a matrix $\mathbf{A}\in\{0,1\}^{n\times d}$, a rank parameter $k>0$, as well as an accuracy parameter $\varepsilon>0$, and the goal is to approximate $\mathbf{A}$ as a product of low-rank factors $\mathbf{U}\in\{0,1\}^{n\times k}$ and $\mathbf{V}\in\{0,1\}^{k\times d}$. Equivalently, we want to find $\mathbf{U}$ and $\mathbf{V}$ that minimize the Frobenius loss $\|\mathbf{U}\mathbf{V} - \mathbf{A}\|_F^2$. Before this work, the state-of-the-art for this problem was the approximation algorithm of Kumar et. al. [ICML 2019], which achieves a $C$-approximation for some constant $C\ge 576$. We give the first $(1+\varepsilon)$-approximation algorithm using running time singly exponential in $k$, where $k$ is typically a small integer. Our techniques generalize to other common variants of the BMF problem, admitting bicriteria $(1+\varepsilon)$-approximation algorithms for $L_p$ loss functions and the setting where matrix operations are performed in $\mathbb{F}_2$. Our approach can be implemented in standard big data models, such as the streaming or distributed models.
Consistent and fast inference in compartmental models of epidemics using Poisson Approximate Likelihoods
Whitehouse, Michael, Whiteley, Nick, Rimella, Lorenzo
Addressing the challenge of scaling-up epidemiological inference to complex and heterogeneous models, we introduce Poisson Approximate Likelihood (PAL) methods. In contrast to the popular ODE approach to compartmental modelling, in which a large population limit is used to motivate a deterministic model, PALs are derived from approximate filtering equations for finite-population, stochastic compartmental models, and the large population limit drives consistency of maximum PAL estimators. Our theoretical results appear to be the first likelihood-based parameter estimation consistency results which apply to a broad class of partially observed stochastic compartmental models and address the large population limit. PALs are simple to implement, involving only elementary arithmetic operations and no tuning parameters, and fast to evaluate, requiring no simulation from the model and having computational cost independent of population size. Through examples we demonstrate how PALs can be used to: fit an age-structured model of influenza, taking advantage of automatic differentiation in Stan; compare over-dispersion mechanisms in a model of rotavirus by embedding PALs within sequential Monte Carlo; and evaluate the role of unit-specific parameters in a meta-population model of measles.