Mathematical & Statistical Methods
Scientists discover for the first time that sperm defy one of Newton's laws of PHYSICS
Scientists have discovered that the way sperm swim defies Newton's third law of motion, which states that every action has an equal and opposite reaction. Researchers at Kyoto University found that the sperm's flagellum, or tail, propels the cell forward by changing its shape to interact with the fluid. Sperm do so in a non-reciprocal way, which violates Newton's third law because they do not elicit an equal and opposite reaction from their surroundings. The flagellum's elasticity also suggests that there should be no movement at all, but instead, sperm whip their tails without releasing much energy into their surroundings. The team used human sperm cells and algae for the research because both have flagella that help them propel through liquid, New Scientist reports.
Sketch-and-Project Meets Newton Method: Global $\mathcal O(k^{-2})$ Convergence with Low-Rank Updates
In this paper, we propose the first sketch-and-project Newton method with a fast $\mathcal O(k^{-2})$ global convergence rate for self-concordant functions. Our method, SGN, can be viewed in three ways: i) as a sketch-and-project algorithm projecting updates of the Newton method, ii) as a cubically regularized Newton method in sketched subspaces, and iii) as a damped Newton method in sketched subspaces. SGN inherits the best of all three worlds: the cheap iteration costs of sketch-and-project methods, the state-of-the-art $\mathcal O(k^{-2})$ global convergence rate of full-rank Newton-like methods, and the algorithmic simplicity of damped Newton methods. Finally, we demonstrate that its empirical performance is comparable to that of baseline algorithms.
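The third view above (a damped Newton method in a sketched subspace) can be illustrated with a minimal sketch. It is not the SGN algorithm from the paper. It assumes rank-one coordinate sketches (the sketch matrix is a single random unit vector), a hypothetical test objective $f(x) = \tfrac12 x^\top A x - b^\top x - \sum_i \log x_i$, which is self-concordant on $x > 0$, and the classical damping factor $1/(1+\lambda)$ for self-concordant functions rather than the step-size rule derived in the paper. All function names are illustrative.

```python
import math
import random


def objective(A, b, x):
    """f(x) = 0.5 x^T A x - b^T x - sum_i log(x_i), defined for x > 0."""
    n = len(x)
    quad = 0.5 * sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    barrier = sum(math.log(v) for v in x)
    return quad - lin - barrier


def sketched_damped_newton(A, b, x0, iters=200, seed=0):
    """Damped Newton steps restricted to one random coordinate per iteration.

    Assumption: the sketch S is a single coordinate vector e_i, so the
    sketched gradient S^T grad f and sketched Hessian S^T H S are scalars.
    """
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    history = [objective(A, b, x)]
    for _ in range(iters):
        i = rng.randrange(n)  # pick the sketch S = e_i
        # Sketched gradient: i-th partial derivative of f.
        g_i = sum(A[i][j] * x[j] for j in range(n)) - b[i] - 1.0 / x[i]
        # Sketched Hessian: i-th diagonal entry of the Hessian of f.
        h_ii = A[i][i] + 1.0 / x[i] ** 2
        # Newton decrement measured in the sketched subspace.
        lam = abs(g_i) / math.sqrt(h_ii)
        # Classical damping for self-concordant functions; keeps x > 0.
        alpha = 1.0 / (1.0 + lam)
        x[i] -= alpha * g_i / h_ii
        history.append(objective(A, b, x))
    return x, history
```

Each iteration touches one row of `A`, which mirrors the low per-iteration cost that motivates sketching. With the damping shown, the objective value does not increase from one step to the next on this test function.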
Learning Continuous Network Emerging Dynamics from Scarce Observations via Data-Adaptive Stochastic Processes
Cui, Jiaxu, Sun, Bingyi, Liu, Jiming, Yang, Bo
Learning network dynamics from the empirical structure and spatio-temporal observation data is crucial to revealing the interaction mechanisms of complex networks in a wide range of domains. However, most existing methods only aim at learning network dynamic behaviors generated by a specific ordinary differential equation instance, which makes them ineffective for new instances, and they generally require dense observations. The observed data, especially from network emerging dynamics, are usually difficult to obtain, which complicates model learning. Therefore, how to learn accurate network dynamics from sparse, irregularly-sampled, partial, and noisy observations remains a fundamental challenge. We introduce Neural ODE Processes for Network Dynamics (NDP4ND), a new class of stochastic processes governed by stochastic data-adaptive network dynamics, to overcome the challenge and learn continuous network dynamics from scarce observations. Extensive experiments conducted on various network dynamics in ecological population evolution, phototaxis movement, brain activity, epidemic spreading, and real-world empirical systems demonstrate that the proposed method has excellent data adaptability and computational efficiency, and can adapt to unseen network emerging dynamics, producing accurate interpolation and extrapolation while reducing the ratio of required observation data to only about 6\% and improving the learning speed for new dynamics by three orders of magnitude.
Data-Driven Modeling and Analysis of Transmission Error in Harmonic Drive Systems: Nonlinear Dynamics, Error Modeling, and Compensation Techniques
Harmonic drive systems (HDS) are high-precision robotic transmissions featuring compact size and high gear ratios. However, issues like kinematic transmission errors hamper their precision performance. This article focuses on data-driven modeling and analysis of an HDS to improve kinematic error compensation. The background introduces HDS mechanics, nonlinear attributes, and modeling approaches from the literature. The HDS dynamics are derived using Lagrange equations. Experiments under aggressive conditions provide training data exhibiting deterministic patterns. Various linear and nonlinear models have been developed. The best-performing model, based on a nonlinear neural network, achieves over 98\% accuracy for one-step predictions on both the training and validation data sets. A phenomenological model separates the kinematic error into a periodic pure part and a flexible part. Apart from the implementation of estimated transmission error injection compensation, novel compensation mechanisms and policies for the kinematic error are analyzed and proposed, including nonlinear model predictive control and frequency loop-shaping. The feedback loop is analyzed to select the controller for vibration mitigation. The main contributions include the nonlinear dynamics derivation, data-driven nonlinear modeling of flexible kinematic errors, repeatable experiment design, and the proposed novel compensation mechanisms and policies. Future work involves using physics-informed neural networks, sensitivity analysis, full life-cycle monitoring, and extracting physical laws directly from data.
AutoTrans: A Complete Planning and Control Framework for Autonomous UAV Payload Transportation
Li, Haojia, Wang, Haokun, Feng, Chen, Gao, Fei, Zhou, Boyu, Shen, Shaojie
The robotics community is increasingly interested in autonomous aerial transportation. Unmanned aerial vehicles with suspended payloads have advantages over other systems, including mechanical simplicity and agility, but pose great challenges in planning and control. To realize fully autonomous aerial transportation, this paper presents a systematic solution to address these difficulties. First, we present a real-time planning method that generates smooth trajectories considering the time-varying shape and non-linear dynamics of the system, ensuring whole-body safety and dynamic feasibility. Additionally, an adaptive NMPC with a hierarchical disturbance compensation strategy is designed to overcome unknown external perturbations and inaccurate model parameters. Extensive experiments show that our method is capable of generating high-quality trajectories online, even in highly constrained environments, and tracking aggressive flight trajectories accurately, even under significant uncertainty. We plan to release our code to benefit the community.
Predicting Accurate Lagrangian Multipliers for Mixed Integer Linear Programs
Demelas, Francesco, Roux, Joseph Le, Lacroix, Mathieu, Parmentier, Axel
Lagrangian relaxation stands among the most efficient approaches for solving Mixed Integer Linear Programs (MILPs) with difficult constraints. Given any duals for these constraints, called Lagrangian Multipliers (LMs), it returns a bound on the optimal value of the MILP, and Lagrangian methods seek the LMs giving the best such bound. But these methods generally rely on iterative algorithms resembling gradient descent to maximize the concave piecewise linear dual function: the computational burden grows quickly with the number of relaxed constraints. We introduce a deep learning approach that bypasses the descent, effectively amortizing the local, per instance, optimization. A probabilistic encoder based on a graph convolutional network computes high-dimensional representations of relaxed constraints in MILP instances. A decoder then turns these representations into LMs. We train the encoder and decoder jointly by directly optimizing the bound obtained from the predicted multipliers. Numerical experiments show that our approach closes up to 85\% of the gap between the continuous relaxation and the best Lagrangian bound, and provides a high-quality warm-start for descent-based Lagrangian methods.
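The iterative baseline described above, which the learned predictor is meant to bypass or warm-start, can be sketched as projected subgradient ascent on the dual function. The example below is a minimal illustration, not the authors' implementation. It assumes a toy problem of the form $\min c^\top x$ subject to $Ax \ge b$, $x \in \{0,1\}^n$, where all constraints $Ax \ge b$ are relaxed, and it uses a hypothetical $1/k$ step-size schedule. All function names are illustrative.

```python
from itertools import product


def lagrangian_bound(c, A, b, lam):
    """Evaluate L(lam) = min over binary x of c^T x + lam^T (b - A x).

    The inner minimization separates by variable: set x_j = 1 exactly
    when its reduced cost c_j - (lam^T A)_j is negative.
    """
    m, n = len(b), len(c)
    reduced = [c[j] - sum(lam[i] * A[i][j] for i in range(m)) for j in range(n)]
    x = [1 if r < 0 else 0 for r in reduced]
    value = sum(r * xj for r, xj in zip(reduced, x))
    value += sum(lam[i] * b[i] for i in range(m))
    return value, x


def subgradient_ascent(c, A, b, iters=100, step0=1.0):
    """Maximize the concave piecewise linear dual over lam >= 0."""
    m, n = len(b), len(c)
    lam = [0.0] * m
    best_value, best_lam = float("-inf"), list(lam)
    for k in range(1, iters + 1):
        value, x = lagrangian_bound(c, A, b, lam)
        if value > best_value:
            best_value, best_lam = value, list(lam)
        # A subgradient of the dual at lam is the constraint violation b - A x.
        sub = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        # Assumed diminishing step size step0 / k, projected onto lam >= 0.
        lam = [max(0.0, lam[i] + (step0 / k) * sub[i]) for i in range(m)]
    return best_value, best_lam


def brute_force_optimum(c, A, b):
    """Exact optimum by enumeration; only viable for tiny instances."""
    m, n = len(b), len(c)
    feasible = (
        x for x in product((0, 1), repeat=n)
        if all(sum(A[i][j] * x[j] for j in range(n)) >= b[i] for i in range(m))
    )
    return min(sum(c[j] * x[j] for j in range(n)) for x in feasible)
```

For any nonnegative multipliers, `lagrangian_bound` returns a lower bound on the optimal value of this minimization problem, so the multipliers found by `subgradient_ascent` could be compared against those produced by a learned predictor on the same instance.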
Promoting Generalization for Exact Solvers via Adversarial Instance Augmentation
Liu, Haoyang, Kuang, Yufei, Wang, Jie, Li, Xijun, Zhang, Yongdong, Wu, Feng
Machine learning has been successfully applied to improve the efficiency of Mixed-Integer Linear Programming (MILP) solvers. However, learning-based solvers often suffer from severe performance degradation on unseen MILP instances -- especially on large-scale instances from a perturbed environment -- due to the limited diversity of training distributions. To tackle this problem, we propose a novel approach, Adversarial Instance Augmentation, which does not require knowing the problem type for new instance generation, to promote data diversity for learning-based branching modules in branch-and-bound (B&B) solvers (AdaSolver). We use bipartite graph representations for MILP instances and obtain various perturbed instances to regularize the solver by augmenting the graph structures with a learned augmentation policy. The major technical contribution of AdaSolver is that we formulate the non-differentiable instance augmentation as a contextual bandit problem and adversarially train the learning-based solver and augmentation policy, enabling efficient gradient-based training of the augmentation policy. To the best of our knowledge, AdaSolver is the first general and effective framework for understanding and improving the generalization of both imitation-learning-based (IL-based) and reinforcement-learning-based (RL-based) B&B solvers. Extensive experiments demonstrate that by producing various augmented instances, AdaSolver leads to a remarkable efficiency improvement across various distributions.
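The bipartite graph representation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not AdaSolver's data pipeline: one node per variable, one node per constraint, and an edge wherever a constraint has a nonzero coefficient on a variable. The node identifiers and the feature fields (`obj_coef`, `rhs`) are hypothetical choices made for this example.

```python
def milp_to_bipartite(c, A, b):
    """Build a bipartite graph for an MILP with objective c and rows A x (<=, >=, =) b.

    Returns variable nodes, constraint nodes, and weighted edges.
    The feature names used here are illustrative assumptions.
    """
    variable_nodes = [{"id": f"x{j}", "obj_coef": c[j]} for j in range(len(c))]
    constraint_nodes = [{"id": f"r{i}", "rhs": b[i]} for i in range(len(b))]
    # One edge per nonzero coefficient, weighted by that coefficient.
    edges = [
        (f"r{i}", f"x{j}", A[i][j])
        for i in range(len(b))
        for j in range(len(c))
        if A[i][j] != 0
    ]
    return variable_nodes, constraint_nodes, edges
```

In this representation, perturbing an instance amounts to editing the graph, for example by changing an edge weight or adding and removing edges.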
Submodular Maximization subject to a Knapsack Constraint: Combinatorial Algorithms with Near-optimal Adaptive Complexity
Amanatidis, Georgios, Fusco, Federico, Lazos, Philip, Leonardi, Stefano, Spaccamela, Alberto Marchetti, Reiffenhäuser, Rebecca
Submodular maximization is a classic algorithmic problem with multiple applications in data mining and machine learning; there, the growing need to deal with massive instances motivates the design of algorithms balancing the quality of the solution with applicability. For the latter, an important measure is the adaptive complexity, which captures the number of sequential rounds of parallel computation needed by an algorithm to terminate. In this work we obtain the first constant factor approximation algorithm for non-monotone submodular maximization subject to a knapsack constraint with near-optimal $O(\log n)$ adaptive complexity. Low adaptivity by itself, however, is not enough: a crucial feature to account for is represented by the total number of function evaluations (or value queries). Our algorithm asks $\tilde{O}(n^2)$ value queries, but can be modified to run with only $\tilde{O}(n)$ instead, while retaining a low adaptive complexity of $O(\log^2n)$. Besides the above improvement in adaptivity, this is also the first combinatorial approach with sublinear adaptive complexity for the problem and yields algorithms comparable to the state-of-the-art even for the special cases of cardinality constraints or monotone objectives.
On the Overlooked Structure of Stochastic Gradients
Xie, Zeke, Tang, Qian-Yuan, Sun, Mingming, Li, Ping
Stochastic gradients closely relate to both optimization and generalization of deep neural networks (DNNs). Some works attempted to explain the success of stochastic optimization for deep learning by the arguably heavy-tail properties of gradient noise, while other works presented theoretical and empirical evidence against the heavy-tail hypothesis on gradient noise. Unfortunately, formal statistical tests for analyzing the structure and heavy tails of stochastic gradients in deep learning are still under-explored. In this paper, we mainly make two contributions. First, we conduct formal statistical tests on the distribution of stochastic gradients and gradient noise across both parameters and iterations. Our statistical tests reveal that dimension-wise gradients usually exhibit power-law heavy tails, while iteration-wise gradients and stochastic gradient noise caused by minibatch training usually do not exhibit power-law heavy tails. Second, we further discover that the covariance spectra of stochastic gradients have power-law structures overlooked by previous studies, and we present their theoretical implications for the training of DNNs. While previous studies believed that the anisotropic structure of stochastic gradients matters to deep learning, they did not expect that the gradient covariance could have such an elegant mathematical structure. Our work challenges the existing belief and provides novel insights on the structure of stochastic gradients in deep learning.
DIG-MILP: a Deep Instance Generator for Mixed-Integer Linear Programming with Feasibility Guarantee
Wang, Haoyu, Liu, Jialin, Chen, Xiaohan, Wang, Xinshang, Li, Pan, Yin, Wotao
Mixed-integer linear programming (MILP) stands as a notable NP-hard problem pivotal to numerous crucial industrial applications. The development of effective algorithms, the tuning of solvers, and the training of machine learning models for MILP resolution all hinge on access to extensive, diverse, and representative data. Yet compared to the abundant naturally occurring data in image and text realms, MILP is markedly data deficient, underscoring the vital role of synthetic MILP generation. We present DIG-MILP, a deep generative framework based on a variational auto-encoder (VAE), adept at extracting deep-level structural features from highly limited MILP data and producing instances that closely mirror the target data. Notably, by leveraging MILP duality, DIG-MILP guarantees a correct and complete generation space and ensures the boundedness and feasibility of the generated instances. Our empirical study highlights the novelty and quality of the instances generated by DIG-MILP through two distinct downstream tasks: (S1) Data Sharing, where solver solution times are highly positively correlated between original and DIG-MILP-generated instances, allowing data sharing for solver tuning without publishing the original data; (S2) Data Augmentation, wherein the DIG-MILP-generated instances bolster the generalization performance of machine learning models tasked with resolving MILP problems.