Mathematical & Statistical Methods
Using Graph Theory for Improving Machine Learning-based Detection of Cyber Attacks
Zonneveld, Giacomo, Principi, Lorenzo, Baldi, Marco
Early detection of network intrusions and cyber threats is one of the main pillars of cybersecurity. One of the most effective approaches for this purpose is to analyze network traffic with the help of artificial intelligence algorithms, with the aim of detecting the possible presence of an attacker by distinguishing it from a legitimate user. This is commonly done by collecting the traffic exchanged between terminals in a network and analyzing it on a per-packet or per-connection basis. In this paper, we propose instead to perform pre-processing of network traffic under analysis with the aim of extracting some new metrics on which we can perform more efficient detection and overcome some limitations of classical approaches. These new metrics are based on graph theory, and consider the network as a whole, rather than focusing on individual packets or connections. Our approach is validated through experiments performed on publicly available data sets, from which it results that it can not only overcome some of the limitations of classical approaches, but also achieve a better detection capability of cyber threats.
Global optimality under amenable symmetry constraints
We ask whether there exists a function or measure that (1) minimizes a given convex functional or risk and (2) satisfies a symmetry property specified by an amenable group of transformations. Examples of such symmetry properties are invariance, equivariance, or quasi-invariance. Our results draw on old ideas of Stein and Le Cam and on approximate group averages that appear in ergodic theorems for amenable groups. A class of convex sets known as orbitopes in convex analysis emerges as crucial, and we establish properties of such orbitopes in nonparametric settings. We also show how a simple device called a cocycle can be used to reduce different forms of symmetry to a single problem. As applications, we obtain results on invariant kernel mean embeddings and a Monge-Kantorovich theorem on optimality of transport plans under symmetry constraints. We also explain connections to the Hunt-Stein theorem on invariant tests.
Logistic-beta processes for modeling dependent random probabilities with beta marginals
Lee, Changwoo J., Zito, Alessandro, Sang, Huiyan, Dunson, David B.
The beta distribution serves as a canonical tool for modeling probabilities and is extensively used in statistics and machine learning, especially in the field of Bayesian nonparametrics. Despite its widespread use, there is limited work on flexible and computationally convenient stochastic process extensions for modeling dependent random probabilities. We propose a novel stochastic process called the logistic-beta process, whose logistic transformation yields a stochastic process with common beta marginals. Similar to the Gaussian process, the logistic-beta process can model dependence on both discrete and continuous domains, such as space or time, and has a highly flexible dependence structure through correlation kernels. Moreover, its normal variance-mean mixture representation leads to highly effective posterior inference algorithms. The flexibility and computational benefits of logistic-beta processes are demonstrated through nonparametric binary regression simulation studies. Furthermore, we apply the logistic-beta process in modeling dependent Dirichlet processes, and illustrate its application and benefits through Bayesian density regression problems in a toxicology study.
Quantum Speedup for Spectral Approximation of Kronecker Products
Gao, Yeqi, Song, Zhao, Zhang, Ruizhe
Given its widespread application in machine learning and optimization, the Kronecker product emerges as a pivotal linear algebra operator. However, its computational demands render it an expensive operation, leading to heightened costs in spectral approximation of it through traditional computation algorithms. Existing classical methods for spectral approximation exhibit a linear dependency on the matrix dimension denoted by $n$, considering matrices of size $A_1 \in \mathbb{R}^{n \times d}$ and $A_2 \in \mathbb{R}^{n \times d}$. Our work introduces an innovative approach to efficiently address the spectral approximation of the Kronecker product $A_1 \otimes A_2$ using quantum methods. By treating matrices as quantum states, our proposed method significantly reduces the time complexity of spectral approximation to $O_{d,\epsilon}(\sqrt{n})$.
Safe Guaranteed Exploration for Non-linear Systems
Prajapat, Manish, Köhler, Johannes, Turchetta, Matteo, Krause, Andreas, Zeilinger, Melanie N.
Safely exploring environments with a-priori unknown constraints is a fundamental challenge that restricts the autonomy of robots. While safety is paramount, guarantees on sufficient exploration are also crucial for ensuring autonomous task completion. To address these challenges, we propose a novel safe guaranteed exploration framework using optimal control, which achieves first-of-its-kind results: guaranteed exploration for non-linear systems with finite time sample complexity bounds, while being provably safe with arbitrarily high probability. The framework is general and applicable to many real-world scenarios with complex non-linear dynamics and unknown domains. Based on this framework we propose an efficient algorithm, SageMPC, SAfe Guaranteed Exploration using Model Predictive Control. SageMPC improves efficiency by incorporating three techniques: i) exploiting a Lipschitz bound, ii) goal-directed exploration, and iii) receding horizon style re-planning, all while maintaining the desired sample complexity, safety and exploration guarantees of the framework. Lastly, we demonstrate safe efficient exploration in challenging unknown environments using SageMPC with a car model.
Machine Learning Augmented Branch and Bound for Mixed Integer Linear Programming
Scavuzzo, Lara, Aardal, Karen, Lodi, Andrea, Yorke-Smith, Neil
Mixed Integer Linear Programming (MILP) is a pillar of mathematical optimization that offers a powerful modeling language for a wide range of applications. During the past decades, enormous algorithmic progress has been made in solving MILPs, and many commercial and academic software packages exist. Nevertheless, the availability of data, both from problem instances and from solvers, and the desire to solve new problems and larger (real-life) instances, trigger the need for continuing algorithmic development. MILP solvers use branch and bound as their main component. In recent years, there has been an explosive development in the use of machine learning algorithms for enhancing all main tasks involved in the branch-and-bound algorithm, such as primal heuristics, branching, cutting planes, node selection and solver configuration decisions. This paper presents a survey of such approaches, addressing the vision of integration of machine learning and mathematical optimization as complementary technologies, and how this integration can benefit MILP solving. In particular, we give detailed attention to machine learning algorithms that automatically optimize some metric of branch-and-bound efficiency. We also address how to represent MILPs in the context of applying learning algorithms, MILP benchmarks and software.
Continuous Multidimensional Scaling
Trosset, Michael W., Priebe, Carey E.
Multidimensional scaling (MDS) is the act of embedding proximity information about a set of $n$ objects in $d$-dimensional Euclidean space. As originally conceived by the psychometric community, MDS was concerned with embedding a fixed set of proximities associated with a fixed set of objects. Modern concerns, e.g., that arise in developing asymptotic theories for statistical inference on random graphs, more typically involve studying the limiting behavior of a sequence of proximities associated with an increasing set of objects. Standard results from the theory of point-to-set maps imply that, if $n$ is fixed and a sequence of proximities converges, then the limit of the embedded structures is the embedded structure of the limiting proximities. But what if $n$ increases? It then becomes necessary to reformulate MDS so that the entire sequence of embedding problems can be viewed as a sequence of optimization problems in a fixed space. We present such a reformulation and derive some consequences.
Curriculum reinforcement learning for quantum architecture search under hardware errors
Patel, Yash J., Kundu, Akash, Ostaszewski, Mateusz, Bonet-Monroig, Xavier, Dunjko, Vedran, Danaci, Onur
The key challenge in the noisy intermediate-scale quantum era is finding useful circuits compatible with current device limitations. Variational quantum algorithms (VQAs) offer a potential solution by fixing the circuit architecture and optimizing individual gate parameters in an external loop. However, parameter optimization can become intractable, and the overall performance of the algorithm depends heavily on the initially chosen circuit architecture. Several quantum architecture search (QAS) algorithms have been developed to design useful circuit architectures automatically. In the case of parameter optimization alone, noise effects have been observed to dramatically influence the performance of the optimizer and final outcomes, which is a key line of study. However, the effects of noise on the architecture search, which could be just as critical, are poorly understood. This work addresses this gap by introducing a curriculum-based reinforcement learning QAS (CRLQAS) algorithm designed to tackle challenges in realistic VQA deployment. The algorithm incorporates (i) a 3D architecture encoding and restrictions on environment dynamics to explore the search space of possible circuits efficiently, (ii) an episode halting scheme to steer the agent to find shorter circuits, and (iii) a novel variant of simultaneous perturbation stochastic approximation as an optimizer for faster convergence. To facilitate studies, we developed an optimized simulator for our algorithm, significantly improving computational efficiency in simulating noisy quantum circuits by employing the Pauli-transfer matrix formalism in the Pauli-Liouville basis. Numerical experiments focusing on quantum chemistry tasks demonstrate that CRLQAS outperforms existing QAS algorithms across several metrics in both noiseless and noisy environments.
Variational Flow Models: Flowing in Your Style
Do, Kien, Kieu, Duc, Nguyen, Toan, Nguyen, Dang, Le, Hung, Nguyen, Dung, Nguyen, Thin
We introduce a variational inference interpretation for models of "posterior flows" - generalizations of "probability flows" to a broader class of stochastic processes not necessarily diffusion processes. We coin the resulting models as "Variational Flow Models". Additionally, we propose a systematic training-free method to transform the posterior flow of a "linear" stochastic process characterized by the equation Xt = at * X0 + st * X1 into a straight constant-speed (SC) flow, reminiscent of Rectified Flow. This transformation facilitates fast sampling along the original posterior flow without training a new model of the SC flow. The flexibility of our approach allows us to extend our transformation to inter-convert two posterior flows from distinct "linear" stochastic processes. Moreover, we can easily integrate high-order numerical solvers into the transformed SC flow, further enhancing sampling accuracy and efficiency. Rigorous theoretical analysis and extensive experimental results substantiate the advantages of our framework.
Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates
Liu, Zhuanghua, Luo, Luo, Low, Bryan Kian Hsiang
We consider the finite-sum optimization problem, where each component function is strongly convex and has Lipschitz continuous gradient and Hessian. The recently proposed incremental quasi-Newton method is based on BFGS update and achieves a local superlinear convergence rate that is dependent on the condition number of the problem. This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework, which results in the condition-number-free local superlinear convergence rate. Furthermore, we can boost our method by applying the block update on the Hessian approximation, which leads to an even faster local convergence rate. The numerical experiments show the proposed methods significantly outperform the baseline methods.