Adversarial Attacks on Probabilistic Autoregressive Forecasting Models

arXiv.org Machine Learning

We develop an effective method for generating adversarial attacks on neural models that output a sequence of probability distributions rather than a sequence of single values. This setting includes the recently proposed deep probabilistic autoregressive forecasting models that estimate the probability distribution of a time series given its past and achieve state-of-the-art results in a diverse set of application domains. The key technical challenge we address is effectively differentiating through the Monte-Carlo estimation of statistics of the joint distribution of the output sequence. Additionally, we extend prior work on probabilistic forecasting to the Bayesian setting, which allows conditioning on future observations instead of only on past observations. We demonstrate that our approach can successfully generate attacks with small input perturbations on two challenging tasks where robust decision making is crucial: stock market trading and prediction of electricity consumption.
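As an illustration of what differentiating through a Monte-Carlo estimate can look like, the sketch below uses the reparameterization trick for a single Gaussian output. This is one standard estimator and is an assumption here; the abstract does not say which estimator the authors use. The function name `reparam_grad_mean` and the test function $f(x) = x^2$ are hypothetical choices made for the example.

```python
import numpy as np

def reparam_grad_mean(mu, sigma, df, n_samples=200_000, seed=0):
    """Monte Carlo estimate of d/dmu E[f(x)] for x ~ N(mu, sigma^2).

    Writing x = mu + sigma * eps with eps ~ N(0, 1) moves the dependence
    on mu out of the sampling step, so the gradient passes through the
    samples: d/dmu f(x) = f'(x) * dx/dmu = f'(x).
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)
    x = mu + sigma * eps          # reparameterized samples, dx/dmu = 1
    return float(np.mean(df(x)))  # average of per-sample gradients

# Hypothetical statistic f(x) = x^2, so f'(x) = 2x and the exact
# gradient of E[x^2] = mu^2 + sigma^2 with respect to mu is 2 * mu.
grad = reparam_grad_mean(mu=1.5, sigma=1.0, df=lambda x: 2.0 * x)
```

An attack would chain such a gradient back to the model input; that step depends on the forecasting model and is not shown.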


Security of Distributed Machine Learning: A Game-Theoretic Approach to Design Secure DSVM

arXiv.org Machine Learning

Distributed machine learning algorithms play a significant role in processing massive data sets over large networks. However, the increasing reliance of machine learning on information and communication technologies (ICTs) makes it inherently vulnerable to cyber threats. This work aims to develop secure distributed algorithms to protect the learning from data poisoning and network attacks. We establish a game-theoretic framework to capture the conflicting goals of a learner who uses distributed support vector machines (SVMs) and an attacker who is capable of modifying training data and labels. We develop a fully distributed and iterative algorithm to capture real-time reactions of the learner at each node to adversarial behaviors. The numerical results show that distributed SVM is prone to fail under different types of attacks, and that their impact depends strongly on the network structure and attack capabilities.


FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis

arXiv.org Machine Learning

Balancing performance and safety is crucial to deploying autonomous vehicles in multi-agent environments. In particular, autonomous racing is a domain that penalizes safe but conservative policies, highlighting the need for robust, adaptive strategies. Current approaches either make simplifying assumptions about other agents or lack robust mechanisms for online adaptation. This work makes algorithmic contributions to both challenges. First, to generate a realistic, diverse set of opponents, we develop a novel method for self-play based on replica-exchange Markov chain Monte Carlo. Second, we propose a distributionally robust bandit optimization procedure that adaptively adjusts risk aversion relative to uncertainty in beliefs about opponents' behaviors. We rigorously quantify the tradeoffs in performance and robustness when approximating these computations in real-time motion-planning, and we demonstrate our methods experimentally on autonomous vehicles that achieve scaled speeds comparable to Formula One racecars.


A working likelihood approach to support vector regression with a data-driven insensitivity parameter

arXiv.org Machine Learning

The insensitivity parameter in support vector regression determines the set of support vectors, which greatly affects the prediction. A data-driven approach is proposed to determine an approximate value for this insensitivity parameter by minimizing a generalized loss function originating from the likelihood principle. This data-driven support vector regression also statistically standardizes samples using the noise scale. Nonlinear and linear numerical simulations with three types of noise ($\epsilon$-Laplacian distribution, normal distribution, and uniform distribution), together with five real benchmark data sets, are used to test the capacity of the proposed method. Based on all of the simulations and the five case studies, the proposed support vector regression with a working-likelihood, data-driven insensitivity parameter is superior and has lower computational costs.
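For readers unfamiliar with the role of this parameter, the sketch below shows the standard $\epsilon$-insensitive loss of support vector regression, $\max(0, |y - \hat{y}| - \epsilon)$. It is the textbook loss only; the generalized loss and the working-likelihood procedure from the abstract are not reproduced, and the function name and sample values are hypothetical.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps):
    """Standard SVR loss: residuals inside the eps-tube cost nothing."""
    residual = np.abs(np.asarray(y_true, dtype=float)
                      - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, residual - eps)

# Residuals are 0.05, 0.5 and 2.0; only the last two exceed eps = 0.1.
losses = eps_insensitive_loss([1.0, 2.0, 3.0], [1.05, 2.5, 1.0], eps=0.1)
```

A larger `eps` leaves more points inside the tube with zero loss, which is why its value changes which samples become support vectors.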


COPT: Coordinated Optimal Transport on Graphs

arXiv.org Machine Learning

We introduce COPT, a novel distance metric between graphs defined via an optimization routine that computes a coordinated pair of optimal transport maps simultaneously. This gives an unsupervised way to learn general-purpose graph representations that can be used for both graph sketching and graph comparison. COPT involves simultaneously optimizing dual transport plans, one between the vertices of two graphs, and another between graph signal probability distributions. We show both theoretically and empirically that our method preserves important global structural information on graphs, in particular spectral information, making it well-suited for tasks on graphs including retrieval, classification, summarization, and visualization.


Nearly Optimal Risk Bounds for Kernel K-Means

arXiv.org Machine Learning

In this paper, we study the statistical properties of kernel $k$-means and obtain a nearly optimal excess risk bound, substantially improving the state-of-the-art bounds in existing clustering risk analyses. We further analyze the statistical effect of the computational approximation in Nyström kernel $k$-means, and demonstrate that it achieves the same statistical accuracy as exact kernel $k$-means while using only $\sqrt{nk}$ Nyström landmark points. To the best of our knowledge, such sharp excess risk bounds for kernel (or approximate kernel) $k$-means have not been obtained before.
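To make the landmark count concrete, the following sketch builds a Nyström feature map from $\lceil\sqrt{nk}\rceil$ landmarks; ordinary $k$-means could then be run on the resulting features. Uniform landmark sampling, the RBF kernel, and the names `rbf_kernel` and `nystrom_features` are assumptions made for illustration, not details taken from the paper.

```python
import math
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, k, gamma=1.0, seed=0):
    """Map X to features Phi with Phi @ Phi.T ~= K, using sqrt(n*k) landmarks."""
    n = X.shape[0]
    m = min(n, math.ceil(math.sqrt(n * k)))       # landmark count
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=m, replace=False)    # uniform sampling (assumption)
    K_nm = rbf_kernel(X, X[idx], gamma)
    K_mm = rbf_kernel(X[idx], X[idx], gamma)
    w, V = np.linalg.eigh(K_mm)
    keep = w > 1e-10                              # drop numerically null directions
    inv_sqrt = V[:, keep] / np.sqrt(w[keep])      # pseudo-inverse square root factor
    return K_nm @ inv_sqrt, idx

X = np.random.default_rng(1).standard_normal((100, 2))
Phi, idx = nystrom_features(X, k=4)               # 20 landmarks for n=100, k=4
```

Running $k$-means on `Phi` costs time linear in $n$ for a fixed number of landmarks, rather than working with the full $n \times n$ kernel matrix.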


Multivariate Boosted Trees and Applications to Forecasting and Control

arXiv.org Machine Learning

Gradient boosted trees are competition-winning, general-purpose, non-parametric regressors, which exploit sequential model fitting and gradient descent to minimize a specific loss function. The most popular implementations are tailored to univariate regression and classification tasks, precluding the possibility of capturing multivariate target cross-correlations and applying conditional penalties to the predictions. In this paper, we present a computationally efficient algorithm for fitting multivariate boosted trees. We show that multivariate trees can outperform their univariate counterparts when the predictions are correlated. Furthermore, the algorithm allows arbitrary regularization of the predictions, so that properties like smoothness, consistency and functional relations can be enforced. We present applications and numerical results related to forecasting and control.


A Comparative Study on Parameter Estimation in Software Reliability Modeling using Swarm Intelligence

arXiv.org Artificial Intelligence

This work compares the performance of two well-known swarm algorithms, Cuckoo Search (CS) and the Firefly Algorithm (FA), in estimating the parameters of Software Reliability Growth Models (SRGMs). The study is further reinforced using Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO). All algorithms are evaluated on real software failure data, and the results are compared to show the performance of each algorithm. Furthermore, CS and FA are also compared with each other on the basis of execution time and iteration number. Experimental results show that CS is more efficient in estimating the parameters of SRGMs, and that it outperforms FA as well as PSO and ACO for the selected data sets and employed models.


Neighborhood Information-based Probabilistic Algorithm for Network Disintegration

arXiv.org Artificial Intelligence

Many real-world applications can be modelled as complex networks, including the Internet, epidemic disease networks, transport networks, power grids, protein-folding structures and others. Network integrity and robustness are important to ensure that crucial networks are protected and undesired harmful networks can be dismantled. Network structure and integrity can be controlled by a set of key nodes, and finding the optimal combination of nodes that ensures network structure and integrity can be an NP-complete problem. Despite extensive studies, existing methods have many limitations and there are still many unresolved problems. This paper presents a probabilistic approach based on neighborhood information and node importance, namely, the neighborhood information-based probabilistic algorithm (NIPA). We also define a new centrality-based importance measure (IM), which combines the contribution ratios of the neighbor nodes of each target node and two-hop node information. Our proposed NIPA has been tested on different network benchmarks and compared with three other methods: optimal attack strategy (OAS), high betweenness first (HBF) and high degree first (HDF). Experiments suggest that the proposed NIPA is the most effective among all four methods. In general, NIPA can identify the most crucial node combination with higher effectiveness, and the set of optimal key nodes found by NIPA is much smaller than that found by heuristic centrality prediction. In addition, many previously neglected weakly connected nodes are identified, which become a crucial part of the newly identified optimal nodes. Thus, revised protection strategies are recommended to safeguard network integrity. Further key issues and future research topics are also discussed.
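For context on the baselines, the sketch below shows one common reading of high degree first (HDF): repeatedly remove the node with the highest current degree and record the size of the largest remaining connected component. Recomputing degrees after each removal, the tie-breaking rule, and the function names are assumptions; NIPA and its importance measure are not reproduced here.

```python
from collections import deque

def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

def high_degree_first(adj, budget):
    """Remove up to `budget` nodes, highest current degree first."""
    adj = {u: set(vs) for u, vs in adj.items()}      # work on a copy
    removed, sizes = [], []
    for _ in range(min(budget, len(adj))):
        target = max(adj, key=lambda u: len(adj[u]))  # ties: first in dict order
        for v in adj.pop(target):
            adj[v].discard(target)                    # detach from neighbors
        removed.append(target)
        sizes.append(largest_component_size(adj))
    return removed, sizes

# Star graph: removing the hub isolates every leaf.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
removed, sizes = high_degree_first(star, budget=1)
```

The largest-component size after each removal is a common way to score how quickly an attack strategy disintegrates a network.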


Dependently Typed Knowledge Graphs

arXiv.org Artificial Intelligence

Reasoning over knowledge graphs is traditionally built upon a hierarchy of languages in the Semantic Web Stack. Starting from the Resource Description Framework (RDF) for knowledge graphs, more advanced constructs have been introduced through various syntax extensions to add reasoning capabilities to knowledge graphs. In this paper, we show how standardized semantic web technologies (RDF and its query language SPARQL) can be reproduced in a unified manner with dependent type theory. In addition to providing the basic functionalities of knowledge graphs, dependent types add expressiveness in encoding both entities and queries, explainability in answers to queries through witnesses, and compositionality and automation in the construction of witnesses. Using the Coq proof assistant, we demonstrate how to build and query dependently typed knowledge graphs as a proof of concept for future work in this direction.