Graph Matching Networks for Learning the Similarity of Graph Structured Objects

arXiv.org Machine Learning

This paper addresses the challenging problem of retrieval and matching of graph structured objects, and makes two key contributions. First, we demonstrate how Graph Neural Networks (GNN), which have emerged as an effective model for various supervised prediction problems defined on structured data, can be trained to produce embeddings of graphs in vector spaces that enable efficient similarity reasoning. Second, we propose a novel Graph Matching Network model that, given a pair of graphs as input, computes a similarity score between them by jointly reasoning on the pair through a new cross-graph attention-based matching mechanism. We demonstrate the effectiveness of our models on different domains including the challenging problem of control-flow-graph based function similarity search that plays an important role in the detection of vulnerabilities in software systems. The experimental analysis demonstrates that our models are not only able to exploit structure in the context of similarity learning but they can also outperform domain-specific baseline systems that have been carefully hand-engineered for these problems.
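
The abstract does not spell out the cross-graph attention mechanism, so the following is only a minimal sketch of one plausible form, assuming dot-product similarity between node embeddings and a softmax over the nodes of the other graph. The function name `cross_graph_match` and the choice of similarity are assumptions made for illustration, not the authors' implementation.

```python
import math

def cross_graph_match(h1, h2):
    """For each node vector in h1, return its difference from a softmax-weighted
    average of the node vectors in h2 (assumed dot-product attention)."""
    matches = []
    for hi in h1:
        # Similarity of this node to every node in the other graph.
        scores = [sum(a * b for a, b in zip(hi, hj)) for hj in h2]
        # Numerically stable softmax over the other graph's nodes.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Attention-weighted summary of the other graph.
        attended = [sum(w * hj[k] for w, hj in zip(weights, h2))
                    for k in range(len(hi))]
        # A small difference means the node is well matched in the other graph.
        matches.append([x - y for x, y in zip(hi, attended)])
    return matches
```

In this sketch, a node whose embedding is reproduced by the attended summary of the other graph yields a zero vector, which is the intuition behind using such a signal when scoring a pair of graphs jointly.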


Nonparametric feature extraction based on Minimax distance

arXiv.org Artificial Intelligence

We investigate the use of Minimax distances to extract in a nonparametric way the features that capture the unknown underlying patterns and structures in the data. We develop a general-purpose framework to employ Minimax distances with many machine learning methods that operate on numerical data. For this purpose, first, we compute the pairwise Minimax distances between the objects, using the equivalence of Minimax distances over a graph and over a minimum spanning tree constructed on it. Then, we perform an embedding of the pairwise Minimax distances into a new vector space, such that the squared Euclidean distances in the new space equal the pairwise Minimax distances in the original space. We then study the case of having multiple pairwise Minimax matrices, instead of a single one. For this case, we propose an embedding via first summing up the centered matrices and then performing an eigenvalue decomposition. Finally, we perform several experimental studies to illustrate the effectiveness of our framework.
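
The first step described above, computing pairwise Minimax distances through a minimum spanning tree, can be sketched as follows. This is an illustrative sketch rather than the paper's code: it assumes Euclidean base distances on a complete graph and uses a Kruskal-style merge, where the edge that first joins two components is the largest edge on the tree path between any pair of their members. The function name `minimax_distances` is hypothetical.

```python
from itertools import combinations

def minimax_distances(points):
    """Pairwise Minimax distances: for each pair, the smallest achievable
    value of the largest edge along any connecting path."""
    n = len(points)

    def dist(a, b):
        # Assumed base dissimilarity: Euclidean distance.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    root = list(range(n))                    # component label of each point
    members = {i: [i] for i in range(n)}     # component label -> its points
    result = [[0.0] * n for _ in range(n)]

    for weight, i, j in edges:
        ri, rj = root[i], root[j]
        if ri == rj:
            continue  # not a spanning-tree edge
        # This edge is the bottleneck for every pair across the two components.
        for a in members[ri]:
            for b in members[rj]:
                result[a][b] = result[b][a] = weight
        # Merge the smaller component into the larger one.
        if len(members[ri]) < len(members[rj]):
            ri, rj = rj, ri
        for b in members[rj]:
            root[b] = ri
        members[ri].extend(members.pop(rj))
    return result
```

For the one-dimensional points 0, 1, 2 and 10, the Minimax distance between 0 and 2 is 1 (the path through 1 never uses an edge longer than 1), while every point in the first cluster is at Minimax distance 8 from the point at 10.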


Survey on Automated Machine Learning

arXiv.org Artificial Intelligence

Machine learning has become a vital part of many aspects of our daily lives. However, building well-performing machine learning applications requires highly specialized data scientists and domain experts. Automated machine learning (AutoML) aims to reduce the demand for data scientists by enabling domain experts to automatically build machine learning applications without extensive knowledge of statistics and machine learning. In this survey, we summarize the recent developments in academia and industry regarding AutoML. First, we introduce a holistic problem formulation. Next, approaches for solving various subproblems of AutoML are presented. Finally, we provide an extensive empirical evaluation of the presented approaches on synthetic and real data.


AutoKGE: Searching Scoring Functions for Knowledge Graph Embedding

arXiv.org Machine Learning

Knowledge graph embedding (KGE) aims to find low dimensional vector representations of entities and relations so that their similarities can be quantified. Scoring functions (SFs), which are used to build a model to measure the similarity between entities based on a given relation, have become the crux of KGE. Many SFs have been hand-designed in the literature, and the evolution of SFs has become the primary driver of improvements in KGE performance. However, such improvements have gradually become marginal. Moreover, with so many SFs available, choosing properly among them has itself become a non-trivial problem. Inspired by the recent success of automated machine learning (AutoML), in this paper, we propose automated KGE (AutoKGE), to design and discover distinct SFs for KGE automatically. We first identify a unified representation over popularly used SFs, which helps to set up a search space for AutoKGE. Then, we propose a greedy algorithm, which is enhanced by a predictor to estimate the final performance without model training, to search through the space. Extensive experiments on benchmark datasets demonstrate the effectiveness and efficiency of our AutoKGE. Finally, the SFs found by our method are KG dependent, new to the literature, and outperform existing state-of-the-art SFs designed by humans.


Knowing When to Stop: Evaluation and Verification of Conformity to Output-size Specifications

arXiv.org Artificial Intelligence

Models such as Sequence-to-Sequence and Image-to-Sequence are widely used in real world applications. While the ability of these neural architectures to produce variable-length outputs makes them extremely effective for problems like Machine Translation and Image Captioning, it also leaves them vulnerable to failures in which the model produces outputs of undesirable length. This behavior can have severe consequences, such as increased computation and faults in downstream modules that expect outputs of a certain length. Motivated by the need to have a better understanding of the failures of these models, this paper proposes and studies the novel output-size modulation problem and makes two key technical contributions. First, to evaluate model robustness, we develop an easy-to-compute differentiable proxy objective that can be used with gradient-based algorithms to find output-lengthening inputs. Second and more importantly, we develop a verification approach that can formally verify whether a network always produces outputs within a certain length. Experimental results on Machine Translation and Image Captioning show that our output-lengthening approach can produce outputs that are 50 times longer than the input, while our verification approach can, given a model and input domain, prove that the output length is below a certain size.


A Novel Orthogonal Direction Mesh Adaptive Direct Search Approach for SVM Hyperparameter Tuning

arXiv.org Machine Learning

In this paper, we propose the use of a black-box optimization method called deterministic Mesh Adaptive Direct Search (MADS) algorithm with orthogonal directions (Ortho-MADS) for the selection of hyperparameters of Support Vector Machines with a Gaussian kernel. Unlike most methods in the literature, which exploit the properties of the data or attempt to maximize the accuracy on a validation dataset over the first quadrant of (C, gamma), Ortho-MADS comes with a convergence proof. We present the MADS, followed by the Ortho-MADS, the dynamic stopping criterion defined by the MADS mesh size and two different search strategies (Nelder-Mead and Variable Neighborhood Search) that contribute to a competitive convergence rate as well as a mechanism to escape from undesired local minima. We have investigated the practical selection of hyperparameters for the Support Vector Machine with a Gaussian kernel, i.e., the proper choice of the hyperparameters gamma (bandwidth) and C (trade-off), on several benchmark datasets. The experimental results have shown that the proposed approach for hyperparameter tuning consistently finds comparable or better solutions, when using a common configuration, than other methods. We have also evaluated the accuracy and the number of function evaluations of the Ortho-MADS with the Nelder-Mead search strategy and the Variable Neighborhood Search strategy using the mesh size as a stopping criterion, and we have achieved accuracy that no other method for hyperparameter optimization could reach.


Reducing The Search Space For Hyperparameter Optimization Using Group Sparsity

arXiv.org Machine Learning

We propose a new algorithm for hyperparameter selection in machine learning algorithms. The algorithm is a novel modification of Harmonica, a spectral hyperparameter selection approach using sparse recovery methods. In particular, we show that a special encoding of hyperparameter space enables a natural group-sparse recovery formulation, which when coupled with HyperBand (a multi-armed bandit strategy) leads to improvement over existing hyperparameter optimization methods such as Successive Halving and Random Search. Experimental results on image datasets such as CIFAR-10 confirm the benefits of our approach.
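
For context, Successive Halving, one of the baselines named above, can be sketched in a few lines. This is a generic illustration and not the proposed algorithm: the function name, the `evaluate(config, budget)` callback (assumed to return a loss where lower is better), and the default halving factor `eta=2` are all assumptions made for this example.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Repeatedly evaluate the surviving configurations, keep the best
    1/eta fraction, and multiply the per-configuration budget by eta."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Rank the survivors by loss at the current budget (lower is better).
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]
        budget *= eta
    return survivors[0]
```

With a toy loss such as `abs(c - 0.3) / budget` over the candidates `[0.1, 0.5, 0.9, 0.3]`, the procedure discards the worse half at each round and returns `0.3`.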


Lipschitz Bandit Optimization with Improved Efficiency

arXiv.org Artificial Intelligence

We consider the Lipschitz bandit optimization problem with an emphasis on practical efficiency. Although there is rich literature on regret analysis of this type of problem, e.g., [Kleinberg et al. 2008, Bubeck et al. 2011, Slivkins 2014], their proposed algorithms suffer from serious practical problems including extreme time complexity and dependence on oracle implementations. With this motivation, we propose a novel algorithm with an Upper Confidence Bound (UCB) exploration, namely Tree UCB-Hoeffding, using adaptive partitions. Our partitioning scheme is easy to implement and does not require any oracle settings. With a tree-based search strategy, the total computational cost can be improved to $\mathcal{O}(T\log T)$ for the first $T$ iterations. In addition, our algorithm achieves the regret lower bound up to a logarithmic factor.


The Commute Trip Sharing Problem

arXiv.org Artificial Intelligence

Parking pressure has been steadily increasing in cities as well as in university and corporate campuses. To relieve this pressure, this paper studies a car-pooling platform that would match riders and drivers, while guaranteeing a ride back and exploiting spatial and temporal locality. In particular, the paper formalizes the Commute Trip Sharing Problem (CTSP) to find a routing plan that maximizes ride sharing for a set of commute trips. The CTSP is a generalization of the vehicle routing problem with routes that satisfy time window, capacity, pairing, precedence, ride duration, and driver constraints. The paper introduces two exact algorithms for the CTSP: a route-enumeration algorithm and a branch-and-price algorithm. Experimental results show that, on a high-fidelity, real-world dataset of commute trips from a mid-size city, both algorithms optimally solve small and medium-sized problems and produce high-quality solutions for larger problem instances. The results show that car pooling, if widely adopted, has the potential to reduce vehicle usage by up to 57% and decrease vehicle miles traveled by up to 46% while only incurring a 22% increase in average ride time per commuter for the trips considered.


Integer Programming for Learning Directed Acyclic Graphs from Continuous Data

arXiv.org Machine Learning

Learning directed acyclic graphs (DAGs) from data is a challenging task both in theory and in practice, because the number of possible DAGs scales superexponentially with the number of nodes. In this paper, we study the problem of learning an optimal DAG from continuous observational data. We cast this problem in the form of a mathematical programming model which can naturally incorporate a super-structure in order to reduce the set of possible candidate DAGs. We use the penalized negative log-likelihood score function with both $\ell_0$ and $\ell_1$ regularizations and propose a new mixed-integer quadratic optimization (MIQO) model, referred to as a layered network (LN) formulation. The LN formulation is a compact model, which enjoys as tight an optimal continuous relaxation value as the stronger but larger formulations under a mild condition. Computational results indicate that the proposed formulation outperforms existing mathematical formulations and scales better than available algorithms that can solve the same problem with only $\ell_1$ regularization. In particular, the LN formulation clearly outperforms existing methods in terms of computational time needed to find an optimal DAG in the presence of a sparse super-structure.