 Supervised Learning


Weighting vectors for machine learning: numerical harmonic analysis applied to boundary detection

arXiv.org Machine Learning

Metric space magnitude, an active field of research in algebraic topology, is a scalar quantity that summarizes the effective number of distinct points that live in a general metric space. The weighting vector is a closely related concept that captures, in a nontrivial way, much of the underlying geometry of the original metric space. Recent work has demonstrated that when the metric space is Euclidean, the weighting vector serves as an effective tool for boundary detection. We recast this result and show the weighting vector may be viewed as a solution to a kernelized SVM. As one consequence, we apply this new insight to the task of outlier detection, and we demonstrate performance that is competitive with, or exceeds, that of state-of-the-art techniques on benchmark data sets. Under mild assumptions, we show the weighting vector, which has the computational cost of matrix inversion, can be efficiently approximated in linear time. We show how nearest neighbor methods can approximate solutions to the minimization problems defined by SVMs.
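The weighting vector mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard definition from the magnitude literature, which the abstract itself does not state: for points x_1, ..., x_n, build the similarity matrix Z with entries exp(-t d(x_i, x_j)) and solve Z w = 1; the magnitude is then the sum of the entries of w. The function name `weighting_vector`, the `scale` parameter (playing the role of t), and the five collinear sample points are hypothetical, and the small Gauss-Jordan solver is only a stand-in for a linear algebra library.

```python
import math

def weighting_vector(points, scale=1.0):
    """Solve Z w = 1, where Z[i][j] = exp(-scale * ||x_i - x_j||)."""
    n = len(points)
    # Augmented matrix [Z | 1]; the last column is the right-hand side.
    aug = [
        [math.exp(-scale * math.dist(p, q)) for q in points] + [1.0]
        for p in points
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

# Five equally spaced points on a line (an assumed toy data set).
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
w = weighting_vector(points)
magnitude = sum(w)
```

On this toy example the two endpoints receive a weight of about 0.73 and the interior points about 0.46, which is consistent with the boundary-detection behaviour the abstract describes for Euclidean data.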


Open-world Machine Learning: Applications, Challenges, and Opportunities

arXiv.org Artificial Intelligence

Traditional machine learning, especially supervised learning, follows the closed-world assumption, i.e., for each testing class a training class is available. However, such models fail to identify classes that were not available at training time. These can be referred to as unseen classes. Open-world machine learning, by contrast, deals with arbitrary inputs (data with unseen classes) to machine learning systems. Moreover, traditional machine learning is static, which is not appropriate for an active environment where the perspective, sources, and/or volume of data are changing rapidly. In this paper, we first present an overview of open-world learning and its importance in real-world contexts. Next, different dimensions of open-world learning are explored and discussed. The area of open-world learning has gained the attention of the research community only in the last decade. We have searched through different online digital libraries and scrutinized the work done in the last decade. This paper presents a systematic review of various techniques for open-world machine learning. It also presents the research gaps, challenges, and future directions in open-world learning. This paper will help researchers to understand the developments in open-world learning and the opportunities to extend the research in suitable areas. It will also help them select applicable methodologies and datasets to explore this further.


Learning to Bridge Metric Spaces: Few-shot Joint Learning of Intent Detection and Slot Filling

arXiv.org Artificial Intelligence

In this paper, we investigate few-shot joint learning for dialogue language understanding. Most existing few-shot models learn a single task each time with only a few examples. However, dialogue language understanding contains two closely related tasks, i.e., intent detection and slot filling, and often benefits from jointly learning the two tasks. This calls for new few-shot learning techniques that are able to capture task relations from only a few examples and jointly learn multiple tasks. To achieve this, we propose a similarity-based few-shot learning scheme, named Contrastive Prototype Merging network (ConProm), that learns to bridge the metric spaces of intent and slot on data-rich domains, and then adapts the bridged metric space to the specific few-shot domain. Experiments on two public datasets, Snips and FewJoint, show that our model significantly outperforms the strong baselines in the one-shot and five-shot settings.


A Deep Metric Learning Approach to Account Linking

arXiv.org Artificial Intelligence

We consider the task of linking social media accounts that belong to the same author in an automated fashion on the basis of the content and metadata of their corresponding document streams. We focus on learning an embedding that maps variable-sized samples of user activity -- ranging from single posts to entire months of activity -- to a vector space, where samples by the same author map to nearby points. The approach does not require human-annotated data for training purposes, which allows us to leverage large amounts of social media content. The proposed model outperforms several competitive baselines under a novel evaluation framework modeled after established recognition benchmarks in other domains. Our method achieves high linking accuracy, even with small samples from accounts not seen at training time, a prerequisite for practical applications of the proposed linking framework.


Learning Weakly Convex Sets in Metric Spaces

arXiv.org Artificial Intelligence

We introduce the notion of weak convexity in metric spaces, a generalization of ordinary convexity commonly used in machine learning. It is shown that weakly convex sets can be characterized by a closure operator and have a unique decomposition into a set of pairwise disjoint connected blocks. We give two generic efficient algorithms, an extensional and an intensional one, for learning weakly convex concepts and study their formal properties. Our experimental results concerning vertex classification clearly demonstrate the excellent predictive performance of the extensional algorithm. Two non-trivial applications of the intensional algorithm to polynomial PAC-learnability are presented. The first one deals with learning $k$-convex Boolean functions, which are already known to be efficiently PAC-learnable. It is shown how to derive this positive result in a fairly easy way by the generic intensional algorithm. The second one is concerned with the Euclidean space equipped with the Manhattan distance. For this metric space, weakly convex sets are unions of pairwise disjoint axis-aligned hyperrectangles. We show that a weakly convex set that is consistent with a set of examples and contains a minimum number of hyperrectangles can be found in polynomial time. In contrast, this problem is known to be NP-complete if the hyperrectangles may be overlapping.


Approximate Fréchet Mean for Data Sets of Sparse Graphs

arXiv.org Machine Learning

To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that is adapted to metric spaces, since graph sets are not Euclidean spaces. A standard approach is to consider the Fréchet mean. In this work, we equip a set of graphs with the pseudometric defined by the $\ell_2$ norm between the eigenvalues of their respective adjacency matrices. Unlike the edit distance, this pseudometric reveals structural changes at multiple scales, and is well adapted to studying various statistical problems on sets of graphs. We describe an algorithm to compute an approximation to the Fréchet mean of a set of undirected unweighted graphs with a fixed size.


Efficient Non-Sampling Knowledge Graph Embedding

arXiv.org Artificial Intelligence

A knowledge graph (KG) is a flexible structure that is able to describe the complex relationships between data entities. Currently, most KG embedding models are trained based on negative sampling, i.e., the model aims to maximize some similarity of the connected entities in the KG, while minimizing the similarity of the sampled disconnected entities. Negative sampling helps to reduce the time complexity of model learning by only considering a subset of negative instances, which may fail to deliver stable model performance due to the uncertainty in the sampling procedure. To avoid this deficiency, we propose a new framework for KG embedding -- Efficient Non-Sampling Knowledge Graph Embedding (NS-KGE). The basic idea is to consider all of the negative instances in the KG for model learning, and thus to avoid negative sampling. The framework can be applied to square-loss based knowledge graph embedding models or models whose loss can be converted to a square loss. A natural side-effect of this non-sampling strategy is the increased computational complexity of model learning. To solve the problem, we leverage mathematical derivations to reduce the complexity of the non-sampling loss function, which eventually provides us both better efficiency and better accuracy in KG embedding compared with existing models. Experiments on benchmark datasets show that our NS-KGE framework achieves better efficiency and accuracy than traditional negative sampling based models, and that the framework is applicable to a large class of knowledge graph embedding models.


Efficient Retrieval of Matrix Factorization-Based Top-k Recommendations: A Survey of Recent Approaches

Journal of Artificial Intelligence Research

Top-k recommendation seeks to deliver a personalized list of k items to each individual user. An established methodology in the literature based on matrix factorization (MF), which usually represents users and items as vectors in low-dimensional space, is an effective approach to recommender systems, thanks to its superior performance in terms of recommendation quality and scalability. A typical matrix factorization recommender system has two main phases: preference elicitation and recommendation retrieval. The former analyzes user-generated data to learn user preferences and item characteristics in the form of latent feature vectors, whereas the latter ranks the candidate items based on the learnt vectors and returns the top-k items from the ranked list. For preference elicitation, there have been numerous works to build accurate MF-based recommendation algorithms that can learn from large datasets. However, for the recommendation retrieval phase, naively scanning a large number of items to identify the few most relevant ones may inhibit truly real-time applications. In this work, we survey recent advances and state-of-the-art approaches in the literature that enable fast and accurate retrieval for MF-based personalized recommendations. Also, we include analytical discussions of approaches along different dimensions to provide the readers with a more comprehensive understanding of the surveyed works.
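The naive retrieval phase described in this abstract, which scans every item and keeps the k best, can be sketched as follows. The sketch assumes that an item's relevance is the inner product of the user and item latent vectors, a common choice in matrix factorization that the abstract does not state explicitly. The function name `top_k_items`, the item identifiers, and the vectors are hypothetical.

```python
import heapq

def top_k_items(user_vec, item_vecs, k):
    """Score every item by inner product and return the k best item ids."""
    scores = (
        (sum(u * v for u, v in zip(user_vec, vec)), item_id)
        for item_id, vec in item_vecs.items()
    )
    # nlargest keeps only k candidates in memory but still visits every item.
    return [item_id for _, item_id in heapq.nlargest(k, scores)]

# Assumed toy latent vectors for one user and four items.
user = [1.0, 0.5]
items = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
    "d": [-1.0, 0.0],
}
recommended = top_k_items(user, items, k=2)
```

The cost grows linearly with the number of items for every query, which is the bottleneck the surveyed retrieval methods aim to avoid.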


Aggregation over Metric Spaces: Proposing and Voting in Elections, Budgeting, and Legislation

Journal of Artificial Intelligence Research

We present a unifying framework encompassing a plethora of social choice settings. Viewing each social choice setting as voting in a suitable metric space, we offer a general model of social choice over metric spaces, in which--similarly to the spatial model of elections--each voter specifies an ideal element of the metric space. The ideal element acts as a vote, where each voter prefers elements that are closer to her ideal element. But it also acts as a proposal, thus making all participants equal not only as voters but also as proposers. We consider Condorcet aggregation and a continuum of solution concepts, ranging from minimizing the sum of distances to minimizing the maximum distance. We study applications of our abstract model to various social choice settings, including single-winner elections, committee elections, participatory budgeting, and participatory legislation. For each setting, we compare each solution concept to known voting rules and study various properties of the resulting voting rules. Our framework provides expressive aggregation for a broad range of social choice settings while remaining simple for voters, and may enable a unified and integrated implementation for all these settings, as well as unified extensions such as sybil-resiliency, proxy voting, and deliberative decision making.
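The two ends of the continuum of solution concepts named in this abstract, minimizing the sum of distances and minimizing the maximum distance, can be illustrated with a small sketch. It assumes that the candidate outcomes are exactly the voters' own ideal elements (following the abstract's remark that an ideal element also acts as a proposal) and that the metric space is the real line with absolute difference. The function name `aggregate` and the sample ideals are hypothetical.

```python
def aggregate(ideals, distance, cost):
    """Pick the proposed element whose distances to all ideals have the lowest cost."""
    return min(ideals, key=lambda c: cost([distance(c, v) for v in ideals]))

def line_distance(a, b):
    # Metric on the real line (an assumption for this toy example).
    return abs(a - b)

# Assumed ideal points of five voters.
ideals = [0, 1, 1, 4, 10]

min_sum_winner = aggregate(ideals, line_distance, sum)  # minimizes total distance
min_max_winner = aggregate(ideals, line_distance, max)  # minimizes the worst-off voter's distance
```

Here the sum rule selects 1 (total distance 13) while the maximum rule selects 4 (largest distance 6), showing how the two extremes can disagree on the same profile.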


Online learning with exponential weights in metric spaces

arXiv.org Machine Learning

The problem of online convex optimization (Cesa-Bianchi and Lugosi, 2006; Shalev-Shwartz, 2012; Hazan, 2016) has become a standard model of online learning. Its simple and flexible formulation as a repeated game, devoid of distributional assumptions on the data, has proven effective in framing theoretically a number of online prediction tasks, including online recommendation systems, online portfolio selection, and network routing problems. While the problem has traditionally been studied in the context of Euclidean spaces, less seems to be known when the decision space is a more general metric space, with potentially no linear structure. In this paper, we extend the analysis of the exponentially weighted average (ewa) forecaster to some geodesic metric spaces. Motivations for this level of generality arise, for example, when the decision space is a smooth manifold. Such a scenario is routinely encountered in directional or shape statistics (Mardia, 1999), where observations take values in spheres, projective spaces or shape spaces.
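For orientation, the classical exponentially weighted average forecaster that this abstract generalizes can be sketched in its familiar finite-expert, real-valued form. This is not the geodesic version studied in the paper: the weighted average below relies on the linear structure that general metric spaces lack. The squared loss, the learning rate `eta`, the two constant experts, and all names are assumptions made for illustration.

```python
import math

def ewa_weights(cumulative_losses, eta):
    """Weights proportional to exp(-eta * cumulative loss), normalized to sum to 1."""
    best = min(cumulative_losses)  # shift by the minimum for numerical stability
    unnormalized = [math.exp(-eta * (loss - best)) for loss in cumulative_losses]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Two assumed experts that always predict 0.0 and 1.0; the outcome is always 1.0.
expert_predictions = [0.0, 1.0]
outcomes = [1.0, 1.0, 1.0]
eta = 1.0

cumulative = [0.0, 0.0]
forecasts = []
for y in outcomes:
    weights = ewa_weights(cumulative, eta)
    # Forecast is the weighted average of the expert predictions.
    forecasts.append(sum(w * p for w, p in zip(weights, expert_predictions)))
    # Update each expert's cumulative squared loss after seeing the outcome.
    cumulative = [c + (p - y) ** 2 for c, p in zip(cumulative, expert_predictions)]

final_weights = ewa_weights(cumulative, eta)
```

The forecast starts at 0.5 and moves toward the better expert as its relative cumulative loss shrinks, which is the behaviour the regret analysis of the forecaster formalizes.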