Goto

Collaborating Authors

 Country


Theoretical and Experimental Analyses of Tensor-Based Regression and Classification

arXiv.org Machine Learning

We theoretically and experimentally investigate tensor-based regression and classification. Our focus is regularization with various tensor norms, including the overlapped trace norm, the latent trace norm, and the scaled latent trace norm. We first give dual optimization methods using the alternating direction method of multipliers, which is computationally efficient when the number of training samples is moderate. We then theoretically derive an excess risk bound for each tensor norm and clarify their behavior. Finally, we perform extensive experiments using simulated and real data and demonstrate the superiority of tensor-based learning methods over vector- and matrix-based learning methods.


Simultaneous Clustering and Model Selection for Multinomial Distribution: A Comparative Study

arXiv.org Machine Learning

In this paper, we study different discrete data clustering methods, which use the Model-Based Clustering (MBC) framework with the Multinomial distribution. Our study comprises several relevant issues, such as initialization, model estimation and model selection. Additionally, we propose a novel MBC method by efficiently combining the partitional and hierarchical clustering techniques. We conduct experiments on both synthetic and real data and evaluate the methods using accuracy, stability and computation time. Our study identifies appropriate strategies to be used for discrete data analysis with the MBC methods. Moreover, our proposed method is very competitive w.r.t. clustering accuracy and better w.r.t. stability and computation time.


On Graphical Models via Univariate Exponential Family Distributions

arXiv.org Machine Learning

Undirected graphical models, or Markov networks, are a popular class of statistical models, used in a wide variety of applications. Popular instances of this class include Gaussian graphical models and Ising models. In many settings, however, it might not be clear which subclass of graphical models to use, particularly for non-Gaussian and non-categorical data. In this paper, we consider a general sub-class of graphical models where the node-wise conditional distributions arise from exponential families. This allows us to derive multivariate graphical model distributions from univariate exponential family distributions, such as the Poisson, negative binomial, and exponential distributions. Our key contributions include a class of M-estimators to fit these graphical model distributions; and rigorous statistical analysis showing that these M-estimators recover the true graphical model structure exactly, with high probability. We provide examples of genomic and proteomic networks learned via instances of our class of graphical models derived from Poisson and exponential distributions.


Ascribing Consciousness to Artificial Intelligence

arXiv.org Artificial Intelligence

This paper critically assesses the anti-functionalist stance on consciousness adopted by certain advocates of integrated information theory (IIT), a corollary of which is that human-level artificial intelligence implemented on conventional computing hardware is necessarily not conscious. The critique draws on variations of a well-known gradual neuronal replacement thought experiment, as well as bringing out tensions in IIT's treatment of self-knowledge. The aim, though, is neither to reject IIT outright nor to champion functionalism in particular. Rather, it is suggested that both ideas have something to offer a scientific understanding of consciousness, as long as they are not dressed up as solutions to illusory metaphysical problems. As for human-level AI, we must await its development before we can decide whether or not to ascribe consciousness to it.


Particle approximations of the score and observed information matrix for parameter estimation in state space models with linear computational cost

arXiv.org Machine Learning

Poyiadjis et al. (2011) show how particle methods can be used to estimate both the score and the observed information matrix for state space models. These methods either suffer from a computational cost that is quadratic in the number of particles, or produce estimates whose variance increases quadratically with the amount of data. This paper introduces an alternative approach for estimating these terms at a computational cost that is linear in the number of particles. The method is derived using a combination of kernel density estimation, to avoid the particle degeneracy that causes the quadratically increasing variance, and Rao-Blackwellisation. Crucially, we show the method is robust to the choice of bandwidth within the kernel density estimation, as it has good asymptotic properties regardless of this choice. Our estimates of the score and observed information matrix can be used within both online and batch procedures for estimating parameters for state space models. Empirical results show improved parameter estimates compared to existing methods at a significantly reduced computational cost. Supplementary materials including code are available.


Predicting SLA Violations in Real Time using Online Machine Learning

arXiv.org Machine Learning

Detecting faults and SLA violations in a timely manner is critical for telecom providers, in order to avoid loss in business, revenue and reputation. At the same time predicting SLA violations for user services in telecom environments is difficult, due to time-varying user demands and infrastructure load conditions. In this paper, we propose a service-agnostic online learning approach, whereby the behavior of the system is learned on the fly, in order to predict client-side SLA violations. The approach uses device-level metrics, which are collected in a streaming fashion on the server side. Our results show that the approach can produce highly accurate predictions (>90% classification accuracy and < 10% false alarm rate) in scenarios where SLA violations are predicted for a video-on-demand service under changing load patterns. The paper also highlight the limitations of traditional offline learning methods, which perform significantly worse in many of the considered scenarios.


Quantization based Fast Inner Product Search

arXiv.org Machine Learning

We propose a quantization based approach for fast approximate Maximum Inner Product Search (MIPS). Each database vector is quantized in multiple subspaces via a set of codebooks, learned directly by minimizing the inner product quantization error. Then, the inner product of a query to a database vector is approximated as the sum of inner products with the subspace quantizers. Different from recently proposed LSH approaches to MIPS, the database vectors and queries do not need to be augmented in a higher dimensional feature space. We also provide a theoretical analysis of the proposed approach, consisting of the concentration results under mild assumptions. Furthermore, if a small sample of example queries is given at the training time, we propose a modified codebook learning procedure which further improves the accuracy. Experimental results on a variety of datasets including those arising from deep neural networks show that the proposed approach significantly outperforms the existing state-of-the-art.


Non-normal modalities in variants of Linear Logic

arXiv.org Artificial Intelligence

This article presents modal versions of resource-conscious logics. We concentrate on extensions of variants of Linear Logic with one minimal non-normal modality. In earlier work, where we investigated agency in multi-agent systems, we have shown that the results scale up to logics with multiple non-minimal modalities. Here, we start with the language of propositional intuitionistic Linear Logic without the additive disjunction, to which we add a modality. We provide an interpretation of this language on a class of Kripke resource models extended with a neighbourhood function: modal Kripke resource models. We propose a Hilbert-style axiomatization and a Gentzen-style sequent calculus. We show that the proof theories are sound and complete with respect to the class of modal Kripke resource models. We show that the sequent calculus admits cut elimination and that proof-search is in PSPACE. We then show how to extend the results when non-commutative connectives are added to the language. Finally, we put the logical framework to use by instantiating it as logics of agency. In particular, we propose a logic to reason about the resource-sensitive use of artefacts and illustrate it with a variety of examples.


Community Detection in Networks with Node Features

arXiv.org Machine Learning

Many methods have been proposed for community detection in networks, but most of them do not take into account additional information on the nodes that is often available in practice. In this paper, we propose a new joint community detection criterion that uses both the network edge information and the node features to detect community structures. One advantage our method has over existing joint detection approaches is the flexibility of learning the impact of different features which may differ across communities. Another advantage is the flexibility of choosing the amount of influence the feature information has on communities. The method is asymptotically consistent under the block model with additional assumptions on the feature distributions, and performs well on simulated and real networks.


Semi-described and semi-supervised learning with Gaussian processes

arXiv.org Machine Learning

Propagating input uncertainty through non-linear Gaussian process (GP) mappings is intractable. This hinders the task of training GPs using uncertain and partially observed inputs. In this paper we refer to this task as "semi-described learning". We then introduce a GP framework that solves both, the semi-described and the semi-supervised learning problems (where missing values occur in the outputs). Auto-regressive state space simulation is also recognised as a special case of semi-described learning. To achieve our goal we develop variational methods for handling semi-described inputs in GPs, and couple them with algorithms that allow for imputing the missing values while treating the uncertainty in a principled, Bayesian manner. Extensive experiments on simulated and real-world data study the problems of iterative forecasting and regression/classification with missing values. The results suggest that the principled propagation of uncertainty stemming from our framework can significantly improve performance in these tasks.