Goto

Collaborating Authors

 Bayesian Learning


The Application of Machine Learning Techniques for Predicting Results in Team Sport: A Review

arXiv.org Machine Learning

Over the past two decades, Machine Learning (ML) techniques have been increasingly utilized for the purpose of predicting outcomes in sport. In this paper, we provide a review of studies that have used ML for predicting results in team sport, covering studies from 1996 to 2019. We sought to answer five key research questions while extensively surveying papers in this field. This paper offers insights into which ML algorithms have tended to be used in this field, as well as those that are beginning to emerge with successful outcomes. Our research highlights defining characteristics of successful studies and identifies robust strategies for evaluating accuracy results in this application domain. Our study considers accuracies that have been achieved across different sports and explores the notion that outcomes of some team sports could be inherently more difficult to predict than others. Finally, our study uncovers common themes of future research directions across all surveyed papers, looking for gaps and opportunities, while proposing recommendations for future researchers in this domain.


Mining User Behaviour from Smartphone data, a literature review

arXiv.org Machine Learning

To study users' travel behaviour and travel time between origin and destination, researchers employ travel surveys. Although there is consensus in the field about the potential, after over ten years of research and field experimentation, Smartphone-based travel surveys still did not take off to a large scale. Here, computer intelligence algorithms take the role that operators have in Traditional Travel Surveys; since we train each algorithm on data, performances rest on the data quality, thus on the ground truth. Inaccurate validations affect negatively: labels, algorithms' training, travel diaries precision, and therefore data validation, within a very critical loop. Interestingly, boundaries are proven burdensome to push even for Machine Learning methods. To support optimal investment decisions for practitioners, we expose the drivers they should consider when assessing what they need against what they get. This paper highlights and examines the critical aspects of the underlying research and provides some recommendations: (i) from the device perspective, on the main physical limitations; (ii) from the application perspective, the methodological framework deployed for the automatic generation of travel diaries; (iii)from the ground truth perspective, the relationship between user interaction, methods, and data.


Universal Inference Using the Split Likelihood Ratio Test

arXiv.org Machine Learning

We propose a general method for constructing hypothesis tests and confidence sets that have finite sample guarantees without regularity conditions. We refer to such procedures as ``universal.'' The method is very simple and is based on a modified version of the usual likelihood ratio statistic, that we call ``the split likelihood ratio test'' (split LRT). The method is especially appealing for irregular statistical models. Canonical examples include mixture models and models that arise in shape-constrained inference. %mixture models and shape-constrained models are just two examples. Constructing tests and confidence sets for such models is notoriously difficult. Typical inference methods, like the likelihood ratio test, are not useful in these cases because they have intractable limiting distributions. In contrast, the method we suggest works for any parametric model and also for some nonparametric models. The split LRT can also be used with profile likelihoods to deal with nuisance parameters, and it can also be run sequentially to yield anytime-valid $p$-values and confidence sequences.


Attention-Aware Answers of the Crowd

arXiv.org Machine Learning

Crowdsourcing is a relatively economic and efficient solution to collect annotations from the crowd through online platforms. Answers collected from workers with different expertise may be noisy and unreliable, and the quality of annotated data needs to be further maintained. Various solutions have been attempted to obtain high-quality annotations. However, they all assume that workers' label quality is stable over time (always at the same level whenever they conduct the tasks). In practice, workers' attention level changes over time, and the ignorance of which can affect the reliability of the annotations. In this paper, we focus on a novel and realistic crowdsourcing scenario involving attention-aware annotations. We propose a new probabilistic model that takes into account workers' attention to estimate the label quality. Expectation propagation is adopted for efficient Bayesian inference of our model, and a generalized Expectation Maximization algorithm is derived to estimate both the ground truth of all tasks and the label-quality of each individual crowd worker with attention. In addition, the number of tasks best suited for a worker is estimated according to changes in attention. Experiments against related methods on three real-world and one semi-simulated datasets demonstrate that our method quantifies the relationship between workers' attention and label-quality on the given tasks, and improves the aggregated labels.


The Temporal Dynamics of Belief-based Updating of Epistemic Trust: Light at the End of the Tunnel?

arXiv.org Artificial Intelligence

We start with the distinction of outcome- and belief-based Bayesian models of the sequential update of agents' beliefs and subjective reliability of sources (trust). We then focus on discussing the influential Bayesian model of belief-based trust update by Eric Olsson, which models dichotomic events and explicitly represents anti-reliability. After sketching some disastrous recent results for this perhaps most promising model of belief update, we show new simulation results for the temporal dynamics of learning belief with and without trust update and with and without communication. The results seem to shed at least a somewhat more positive light on the communicating-and-trust-updating agents. This may be a light at the end of the tunnel of belief-based models of trust updating, but the interpretation of the clear findings is much less clear.


Sparse Polynomial Chaos expansions using Variational Relevance Vector Machines

arXiv.org Machine Learning

These challenges can be addressed by enforcing sparsity in the series representation through retaining only the most important basis terms. In this work, we present a novel sparse Bayesian learning technique for obtaining sparse Polynomial Chaos expansions which is based on a Relevance Vector Machine model and is trained using Variational Inference. The methodology shows great potential in high-dimensional data-driven settings using relatively few data points and achieves user-controlled sparse levels that are comparable to other methods such as compressive sensing. The proposed approach is illustrated on two numerical examples, a synthetic response function that is explored for validation purposes and a low-carbon steel plate with random Young's modulus and random loading, which is modelled by stochastic finite element with 38 input random variables.


Privacy Attacks on Network Embeddings

arXiv.org Machine Learning

Data ownership and data protection are increasingly important topics with ethical and legal implications, e.g., with the right to erasure established in the European General Data Protection Regulation (GDPR). In this light, we investigate network embeddings, i.e., the representation of network nodes as low-dimensional vectors. We consider a typical social network scenario with nodes representing users and edges relationships between them. We assume that a network embedding of the nodes has been trained. After that, a user demands the removal of his data, requiring the full deletion of the corresponding network information, in particular the corresponding node and incident edges. In that setting, we analyze whether after the removal of the node from the network and the deletion of the vector representation of the respective node in the embedding significant information about the link structure of the removed node is still encoded in the embedding vectors of the remaining nodes. This would require a (potentially computationally expensive) retraining of the embedding. For that purpose, we deploy an attack that leverages information from the remaining network and embedding to recover information about the neighbors of the removed node. The attack is based on (i) measuring distance changes in network embeddings and (ii) a machine learning classifier that is trained on networks that are constructed by removing additional nodes. Our experiments demonstrate that substantial information about the edges of a removed node/user can be retrieved across many different datasets. This implies that to fully protect the privacy of users, node deletion requires complete retraining - or at least a significant modification - of original network embeddings. Our results suggest that deleting the corresponding vector representation from network embeddings alone is not sufficient from a privacy perspective.


Tensor Basis Gaussian Process Models of Hyperelastic Materials

arXiv.org Machine Learning

In this work, we develop Gaussian process regression (GPR) models of hyperelastic material behavior. First, we consider the direct approach of modeling the components of the Cauchy stress tensor as a function of the components of the Finger stretch tensor in a Gaussian process. We then consider an improvement on this approach that embeds rotational invariance of the stress-stretch constitutive relation in the GPR representation. This approach requires fewer training examples and achieves higher accuracy while maintaining invariance to rotations exactly. Finally, we consider an approach that recovers the strain-energy density function and derives the stress tensor from this potential. Although the error of this model for predicting the stress tensor is higher, the strain-energy density is recovered with high accuracy from limited training data. The approaches presented here are examples of physics-informed machine learning. They go beyond purely data-driven approaches by embedding the physical system constraints directly into the Gaussian process representation of materials models.


An improper estimator with optimal excess risk in misspecified density estimation and logistic regression

arXiv.org Machine Learning

We introduce a procedure for predictive conditional density estimation under logarithmic loss, which we call SMP (Sample Minmax Predictor). This predictor minimizes a new general excess risk bound, which critically remains valid under model misspecification. On standard examples, this bound scales as $d/n$ where $d$ is the dimension of the model and $n$ the sample size, regardless of the true distribution. The SMP, which is an improper (out-of-model) procedure, improves over proper (within-model) estimators (such as the maximum likelihood estimator), whose excess risk can degrade arbitrarily in the misspecified case. For density estimation, our bounds improve over approaches based on online-to-batch conversion, by removing suboptimal $\log n$ factors, addressing an open problem from Gr{\"u}nwald and Kot{\l}owski (2011) for the considered models. For the Gaussian linear model, the SMP admits an explicit expression, and its expected excess risk in the general misspecified case is at most twice the minimax excess risk in the \emph{well-specified case}, but without any condition on the noise variance or approximation error of the linear model. For logistic regression, a penalized SMP can be computed efficiently by training two logistic regressions, and achieves a non-asymptotic excess risk of $O((d + B^2R^2)/n)$, where $R$ is a bound on the norm of the features and $B$ the norm of the comparison linear predictor. This improves the rates of proper (within-model) estimators, since such procedures can achieve no better rate than $\min(BR/\sqrt{n},de^{BR}/n)$ in general. This also provides a computationally more efficient alternative to approaches based on online-to-batch conversion of Bayesian mixture procedures, which require approximate posterior sampling, thereby partly answering a question by Foster et al. (2018).


Approximation of Reeb spaces with Mappers and Applications to Stochastic Filters

arXiv.org Machine Learning

Reeb spaces, as well as their discretized versions called Mappers, are common descriptors used in Topological Data Analysis, with plenty of applications in various fields of science, such as computational biology and data visualization, among others. The stability and quantification of the rate of convergence of the Mapper to the Reeb space has been studied a lot in recent works~\cite{Brown2019, Carriere2018a, Carriere2018, Munch2016}, focusing on the case where a scalar-valued filter is used for the computation of Mapper. On the other hand, much less is known in the multivariate case, where the domain of the filter is in $\mathbb R^d$ instead of $\mathbb R$. The only available result in this setting~\cite{Munch2016} only works for topological spaces and cannot be used as is for finite metric spaces representing data, such as point clouds and distance matrices. In this article, we present an approximation result for the Reeb space in the multivariate case using a Mapper-based estimator, which is a slight modification of the usual Mapper construction. Moreover, our approximation is stated with respect to a pseudometric that is an extension of the usual {\em interleaving distance} between persistence modules~\cite{Chazal2016}. Finally, we apply our results to the case where the filter function used to compute the Mapper is estimated from the data. We provide applications of this setting in statistics and machine learning and probability for different kinds of target filters, as well as numerical experiments that demonstrate the relevance of our approach.