 Accuracy


Scan $B$-Statistic for Kernel Change-Point Detection

arXiv.org Machine Learning

Detecting the emergence of an abrupt change-point is a classic problem in statistics and machine learning. Kernel-based nonparametric statistics have been used for this task; they require fewer assumptions on the distributions than parametric approaches and can handle high-dimensional data. In this paper we focus on the scenario when the amount of background data is large, and propose two related computationally efficient kernel-based statistics for change-point detection, which are inspired by the recently developed $B$-statistics. A novel theoretical result of the paper is the characterization of the tail probability of these statistics using the change-of-measure technique, which focuses on characterizing the tail of the detection statistics rather than obtaining their asymptotic distribution under the null distribution. Such approximations are crucial to control the false alarm rate, which corresponds to the significance level in offline change-point detection and the average run length in online change-point detection. Our approximations are shown to be highly accurate. Thus, they provide a convenient way to find detection thresholds for both offline and online cases without the need to resort to the more expensive simulations or bootstrapping. We show that our methods perform well on both synthetic data and real data.
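As a rough illustration of the block-averaging idea behind $B$-statistics, the sketch below averages an unbiased squared-MMD estimate between one test block and several equally sized blocks of background data. It is a simplified sketch, not the paper's statistic: the Gaussian kernel, the bandwidth, the block size, the function names, and the synthetic data are all assumptions made for illustration, and the variance normalization and threshold approximation described in the abstract are omitted.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased squared-MMD estimate for two samples of equal size n >= 2."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("samples must have equal size of at least 2")
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += (
                    gaussian_kernel(xs[i], xs[j], bandwidth)
                    + gaussian_kernel(ys[i], ys[j], bandwidth)
                    - gaussian_kernel(xs[i], ys[j], bandwidth)
                    - gaussian_kernel(xs[j], ys[i], bandwidth)
                )
    return total / (n * (n - 1))


def block_averaged_mmd2(reference, test_block, bandwidth=1.0):
    """Average the squared MMD between test_block and disjoint reference blocks."""
    block_size = len(test_block)
    n_blocks = len(reference) // block_size
    if n_blocks == 0:
        raise ValueError("reference data is shorter than one block")
    values = [
        mmd2_unbiased(
            reference[b * block_size:(b + 1) * block_size], test_block, bandwidth
        )
        for b in range(n_blocks)
    ]
    return sum(values) / n_blocks


# Hypothetical synthetic data: background from N(0, 1), one test block with
# no change and one test block whose mean has shifted to 3.
rng = random.Random(0)
reference = [(rng.gauss(0.0, 1.0),) for _ in range(200)]
same_block = [(rng.gauss(0.0, 1.0),) for _ in range(20)]
shifted_block = [(rng.gauss(3.0, 1.0),) for _ in range(20)]

stat_same = block_averaged_mmd2(reference, same_block)
stat_shifted = block_averaged_mmd2(reference, shifted_block)
print(stat_same, stat_shifted)
```

With a fixed block size the cost grows linearly in the amount of background data, which is the property that makes this kind of statistic attractive when the background sample is large.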


Improving your statistical inferences Coursera

@machinelearnbot

About this course: This course aims to help you to draw better statistical inferences from empirical research. First, we will discuss how to correctly interpret p-values, effect sizes, confidence intervals, Bayes Factors, and likelihood ratios, and how these statistics answer different questions you might be interested in. Then, you will learn how to design experiments where the false positive rate is controlled, and how to decide upon the sample size for your study, for example in order to achieve high statistical power. Subsequently, you will learn how to interpret evidence in the scientific literature given widespread publication bias, for example by learning about p-curve analysis. Finally, we will talk about how to do philosophy of science, theory construction, and cumulative science, including how to perform replication studies, why and how to pre-register your experiment, and how to share your results following Open Science principles.


Lift (data mining) - Wikipedia

#artificialintelligence

In data mining and association rule learning, lift is a measure of the performance of a targeting model (association rule) at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random choice targeting model. A targeting model is doing a good job if the response within the target is much better than the average for the population as a whole. Lift is simply the ratio of these values: target response divided by average response. For example, suppose a population has an average response rate of 5%, but a certain model (or rule) has identified a segment with a response rate of 20%. Then that segment would have a lift of 4.0 (20%/5%).
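The arithmetic in the example above can be written as a short sketch. The function name `lift` and its parameter names are chosen here for illustration only.

```python
def lift(target_response_rate: float, average_response_rate: float) -> float:
    """Lift = response rate within the targeted segment / population average."""
    if average_response_rate <= 0:
        raise ValueError("average response rate must be positive")
    return target_response_rate / average_response_rate


# Figures from the example: a 20% segment response against a 5% average.
segment_lift = lift(0.20, 0.05)
print(segment_lift)  # 4.0
```

A lift of 1.0 means the segment responds no better than the population as a whole; values above 1.0 indicate that the model is adding targeting value.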


Editor's picks: The many applications of Machine Learning in banking

#artificialintelligence

Can robots and data stop terror financing through banks? Buying a new printer from ISIS is probably not how many people envision their stationery shopping to proceed. But it was only a month ago that the FBI announced that it had found that a senior Islamic State (ISIS) official had sent money to an alleged operative based in the US via a global financial network that used fake eBay sales to mask payments. This is a timely reminder about how vulnerable businesses can be to terrorist financing. Gurjeet Singh, co-founder and Executive Chairman of Ayasdi, spoke to bobsguide about the challenges of compliance with anti-money laundering, the characteristics of AI, and how AI is vastly improving false positive rates on suspicious security reports. It's a Catch-22: to get financial credit, you need a credit history; to get a credit history, someone has to give you credit.


Global Weisfeiler-Lehman Graph Kernels

arXiv.org Machine Learning

Most state-of-the-art graph kernels only take local graph properties into account, i.e., the kernel is computed with regard to properties of the neighborhood of vertices or other small substructures. On the other hand, kernels that do take global graph properties into account may not scale well to large graph databases. Here we propose to start exploring the space between local and global graph kernels, striking the balance between both worlds. Specifically, we introduce a novel graph kernel based on the $k$-dimensional Weisfeiler-Lehman algorithm. Unfortunately, the $k$-dimensional Weisfeiler-Lehman algorithm scales exponentially in $k$. Consequently, we devise a stochastic version of the kernel with provable approximation guarantees using conditional Rademacher averages. On bounded-degree graphs, it can even be computed in constant time. We support our theoretical results with experiments on several graph classification benchmarks, showing that our kernels often outperform the state-of-the-art in terms of classification accuracies.
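For orientation, the sketch below shows the classic 1-dimensional Weisfeiler-Lehman colour refinement and a subtree-style kernel built from it, which is the kind of local kernel the abstract contrasts with. It is not the $k$-dimensional kernel or the stochastic approximation proposed in the paper; the function names, the adjacency-list graph representation, and the two toy graphs are assumptions for illustration.

```python
from collections import Counter


def wl_features(adjacency, iterations=2):
    """Count vertex colours over successive 1-dimensional WL refinement rounds.

    adjacency maps each vertex to a list of its neighbours. Colours are kept
    as nested tuples so that they are comparable across different graphs.
    """
    colors = {v: 0 for v in adjacency}  # uniform initial colour
    features = Counter((0, c) for c in colors.values())
    for it in range(1, iterations + 1):
        # New colour = own colour plus the sorted multiset of neighbour colours.
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adjacency[v])))
            for v in adjacency
        }
        features.update((it, c) for c in colors.values())
    return features


def wl_kernel(g1, g2, iterations=2):
    """Dot product of the two graphs' colour-count vectors."""
    f1 = wl_features(g1, iterations)
    f2 = wl_features(g2, iterations)
    return sum(count * f2[key] for key, count in f1.items())


# Two toy graphs on three vertices: a triangle and a path.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}

print(wl_kernel(triangle, triangle))  # 27
print(wl_kernel(path, path))          # 19
print(wl_kernel(triangle, path))      # 12
```

Each refinement round only looks at immediate neighbours, so the features describe local neighbourhoods; the $k$-dimensional variant colours $k$-tuples of vertices instead, which is where the exponential cost in $k$ comes from.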


Data Science Simplified Part 10: An Introduction to Classification Models

@machinelearnbot

The world around us is full of classifiers. Classifiers help in preventing spam e-mails. Classifiers help in identifying customers who may churn. Classifiers help in predicting whether it will rain or not. This supervised learning method is ubiquitous in business applications.


The generalised random dot product graph

arXiv.org Machine Learning

Because they appear in virtually every facet of the digital world, there is considerable value in being able to make inference and predictions based on networks. In statistics, such endeavours often start with a probability model, mapping unknown quantities of interest to the data, and, here, one is proposed which strikes a promising balance of generality and interpretability. Our focus is on the simplest case of modelling a graph, that is, a set of nodes and (undirected) edges. To start discussions, we consider first the benefits and drawbacks of a foundational model known as the stochastic blockmodel (Holland et al., 1983). In this model, the nodes of the graph can be grouped into k communities, such that the probability of two nodes forming an edge is dependent only on the two communities involved, and is given by a k × k inter-community edge probability matrix B. The model can be regarded as providing a piecewise constant, or even histogram (Olhede and Wolfe, 2014), approximation to any random graph model satisfying basic exchangeability assumptions (Aldous, 1981; Hoover, 1979). Its generality yet simple interpretation make it a natural candidate for exploratory data analysis and the model is very popular in practice. However, one obvious issue is its discrete structure, in particular, the 'hard' assignment of every node to a single community. We would often prefer to describe node behaviour in a more continuous way. In a seminal paper, Hoff et al. (2002) considered a number of latent position models where, in abstract terms, each node i is mapped to a point X


Evaluating Data Science Projects: A Case Study Critique

@machinelearnbot

I've written two blog posts on evaluation--the broccoli of machine learning. Both types are important not only to data scientists but also to managers and executives, who must evaluate project proposals and results. To managers I would say: It's not necessary to understand the inner workings of a machine learning project, but you should understand whether the right things have been measured and whether the results are suited to the business problem. You need to know whether to believe what data scientists are telling you. To this end, here I'll evaluate a machine learning project report.


Text Compression for Sentiment Analysis via Evolutionary Algorithms

arXiv.org Machine Learning

Can textual data be compressed intelligently without losing accuracy in evaluating sentiment? In this study, we propose a novel evolutionary compression algorithm, PARSEC (PARts-of-Speech for sEntiment Compression), which makes use of Parts-of-Speech tags to compress text in a way that sacrifices minimal classification accuracy when used in conjunction with sentiment analysis algorithms. An analysis of PARSEC with eight commercial and non-commercial sentiment analysis algorithms on twelve English sentiment data sets reveals that accurate compression is possible with (0%, 1.3%, 3.3%) loss in sentiment classification accuracy for (20%, 50%, 75%) data compression with PARSEC using LingPipe, the most accurate of the sentiment algorithms. Other sentiment analysis algorithms are more severely affected by compression. We conclude that significant compression of text data is possible for sentiment analysis depending on the accuracy demands of the specific application and the specific sentiment analysis algorithm used.


WWE No Mercy 2017: Predictions, Match Card For 'Monday Night Raw' PPV

International Business Times

It's hard to remember a non-WrestleMania or SummerSlam pay-per-view that had two bigger matches than the ones headlining WWE No Mercy 2017 Sunday night. The card features Brock Lesnar vs. Braun Strowman and John Cena vs. Roman Reigns, both of which are WrestleMania-worthy matches. Below are predictions for every match on the WWE No Mercy card, which features wrestlers from "Monday Night Raw." It's time to put the strap on Strowman. Sure, he's gotten a big push by WWE, but his rise to the top of the card has also been an organic one. During a year in which every three-hour "Monday Night Raw" hasn't exactly been worth watching, Strowman has consistently been the best part of the show, going from a monster heel into maybe the most popular wrestler on the roster.