Support Vector Machines
An Introduction to MM Algorithms for Machine Learning and Statistical
MM (majorization-minimization) algorithms are an increasingly popular tool for solving optimization problems in machine learning and statistical estimation. This article introduces the MM algorithm framework in general and via three popular example applications: Gaussian mixture regressions, multinomial logistic regressions, and support vector machines. Specific algorithms for the three examples are derived and numerical demonstrations are presented. Theoretical and practical aspects of MM algorithm design are discussed.
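To make the majorize-then-minimize idea concrete, here is a minimal sketch on a problem simpler than the three named in the abstract: finding the value that minimizes a sum of absolute deviations (a median). It is not taken from the article. The function names, the starting point, and the `eps` guard are assumptions of this illustration. Each absolute value is bounded above by a quadratic that touches it at the current iterate, and minimizing that surrogate reduces to a weighted mean.

```python
def objective(xs, theta):
    """Sum of absolute deviations, the function being minimized."""
    return sum(abs(x - theta) for x in xs)


def mm_median(xs, iters=100, eps=1e-12):
    """Minimize sum(|x - theta|) with a quadratic-majorizer MM iteration."""
    theta = sum(xs) / len(xs)  # start from the mean (an arbitrary choice)
    for _ in range(iters):
        # Majorization step: with r = x - theta and r_k the current residual,
        # |r| <= r**2 / (2 * |r_k|) + |r_k| / 2, with equality at r = r_k.
        # eps avoids division by zero when theta lands on a data point
        # (a safeguard assumed for this sketch).
        weights = [1.0 / max(abs(x - theta), eps) for x in xs]
        # Minimization step: the surrogate is a weighted sum of squares,
        # so its minimizer is the weighted mean of the data.
        theta = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return theta


data = [1.0, 2.0, 3.0, 4.0, 100.0]
estimate = mm_median(data)
print(estimate, objective(data, estimate))
```

On this toy data the iterates move from the mean toward the sample median, and the objective at the final estimate is no larger than at the starting point, which is the descent behavior MM schemes are designed to give.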
A Subsequence Interleaving Model for Sequential Pattern Mining
Fowkes, Jaroslav, Sutton, Charles
Recent sequential pattern mining methods have used the minimum description length (MDL) principle to define an encoding scheme which describes an algorithm for mining the most compressing patterns in a database. We present a novel subsequence interleaving model based on a probabilistic model of the sequence database, which allows us to search for the most compressing set of patterns without designing a specific encoding scheme. Our proposed algorithm is able to efficiently mine the most relevant sequential patterns and rank them using an associated measure of interestingness. The efficient inference in our model is a direct result of our use of a structural expectation-maximization framework, in which the expectation-step takes the form of a submodular optimization problem subject to a coverage constraint. We show on both synthetic and real-world datasets that our model mines a set of sequential patterns with low spuriousness and redundancy, and with high interpretability and usefulness in real-world applications. Furthermore, we demonstrate that the quality of the patterns from our approach is comparable to, if not better than, existing state-of-the-art sequential pattern mining algorithms.
Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter?
Stewart, Avaré, Romano, Sara, Kanhabua, Nattiya, Di Martino, Sergio, Siberski, Wolf, Mazzeo, Antonino, Nejdl, Wolfgang, Diaz-Aviles, Ernesto
Social media services such as Twitter are a valuable source of information for decision support systems. Many studies have shown that this also holds for the medical domain, where Twitter is considered a viable tool for public health officials to sift through relevant information for the early detection, management, and control of epidemic outbreaks. This is possible due to the inherent capability of social media services to transmit information faster than traditional channels. However, the majority of current studies have limited their scope to the detection of common, seasonally recurring health events (e.g., Influenza-like Illness), partially due to the noisy nature of Twitter data, which makes outbreak detection and management very challenging. Within the European project M-Eco, we developed a Twitter-based Epidemic Intelligence (EI) system, which is designed to also handle a more general class of unexpected and aperiodic outbreaks. In particular, we faced three main research challenges in this endeavor: 1) dynamic classification to manage terminology evolution of Twitter messages, 2) alert generation to produce reliable outbreak alerts by analyzing the (noisy) tweet time series, and 3) ranking and recommendation to support domain experts in better assessing the generated alerts. In this paper, we empirically evaluate our proposed approach to these challenges using real-world outbreak datasets and a large collection of tweets. We validate our solution with domain experts, describe our experiences, and give a more realistic view of the benefits and issues of analyzing social media for public health.
Will AI replace judges and lawyers?
An artificial intelligence method developed by University College London computer scientists and associates has predicted the judicial decisions of the European Court of Human Rights (ECtHR) with 79% accuracy, according to a paper published Monday, Oct. 24 in PeerJ Computer Science. The method is the first to predict the outcomes of a major international court by automatically analyzing case text using a machine-learning algorithm. "We don't see AI replacing judges or lawyers," said Nikolaos Aletras, who led the study at UCL Computer Science, "but we think they'd find it useful for rapidly identifying patterns in cases that lead to certain outcomes. It could also be a valuable tool for highlighting which cases are most likely to be violations of the European Convention on Human Rights." In developing the method, the team found that judgments by the ECtHR are highly correlated with non-legal (real-world) facts, rather than direct legal arguments, suggesting that judges of the Court are, in the jargon of legal theory, "realists" rather than "formalists."
How to choose algorithms for Microsoft Azure Machine Learning
The answer to the question "What machine learning algorithm should I use?" is always "It depends." It depends on the size, quality, and nature of the data. It depends what you want to do with the answer. It depends on how the math of the algorithm was translated into instructions for the computer you are using. And it depends on how much time you have. Even the most experienced data scientists can't tell which algorithm will perform best before trying them. The Microsoft Azure Machine Learning Algorithm Cheat Sheet helps you choose the right machine learning algorithm for your predictive analytics solutions from the Microsoft Azure Machine Learning library of algorithms.
Mastering Machine Learning With scikit-learn
If you are a software developer who wants to learn how machine learning models work and how to apply them effectively, this book is for you. Familiarity with machine learning fundamentals and Python will be helpful, but is not essential. This book examines machine learning models including logistic regression, decision trees, and support vector machines, and applies them to common problems such as categorizing documents and classifying images. It begins with the fundamentals of machine learning, introducing you to the supervised-unsupervised spectrum, the uses of training and test data, and evaluating models. You will learn how to use generalized linear models in regression problems, as well as solve problems with text and categorical features. You will be acquainted with the use of logistic regression, regularization, and the various loss functions that are used by generalized linear models.
Predicting the Higgs-Boson Signal
The Higgs boson is a landmark discovery that will help us understand the basic nature of the universe. It was first discovered by the ATLAS experiment at the Large Hadron Collider, CERN, in 2012. The Higgs boson decays into two tau particles, giving rise to a small signal buried in background noise. The goal of the Higgs Boson Machine Learning Challenge was to classify the events detected by ATLAS into "tau tau decay of a Higgs boson" versus "background." The first step was to analyze the data and look for missingness. We found that the missing columns follow an interesting pattern that depends on the column "PRI_jet_column", the number of jets, which takes integer values of 0, 1, 2, or 3, with larger values capped at 3. Jets are the experimental signatures of quarks and gluons produced in high-energy processes such as head-on proton-proton collisions. For "PRI_jet_column" 0, there were 10 columns having NULL values (-999); these are the columns that describe the jets, which are undefined when the number of jets is 0. For example, "DER_mass_jet_jet" is the invariant mass of the two jets (undefined if the number of jets is 1 or fewer). So it does not make sense to take into account the attributes of the jet(s), since they don't exist. For "PRI_jet_column" 1, there were 7 columns having NULL values; they describe the jets when their number is 2, so we deleted these 7 columns. For "PRI_jet_column" 2 or 3, we did not delete any columns.
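The per-jet-count column pruning described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the helper name `drop_missing_columns_by_group` is hypothetical, the rows are made-up toy values, and the columns `leading_jet_pt` and `lepton_pt` are invented for the example. Only "PRI_jet_column", "DER_mass_jet_jet", and the -999 sentinel come from the text.

```python
MISSING = -999.0  # sentinel the text reports for undefined (NULL) values


def drop_missing_columns_by_group(rows, group_key):
    """Split rows by group_key, then drop columns that are entirely MISSING
    within each group. Returns {group value: list of pruned row dicts}."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row)

    cleaned = {}
    for key, members in groups.items():
        # A column is "dead" for this group if every row holds the sentinel.
        dead = {col for col in members[0]
                if all(r[col] == MISSING for r in members)}
        cleaned[key] = [{c: v for c, v in r.items() if c not in dead}
                        for r in members]
    return cleaned


# Toy rows (hypothetical values) mimicking the pattern in the text.
rows = [
    {"PRI_jet_column": 0, "DER_mass_jet_jet": MISSING, "leading_jet_pt": MISSING, "lepton_pt": 31.2},
    {"PRI_jet_column": 0, "DER_mass_jet_jet": MISSING, "leading_jet_pt": MISSING, "lepton_pt": 45.0},
    {"PRI_jet_column": 1, "DER_mass_jet_jet": MISSING, "leading_jet_pt": 52.3, "lepton_pt": 28.1},
    {"PRI_jet_column": 1, "DER_mass_jet_jet": MISSING, "leading_jet_pt": 61.0, "lepton_pt": 39.4},
    {"PRI_jet_column": 2, "DER_mass_jet_jet": 120.5, "leading_jet_pt": 70.2, "lepton_pt": 33.3},
    {"PRI_jet_column": 3, "DER_mass_jet_jet": 210.0, "leading_jet_pt": 88.8, "lepton_pt": 50.1},
]

cleaned = drop_missing_columns_by_group(rows, "PRI_jet_column")
for jets, members in sorted(cleaned.items()):
    print(jets, sorted(members[0]))
```

Detecting the all-missing columns from the data, rather than hard-coding a list per jet count, keeps the sketch independent of the exact column names. A separate model could then be fit on each pruned subset, though the text does not say whether the authors did so.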
Predicting judicial decisions of the European Court of Human Rights: a Natural Language Processing perspective
In his prescient work on investigating the potential use of information technology in the legal domain, Lawlor surmised that computers would one day become able to analyse and predict the outcomes of judicial decisions (Lawlor, 1963). According to Lawlor, reliable prediction of the activity of judges would depend on a scientific understanding of the ways that the law and the facts impact on the relevant decision-makers, i.e., the judges. More than fifty years later, the advances in Natural Language Processing (NLP) and Machine Learning (ML) provide us with the tools to automatically analyse legal materials, so as to build successful predictive models of judicial outcomes. In this paper, our particular focus is on the automatic analysis of cases of the European Court of Human Rights (ECtHR or Court). The ECtHR is an international court that rules on individual or, much more rarely, State applications alleging violations by some State Party of the civil and political rights set out in the European Convention on Human Rights (ECHR or Convention).
Not robocop, but robojudge? AI learns to rule in human rights cases
An artificial intelligence system designed to predict the outcomes of cases at the European Court of Human Rights would side with the human judges 79 percent of the time. Researchers at University College London and the University of Sheffield in the U.K., and the University of Pennsylvania in the U.S., described the system in a paper published Monday in PeerJ Computer Science. "We formulated a binary classification task where the input of our classifiers is the textual content extracted from a case and the target output is the actual judgment as to whether there has been a violation of an article of the convention of human rights," wrote the paper's authors, Nikolaos Aletras, Dimitrios Tsarapatsanis, Daniel Preoţiuc-Pietro and Vasileios Lampos. The system examined public court documents relating to 584 cases of violations of articles 3 (prohibiting torture), 6 (right to a fair trial) and 8 (respect for private life) of the European Convention on Human Rights, which has been ratified by 47 European countries. The court documents have a distinctive structure, discussing first the procedure by which the case reached the court, the facts and circumstances of the case, relevant law, and the legal arguments applied.
Robust training on approximated minimal-entropy set
Xie, Tianpei, Nasrabadi, Nasser M., Hero, Alfred O.
Large margin classifiers, such as the support vector machine (SVM) [1] and the maximum entropy discrimination (MED) classifier [2], have enjoyed great popularity in the signal processing and machine learning communities due to their broad applicability, robust performance, and the availability of fast software implementations. When the training data is representative of the test data, the performance of MED/SVM has theoretical guarantees that have been validated in practice [1], [3], [4]. Moreover, since the decision boundary of the MED/SVM is solely defined by a few support vectors, the algorithm can tolerate random feature distortions and perturbations. However, in many real applications, anomalous measurements are inherent to the data set due to strong environmental noise or possible sensor failures. Such anomalies arise in industrial process monitoring, video surveillance, tactical multimodal sensing, robust spectrum sensing [5], [6], and, more generally, any application that involves unattended sensors in difficult environments (Figure 1).