 Support Vector Machines


Enumeration of Distinct Support Vectors for Interactive Decision Making

arXiv.org Machine Learning

In conventional prediction tasks, a machine learning algorithm outputs a single best model that globally optimizes its objective function, which is typically accuracy. As a result, users cannot access other models explicitly. In contrast, multiple model enumeration is attracting increasing interest in non-standard machine learning applications where criteria other than accuracy, e.g., interpretability or fairness, are the main concern and a user may want access to more than one non-optimal but suitable model. In this paper, we propose a K-best model enumeration algorithm for Support Vector Machines (SVM) that, given a dataset S and an integer K>0, enumerates the K best models on S with distinct support vectors in descending order of the objective function values in the dual SVM problem. Based on an analysis of the lattice structure of support vectors, our algorithm efficiently finds the next best model with small latency. This is useful for supporting users' interactive examination of their requirements on the enumerated models. Through experiments on real datasets, we evaluated the efficiency and usefulness of our algorithm.
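For orientation, the textbook soft-margin dual that such an enumeration would rank models by can be written as below. This is the standard form, not necessarily the exact variant used in the paper (which may, for example, drop the bias term and hence the equality constraint). The symbols $\alpha$, $C$, $k$, and $n$ are notation assumed here, not taken from the abstract.

```latex
\max_{\alpha \in \mathbb{R}^n} \;
  \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, k(x_i, x_j)
\quad \text{subject to} \quad
  0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0 .
```

Under this formulation, the support vectors of a solution are the training points with $\alpha_i > 0$, so "models with distinct support vectors" would correspond to solutions whose index sets $\{ i : \alpha_i > 0 \}$ differ.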


Confidence Regions in Wasserstein Distributionally Robust Estimation

arXiv.org Machine Learning

Wasserstein distributionally robust optimization (DRO) estimators are obtained as solutions of min-max problems in which the statistician selects a parameter minimizing the worst-case loss among all probability models within a certain distance (in a Wasserstein sense) from the underlying empirical measure. While motivated by the need to identify model parameters or decision choices that are robust to model uncertainties and misspecification, the Wasserstein DRO estimators recover a wide range of regularized estimators, including square-root LASSO and support vector machines, among others, as particular cases. This paper studies the asymptotic normality of the underlying DRO estimators as well as the properties of an optimal (in a suitable sense) confidence region induced by the Wasserstein DRO formulation.
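The min-max problem described above can be written schematically as follows. The notation is assumed for illustration and does not come from the abstract: $P_n$ is the empirical measure of the $n$ observations, $W$ a Wasserstein distance, $\delta > 0$ the radius of the uncertainty set, and $\ell$ the loss.

```latex
\hat{\theta}_n \in \arg\min_{\theta} \;
  \sup_{P \,:\, W(P, P_n) \le \delta} \;
  \mathbb{E}_{P}\big[ \ell(\theta; X) \big]
```

The inner supremum picks the least favorable distribution within distance $\delta$ of the data, and the outer minimization picks the parameter that performs best against it.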


On the Correctness and Sample Complexity of Inverse Reinforcement Learning

arXiv.org Machine Learning

Inverse reinforcement learning (IRL) is the problem of finding a reward function that generates a given optimal policy for a given Markov Decision Process. This paper presents an algorithm-independent geometric analysis of the IRL problem with finite states and actions. An L1-regularized Support Vector Machine formulation of the IRL problem, motivated by the geometric analysis, is then proposed with the basic objective of the inverse reinforcement problem in mind: to find a reward function that generates a specified optimal policy. The paper further analyzes the proposed formulation of inverse reinforcement learning with $n$ states and $k$ actions, and shows a sample complexity of $O(n^2 \log (nk))$ for recovering a reward function that generates a policy that satisfies Bellman's optimality condition with respect to the true transition probabilities.
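To make "a reward function that generates a given optimal policy" concrete, here is a minimal sketch that checks Bellman's optimality condition for a candidate reward on a toy MDP. It is not the paper's SVM formulation. The state-only reward, the discount factor `gamma`, the two-state transition model, and all function names are assumptions made for illustration.

```python
def q_values(P, R, V, gamma):
    """Q[s][a] = R[s] + gamma * sum over t of P[a][s][t] * V[t]."""
    n, k = len(R), len(P)
    return [[R[s] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
             for a in range(k)] for s in range(n)]


def value_iteration(P, R, gamma, iters=500):
    """Approximate the optimal value function by repeated Bellman backups."""
    V = [0.0] * len(R)
    for _ in range(iters):
        V = [max(row) for row in q_values(P, R, V, gamma)]
    return V


def is_optimal(policy, P, R, gamma, tol=1e-8):
    """True if `policy` picks a maximizing action in every state under R."""
    V = value_iteration(P, R, gamma)
    Q = q_values(P, R, V, gamma)
    return all(Q[s][policy[s]] >= max(Q[s]) - tol for s in range(len(R)))


# Toy MDP (hypothetical): action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0
    [[0.0, 1.0], [1.0, 0.0]],  # action 1
]
gamma = 0.9
policy = [1, 0]  # leave state 0, then stay in state 1

explains_policy = is_optimal(policy, P, [0.0, 1.0], gamma)     # reward in state 1
contradicts_policy = is_optimal(policy, P, [1.0, 0.0], gamma)  # reward in state 0
degenerate = is_optimal(policy, P, [0.0, 0.0], gamma)          # all-zero reward
```

The all-zero reward also passes the check, because every policy is optimal under it. This degeneracy is one reason IRL formulations typically add a margin or a regularizer rather than searching for any consistent reward.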


On Coresets for Regularized Loss Minimization

arXiv.org Machine Learning

We design and mathematically analyze sampling-based algorithms for regularized loss minimization problems that are implementable in popular computational models for large data, in which access to the data is restricted in some way. Our main result is that if the regularizer's effect does not become negligible as the norm of the hypothesis scales, and as the data scales, then a uniform sample of modest size is with high probability a coreset. In the case that the loss function is either logistic regression or soft-margin support vector machines, and the regularizer is one of the common recommended choices, this result implies that a uniform sample of size $O(d \sqrt{n})$ is with high probability a coreset of $n$ points in $\Re^d$. We contrast this upper bound with two lower bounds. The first lower bound shows that our analysis of uniform sampling is tight; that is, a smaller uniform sample will likely not be a coreset. The second lower bound shows that in some sense uniform sampling is close to optimal, as significantly smaller coresets do not generally exist.
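The idea of a uniform sample standing in for the full dataset can be illustrated with a minimal sketch. It draws roughly $d \sqrt{n}$ points and compares a regularized hinge objective on the sample and on the full data for one fixed hypothesis. The constant 1 in the sample size, the penalty `lam * ||w||^2`, the synthetic data, and the function names are all assumptions. A true coreset guarantee concerns every hypothesis simultaneously, which this check does not establish.

```python
import math
import random


def hinge_objective(w, points, lam):
    """Average hinge loss over `points` plus an L2 penalty lam * ||w||^2."""
    loss = sum(max(0.0, 1.0 - y * sum(wi * xi for wi, xi in zip(w, x)))
               for x, y in points) / len(points)
    return loss + lam * sum(wi * wi for wi in w)


def uniform_sample(points, d, rng):
    """Draw ceil(d * sqrt(n)) points uniformly without replacement."""
    n = len(points)
    m = min(n, math.ceil(d * math.sqrt(n)))
    return rng.sample(points, m)


# Synthetic, linearly separable data in d = 2 dimensions (illustrative only).
rng = random.Random(0)
d, n = 2, 10_000
data = []
for _ in range(n):
    x = [rng.gauss(0, 1) for _ in range(d)]
    y = 1 if x[0] + 0.5 * x[1] > 0 else -1
    data.append((x, y))

sample = uniform_sample(data, d, rng)

# Evaluate one fixed hypothesis on the full data and on the sample.
w = [1.0, 0.5]
full = hinge_objective(w, data, lam=0.1)
approx = hinge_objective(w, sample, lam=0.1)
```

Here the sample holds 200 of the 10,000 points, and `approx` can be compared with `full` to see how closely the sample tracks the full objective for this particular `w`.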


Neural-Symbolic Argumentation Mining: an Argument in Favour of Deep Learning and Reasoning

arXiv.org Artificial Intelligence

... from a given document (Lippi 2016). Recent years have seen the development of a large number of techniques in this area, in the wake of the advancements produced by deep learning on the whole research field of natural language processing (NLP). Yet, it is widely recognized that the existing AM systems still have a large margin of improvement, as good results have been obtained with some genres where prior knowledge on the structure of the text eases some AM tasks, but other genres such as legal cases and social media documents still require more work (Cabrio and Villata, 2018). Performing and understanding argumentation requires advanced reasoning capabilities that are natural skills for humans, but which are difficult to learn for a machine. Understanding whether a given piece of evidence supports a given claim, or whether two claims attack each other, are complex problems that humans are able to address thanks to their ability to exploit commonsense knowledge, and to ...

On the other hand, AM has rapidly evolved by exploiting state-of-the-art neural architectures coming from deep learning. So far, these two worlds have progressed largely independently of each other. Only recently, a few works have taken some steps towards the integration of such methods, by applying techniques combining sub-symbolic classifiers with knowledge expressed in the form of rules and constraints to AM. For instance, Niculae et al. (2017) adopted structured support vector machines and recurrent neural networks to collectively classify argument components and their relations in short documents, by hard-coding contextual dependencies and constraints of the argument model in a factor graph. A joint inference approach for argument component classification and relation identification was instead proposed by Persing and Ng (2016), following a pipeline scheme where integer linear programming is used to enforce mathematical constraints on the outcomes of a first-stage set of classifiers.


Meniere's Disease Prognosis by Learning from Transient-Evoked Otoacoustic Emission Signals

arXiv.org Machine Learning

Accurate prognosis of Meniere's disease (MD) is difficult. The aim of this study is to treat it as a machine-learning problem through the analysis of transient-evoked (TE) otoacoustic emission (OAE) data obtained from MD patients. Thirty-three patients who received treatment were recruited, and their distortion-product (DP) OAE, TEOAE, and pure-tone audiograms were taken longitudinally up to 6 months after being diagnosed with MD. In hindsight, the patients were separated into two groups: those whose outer hair cell (OHC) functions eventually recovered, and those whose functions did not. TEOAE signals between 2.5 and 20 ms were dimension-reduced via principal component analysis, and binary classification was performed via a support vector machine. Through cross-validation, we demonstrate that the accuracy of prognosis can reach >80% based on data obtained at the first visit. Further analysis also shows that the TEOAE group delay at 1 kHz and 2 kHz tends to be longer for the group of ears that eventually recovered their OHC functions. The group delay can further be compared between the MD-affected ear and the opposite ear. The present results suggest that TEOAE signals provide abundant information for the prognosis of MD and that this information can be extracted by applying machine-learning techniques.
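The pipeline described above (PCA for dimension reduction, an SVM for binary classification, cross-validation for the accuracy estimate) can be sketched in plain Python as follows. This is only an illustration on synthetic data, not the study's method or data. The single principal component, the linear kernel, the subgradient-descent training, the leave-one-out scheme, and all hyperparameters are assumptions.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_component(X, iters=200):
    """Leading principal direction of centered rows X via power iteration."""
    d = len(X[0])
    v = [1.0 / d ** 0.5] * d
    for _ in range(iters):
        # Multiply v by X^T X without forming the covariance matrix.
        proj = [dot(x, v) for x in X]
        w = [sum(p * x[j] for p, x in zip(proj, X)) for j in range(d)]
        norm = dot(w, w) ** 0.5
        v = [wj / norm for wj in w]
    return v


def train_svm(Z, y, lam=0.01, epochs=200, lr=0.1):
    """Soft-margin linear SVM on 1-D features Z by subgradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        gw, gb = lam * w, 0.0
        for z, t in zip(Z, y):
            if t * (w * z + b) < 1:  # point violates the margin
                gw -= t * z / len(Z)
                gb -= t / len(Z)
        w -= lr * gw
        b -= lr * gb
    return w, b


def loo_accuracy(X, y):
    """Leave-one-out accuracy; centering, PCA and SVM are refit per fold."""
    hits = 0
    for i in range(len(X)):
        Xtr, ytr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        mean = [sum(col) / len(Xtr) for col in zip(*Xtr)]
        Xc = [[a - m for a, m in zip(x, mean)] for x in Xtr]
        v = top_component(Xc)
        w, b = train_svm([dot(x, v) for x in Xc], ytr)
        held_out = [a - m for a, m in zip(X[i], mean)]
        pred = 1 if w * dot(held_out, v) + b >= 0 else -1
        hits += pred == y[i]
    return hits / len(X)


# Synthetic stand-in for signal features: two classes, five dimensions.
rng = random.Random(0)
X, y = [], []
for i in range(30):
    label = 1 if i % 2 == 0 else -1
    informative = [2.0 * label + rng.gauss(0, 0.5) for _ in range(2)]
    noise = [rng.gauss(0, 0.5) for _ in range(3)]
    X.append(informative + noise)
    y.append(label)

accuracy = loo_accuracy(X, y)
```

Refitting the mean, the principal direction, and the classifier inside each fold keeps the held-out sample from leaking into the preprocessing, which matters for small cohorts like the one described.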


High-low level support vector regression prediction approach (HL-SVR) for data modeling with input parameters of unequal sample sizes

arXiv.org Machine Learning

Support vector regression (SVR) has been widely used to reduce the high computational cost of computer simulation. SVR assumes the input parameters have equal sample sizes, but unequal sample sizes are often encountered in engineering practice. To solve this issue, a new prediction approach based on SVR, namely the high-low-level SVR approach (HL-SVR), is proposed in this paper for data modeling with input parameters of unequal sample sizes. The proposed approach consists of low-level SVR models for the input parameters with larger sample sizes and a high-level SVR model for the input parameters with smaller sample sizes. For each training point of the input parameters with smaller sample sizes, one low-level SVR model is built based on its corresponding input parameters with larger sample sizes and their responses of interest. The high-level SVR model is built based on the responses obtained from the low-level SVR models and the input parameters with smaller sample sizes. Several numerical examples are used to validate the performance of HL-SVR. The experimental results indicate that HL-SVR can produce more accurate prediction results than conventional SVR. The proposed approach is applied to the stress analysis of a dental implant, in which the structural parameters have massive samples but the implant material can only be selected from Ti and several of its alloys. The prediction performance of the proposed approach is much better than that of conventional SVR. The proposed approach can be used for the design, optimization and analysis of engineering systems with input parameters of unequal sample sizes.


Infusing domain knowledge in AI-based "black box" models for better explainability with application in bankruptcy prediction

arXiv.org Artificial Intelligence

Although "black box" models such as Artificial Neural Networks, Support Vector Machines, and Ensemble Approaches continue to show superior performance in many disciplines, their adoption in sensitive disciplines (e.g., finance, healthcare) is questionable due to the models' lack of interpretability and explainability. In fact, future adoption of "black box" models is difficult because of the recent "right of explanation" rule by the European Union, under which a user can ask for an explanation behind an algorithmic decision, and the bill newly proposed by the US government, the "Algorithmic Accountability Act", which would require companies to assess their machine learning systems for bias and discrimination and take corrective measures. Top bankruptcy prediction models are AI-based and are in need of better explainability, that is, the extent to which the internal working mechanisms of an AI system can be explained in human terms. Although explainable artificial intelligence is an emerging field of research, infusing domain knowledge for better explainability might be a possible solution. In this work, we demonstrate a way to collect and infuse domain knowledge into a "black box" model for bankruptcy prediction. Our experiments indicate that infused domain knowledge makes the output of the black box model more interpretable and explainable.


TMLab SRPOL at SemEval-2019 Task 8: Fact Checking in Community Question Answering Forums

arXiv.org Machine Learning

The article describes our submission to SemEval 2019 Task 8 on Fact-Checking in Community Forums. The systems under discussion participated in Subtask A: deciding whether a question asks for factual information, asks for an opinion or advice, or is just socializing. Our primary submission was ranked second among all participants in the official evaluation phase. The article presents our primary solution: a Deeply Regularized Residual Neural Network (DRR NN) with Universal Sentence Encoder embeddings. This is followed by a description of two contrastive solutions based on ensemble methods.


A Music Classification Model based on Metric Learning and Feature Extraction from MP3 Audio Files

arXiv.org Machine Learning

The development of models for learning music similarity and feature extraction from audio media files is an increasingly important task for the entertainment industry. This work proposes a novel music classification model based on metric learning and feature extraction from MP3 audio files. The metric learning process learns a set of parameterized distances, employing a structured prediction approach, from a set of MP3 audio files containing several music genres. The main objective of this work is to make it possible to learn a personalized metric for each customer. To extract the acoustic information, we use Mel-Frequency Cepstral Coefficients (MFCC) and perform dimensionality reduction with Principal Component Analysis. We assess the model's validity by performing a set of experiments and comparing the training and testing results with baseline algorithms, such as K-means and the Soft Margin Linear Support Vector Machine (SVM). Experiments show promising results and encourage the future development of an online version of the learning model.