Support Vector Machines
Pain Intensity Estimation from Mobile Video Using 2D and 3D Facial Keypoints
Lee, Matthew, Kennedy, Lyndon, Girgensohn, Andreas, Wilcox, Lynn, Lee, John Song En, Tan, Chin Wen, Sng, Ban Leong
For the more than 300 million surgeries performed worldwide every year, managing post-surgical pain is critical for successful surgical outcomes. Pain is the most prominent post-surgical concern, with an estimated 86% of surgical patients in the United States experiencing pain after surgery, with 75% of these patients reporting at least moderate to extreme pain [1]. Higher postoperative pain is associated with more postoperative complications [2], indicating the importance of pain management. Furthermore, the use of opioid analgesics is a powerful tool for managing pain but can pose risks of adverse drug events (experienced by 10% of surgical patients), leading to prolonged length of stay, high hospitalization costs, and potentially addiction [3]. Thus, regular and careful pain assessment is important for balancing between pain relief and potential side effects of powerful opioid analgesics [4]. However, one of the challenges of pain management is accurately assessing the pain level of patients. Pain is inherently subjective, and the personal experience of pain is difficult to observe and measure objectively by those not experiencing it [5] (e.g., care providers). The standard practice used in clinical care requires patients to self-report their pain intensity level using a numeric or visual scale, such as the popular Numerical Pain Rating Scale [6].
Improved Complexities for Stochastic Conditional Gradient Methods under Interpolation-like Conditions
Xiao, Tesi, Balasubramanian, Krishnakumar, Ghadimi, Saeed
In a machine learning setup, the function F could be interpreted as the loss function associated with a sample ξ and the function f could represent the risk, which is defined as the expected loss. Such constrained stochastic optimization problems arise frequently in statistical machine learning applications. Conditional gradient algorithm, also called as Frank-Wolfe algorithm, is an efficient method for solving constrained optimization problems of the form in (1) due to their projection-free nature [Jag13, HJN15, FGM17, LPZZ17, BZK18, RDLS18]. In each step of the conditional gradient method, it is only required to minimize a linear objective over the set Ω. This operation could be implemented efficiently for a variety of sets arising in statistical machine learning, compared to the operation of projecting on to the set Ω, which is required for example by the projected gradient method. Hence, conditional gradient method has regained popularity in the last decade in the optimization and machine learning community.
Infinite Feature Selection: A Graph-based Feature Filtering Approach
Roffo, Giorgio, Melzi, Simone, Castellani, Umberto, Vinciarelli, Alessandro, Cristani, Marco
We propose a filtering feature selection framework that considers subsets of features as paths in a graph, where a node is a feature and an edge indicates pairwise (customizable) relations among features, dealing with relevance and redundancy principles. By two different interpretations (exploiting properties of power series of matrices and relying on Markov chains fundamentals) we can evaluate the values of paths (i.e., feature subsets) of arbitrary lengths, eventually go to infinite, from which we dub our framework Infinite Feature Selection (Inf-FS). Going to infinite allows to constrain the computational complexity of the selection process, and to rank the features in an elegant way, that is, considering the value of any path (subset) containing a particular feature. We also propose a simple unsupervised strategy to cut the ranking, so providing the subset of features to keep. In the experiments, we analyze diverse settings with heterogeneous features, for a total of 11 benchmarks, comparing against 18 widely-known comparative approaches. The results show that Inf-FS behaves better in almost any situation, that is, when the number of features to keep are fixed a priori, or when the decision of the subset cardinality is part of the process.
What is Boosting in Machine Learning?
In this post, we will see a simple and intuitive explanation of Boosting algorithms: what they are, why they are so powerful, some of the different types, and how they are trained and used to make predictions. We will avoid all the heavy maths and go for a clear, simple, but in depth explanation that can be easily understood. However, additional material and resources will be left at the end of the post, in case you want to dive further into the topic. Traditionally, building a Machine Learning application consisted on taking a single learner, like a Logistic Regressor, a Decision Tree, Support Vector Machine, or an Artificial Neural Network, feeding it data, and teaching it to perform a certain task through this data. Then ensemble methods were born, which involve using many learners to enhance the performance of any single one of them individually.
Weston-Watkins Hinge Loss and Ordered Partitions
Wang, Yutong, Scott, Clayton D.
Multiclass extensions of the support vector machine (SVM) have been formulated in a variety of ways. A recent empirical comparison of nine such formulations [Do\v{g}an et al. 2016] recommends the variant proposed by Weston and Watkins (WW), despite the fact that the WW-hinge loss is not calibrated with respect to the 0-1 loss. In this work we introduce a novel discrete loss function for multiclass classification, the ordered partition loss, and prove that the WW-hinge loss is calibrated with respect to this loss. We also argue that the ordered partition loss is maximally informative among discrete losses satisfying this property. Finally, we apply our theory to justify the empirical observation made by Do\v{g}an et al. that the WW-SVM can work well even under massive label noise, a challenging setting for multiclass SVMs.
Safety-guaranteed Reinforcement Learning based on Multi-class Support Vector Machine
Kim, Kwangyeon, Gupta, Akshita, Choi, Hong-Cheol, Hwang, Inseok
Several works have addressed the problem of incorporating constraints in the reinforcement learning (RL) framework, however majority of them can only guarantee the satisfaction of soft constraints. In this work, we address the problem of satisfying hard state constraints in a model-free RL setting with the deterministic system dynamics. The proposed algorithm is developed for the discrete state and action space and utilizes a multi-class support vector machine (SVM) to represent the policy. The state constraints are incorporated in the SVM optimization framework to derive an analytical solution for determining the policy parameters. This final policy converges to a solution which is guaranteed to satisfy the constraints. Additionally, the proposed formulation adheres to the Q-learning framework and thus, also guarantees convergence to the optimal solution. The algorithm is demonstrated with multiple example problems.
Smartphone Transportation Mode Recognition Using a Hierarchical Machine Learning Classifier and Pooled Features From Time and Frequency Domains
Ashqar, Huthaifa I., Almannaa, Mohammed H., Elhenawy, Mohammed, Rakha, Hesham A., House, Leanna
This paper develops a novel two-layer hierarchical classifier that increases the accuracy of traditional transportation mode classification algorithms. This paper also enhances classification accuracy by extracting new frequency domain features. Many researchers have obtained these features from global positioning system data; however, this data was excluded in this paper, as the system use might deplete the smartphone's battery and signals may be lost in some areas. Our proposed two-layer framework differs from previous classification attempts in three distinct ways: 1) the outputs of the two layers are combined using Bayes' rule to choose the transportation mode with the largest posterior probability; 2) the proposed framework combines the new extracted features with traditionally used time domain features to create a pool of features; and 3) a different subset of extracted features is used in each layer based on the classified modes. Several machine learning techniques were used, including k-nearest neighbor, classification and regression tree, support vector machine, random forest, and a heterogeneous framework of random forest and support vector machine. Results show that the classification accuracy of the proposed framework outperforms traditional approaches. Transforming the time domain features to the frequency domain also adds new features in a new space and provides more control on the loss of information. Consequently, combining the time domain and the frequency domain features in a large pool and then choosing the best subset results in higher accuracy than using either domain alone. The proposed two-layer classifier obtained a maximum classification accuracy of 97.02%.
A Novel Meta-Heuristic Optimization Algorithm Inspired by the Spread of Viruses
According to the no-free-lunch theorem, there is no single meta-heuristic algorithm that can optimally solve all optimization problems. This motivates many researchers to continuously develop new optimization algorithms. In this paper, a novel nature-inspired meta-heuristic optimization algorithm called virus spread optimization (VSO) is proposed. VSO loosely mimics the spread of viruses among hosts, and can be effectively applied to solving many challenging and continuous optimization problems. We devise a new representation scheme and viral operations that are radically different from previously proposed virus-based optimization algorithms. First, the viral RNA of each host in VSO denotes a potential solution for which different viral operations will help to diversify the searching strategies in order to largely enhance the solution quality. In addition, an imported infection mechanism, inheriting the searched optima from another colony, is introduced to possibly avoid the prematuration of any potential solution in solving complex problems. VSO has an excellent capability to conduct adaptive neighborhood searches around the discovered optima for achieving better solutions. Furthermore, with a flexible infection mechanism, VSO can quickly escape from local optima. To clearly demonstrate both its effectiveness and efficiency, VSO is critically evaluated on a series of well-known benchmark functions. Moreover, VSO is validated on its applicability through two real-world examples including the financial portfolio optimization and optimization of hyper-parameters of support vector machines for classification problems. The results show that VSO has attained superior performance in terms of solution fitness, convergence rate, scalability, reliability, and flexibility when compared to those results of the conventional as well as state-of-the-art meta-heuristic optimization algorithms.
SLIC-UAV: A Method for monitoring recovery in tropical restoration projects through identification of signature species using UAVs
Williams, Jonathan, Schönlieb, Carola-Bibiane, Swinfield, Tom, Irawan, Bambang, Achmad, Eva, Zudhi, Muhammad, Habibi, null, Gemita, Elva, Coomes, David A.
Logged forests cover four million square kilometres of the tropics and restoring these forests is essential if we are to avoid the worst impacts of climate change, yet monitoring recovery is challenging. Tracking the abundance of visually identifiable, early-successional species enables successional status and thereby restoration progress to be evaluated. Here we present a new pipeline, SLIC-UAV, for processing Unmanned Aerial Vehicle (UAV) imagery to map early-successional species in tropical forests. The pipeline is novel because it comprises: (a) a time-efficient approach for labelling crowns from UAV imagery; (b) machine learning of species based on spectral and textural features within individual tree crowns, and (c) automatic segmentation of orthomosaiced UAV imagery into 'superpixels', using Simple Linear Iterative Clustering (SLIC). Creating superpixels reduces the dataset's dimensionality and focuses prediction onto clusters of pixels, greatly improving accuracy. To demonstrate SLIC-UAV, support vector machines and random forests were used to predict the species of hand-labelled crowns in a restoration concession in Indonesia. Random forests were most accurate at discriminating species for whole crowns, with accuracy ranging from 79.3% when mapping five common species, to 90.5% when mapping the three most visually-distinctive species. In contrast, support vector machines proved better for labelling automatically segmented superpixels, with accuracy ranging from 74.3% to 91.7% for the same species. Models were extended to map species across 100 hectares of forest. The study demonstrates the power of SLIC-UAV for mapping characteristic early-successional tree species as an indicator of successional stage within tropical forest restoration areas. Continued effort is needed to develop easy-to-implement and low-cost technology to improve the affordability of project management.
Predicting Engagement in Video Lectures
Bulathwela, Sahan, Pérez-Ortiz, María, Lipani, Aldo, Yilmaz, Emine, Shawe-Taylor, John
The explosion of Open Educational Resources (OERs) in the recent years creates the demand for scalable, automatic approaches to process and evaluate OERs, with the end goal of identifying and recommending the most suitable educational materials for learners. We focus on building models to find the characteristics and features involved in context-agnostic engagement (i.e. population-based), a seldom researched topic compared to other contextualised and personalised approaches that focus more on individual learner engagement. Learner engagement, is arguably a more reliable measure than popularity/number of views, is more abundant than user ratings and has also been shown to be a crucial component in achieving learning outcomes. In this work, we explore the idea of building a predictive model for population-based engagement in education. We introduce a novel, large dataset of video lectures for predicting context-agnostic engagement and propose both cross-modal and modality-specific feature sets to achieve this task. We further test different strategies for quantifying learner engagement signals. We demonstrate the use of our approach in the case of data scarcity. Additionally, we perform a sensitivity analysis of the best performing model, which shows promising performance and can be easily integrated into an educational recommender system for OERs.