Goto

Collaborating Authors

 Support Vector Machines


Max-Margin Invariant Features from Transformed Unlabelled Data

Neural Information Processing Systems

The study of representations invariant to common transformations of the data is important to learning. Most techniques have focused on local approximate invariance implemented within expensive optimization frameworks lacking explicit theoretical guarantees. In this paper, we study kernels that are invariant to a unitary group while having theoretical guarantees in addressing the important practical issue of unavailability of transformed versions of labelled data. A problem we call the Unlabeled Transformation Problem which is a special form of semisupervised learning and one-shot learning. We present a theoretically motivated alternate approach to the invariant kernel SVM based on which we propose Max-Margin Invariant Features (MMIF) to solve this problem. As an illustration, we design an framework for face recognition and demonstrate the efficacy of our approach on a large scale semi-synthetic dataset with 153,000 images and a new challenging protocol on Labelled Faces in the Wild (LFW) while out-performing strong baselines.


Process-constrained batch Bayesian optimisation

Neural Information Processing Systems

Prevailing batch Bayesian optimisation methods allow all control variables to be freely altered at each iteration. Real-world experiments, however, often have physical limitations making it time-consuming to alter all settings for each recommendation in a batch. This gives rise to a unique problem in BO: in a recommended batch, a set of variables that are expensive to experimentally change need to be fixed, while the remaining control variables can be varied. We formulate this as a process-constrained batch Bayesian optimisation problem. We propose two algorithms, pc-BO(basic) and pc-BO(nested).


Review Non-convex Optimization Method for Machine Learning

arXiv.org Artificial Intelligence

Non-convex optimization is a critical tool in advancing machine learning, especially for complex models like deep neural networks and support vector machines. Despite challenges such as multiple local minima and saddle points, non-convex techniques offer various pathways to reduce computational costs. These include promoting sparsity through regularization, efficiently escaping saddle points, and employing subsampling and approximation strategies like stochastic gradient descent. Additionally, non-convex methods enable model pruning and compression, which reduce the size of models while maintaining performance. By focusing on good local minima instead of exact global minima, non-convex optimization ensures competitive accuracy with faster convergence and lower computational overhead. This paper examines the key methods and applications of non-convex optimization in machine learning, exploring how it can lower computation costs while enhancing model performance. Furthermore, it outlines future research directions and challenges, including scalability and generalization, that will shape the next phase of non-convex optimization in machine learning.


Overpredictive Signal Analytics in Federated Learning: Algorithms and Analysis

arXiv.org Machine Learning

Edge signal processing facilitates distributed learning and inference in the client-server model proposed in federated learning. In traditional machine learning, clients (IoT devices) that acquire raw signal samples can aid a data center (server) learn a global signal model by pooling these distributed samples at a third-party location. Despite the promising capabilities of IoTs, these distributed deployments often face the challenge of sensitive private data and communication rate constraints. This necessitates a learning approach that communicates a processed approximation of the distributed samples instead of the raw signals. Such a decentralized learning approach using signal approximations will be termed distributed signal analytics in this work. Overpredictive signal approximations may be desired for distributed signal analytics, especially in network demand (capacity) planning applications motivated by federated learning. In this work, we propose algorithms that compute an overpredictive signal approximation at the client devices using an efficient convex optimization framework. Tradeoffs between communication cost, sampling rate, and the signal approximation error are quantified using mathematical analysis. We also show the performance of the proposed distributed algorithms on a publicly available residential energy consumption dataset.


Heterogeneous sound classification with the Broad Sound Taxonomy and Dataset

arXiv.org Artificial Intelligence

Automatic sound classification has a wide range of applications in machine listening, enabling context-aware sound processing and understanding. This paper explores methodologies for automatically classifying heterogeneous sounds characterized by high intra-class variability. Our study evaluates the classification task using the Broad Sound Taxonomy, a two-level taxonomy comprising 28 classes designed to cover a heterogeneous range of sounds with semantic distinctions tailored for practical user applications. We construct a dataset through manual annotation to ensure accuracy, diverse representation within each class and relevance in real-world scenarios. We compare a variety of both traditional and modern machine learning approaches to establish a baseline for the task of heterogeneous sound classification. We investigate the role of input features, specifically examining how acoustically derived sound representations compare to embeddings extracted with pre-trained deep neural networks that capture both acoustic and semantic information about sounds. Experimental results illustrate that audio embeddings encoding acoustic and semantic information achieve higher accuracy in the classification task. After careful analysis of classification errors, we identify some underlying reasons for failure and propose actions to mitigate them. The paper highlights the need for deeper exploration of all stages of classification, understanding the data and adopting methodologies capable of effectively handling data complexity and generalizing in real-world sound environments.


Using fractal dimension to predict the risk of intra cranial aneurysm rupture with machine learning

arXiv.org Artificial Intelligence

Intracranial aneurysms (IAs) that rupture result in significant morbidity and mortality. While traditional risk models such as the PHASES score are useful in clinical decision making, machine learning (ML) models offer the potential to provide more accuracy. In this study, we compared the performance of four different machine learning algorithms Random Forest (RF), XGBoost (XGB), Support Vector Machine (SVM), and Multi Layer Perceptron (MLP) on clinical and radiographic features to predict rupture status of intracranial aneurysms. Among the models, RF achieved the highest accuracy (85%) with balanced precision and recall, while MLP had the lowest overall performance (accuracy of 63%). Fractal dimension ranked as the most important feature for model performance across all models.


A Proximal Modified Quasi-Newton Method for Nonsmooth Regularized Optimization

arXiv.org Artificial Intelligence

We develop R2N, a modified quasi-Newton method for minimizing the sum of a $\mathcal{C}^1$ function $f$ and a lower semi-continuous prox-bounded $h$. Both $f$ and $h$ may be nonconvex. At each iteration, our method computes a step by minimizing the sum of a quadratic model of $f$, a model of $h$, and an adaptive quadratic regularization term. A step may be computed by a variant of the proximal-gradient method. An advantage of R2N over trust-region (TR) methods is that proximal operators do not involve an extra TR indicator. We also develop the variant R2DH, in which the model Hessian is diagonal, which allows us to compute a step without relying on a subproblem solver when $h$ is separable. R2DH can be used as standalone solver, but also as subproblem solver inside R2N. We describe non-monotone variants of both R2N and R2DH. Global convergence of a first-order stationarity measure to zero holds without relying on local Lipschitz continuity of $\nabla f$, while allowing model Hessians to grow unbounded, an assumption particularly relevant to quasi-Newton models. Under Lipschitz-continuity of $\nabla f$, we establish a tight worst-case complexity bound of $O(1 / \epsilon^{2/(1 - p)})$ to bring said measure below $\epsilon > 0$, where $0 \leq p < 1$ controls the growth of model Hessians. The latter must not diverge faster than $|\mathcal{S}_k|^p$, where $\mathcal{S}_k$ is the set of successful iterations up to iteration $k$. When $p = 1$, we establish the tight exponential complexity bound $O(\exp(c \epsilon^{-2}))$ where $c > 0$ is a constant. We describe our Julia implementation and report numerical experience on a basis-pursuit problem, image denoising, minimum-rank matrix completion, and a nonlinear support vector machine. In particular, the minimum-rank problem cannot be solved directly at this time by a TR approach as corresponding proximal operators are not known analytically.


Decoding Android Malware with a Fraction of Features: An Attention-Enhanced MLP-SVM Approach

arXiv.org Artificial Intelligence

The escalating sophistication of Android malware poses significant challenges to traditional detection methods, necessitating innovative approaches that can efficiently identify and classify threats with high precision. This paper introduces a novel framework that synergistically integrates an attention-enhanced Multi-Layer Perceptron (MLP) with a Support Vector Machine (SVM) to make Android malware detection and classification more effective. By carefully analyzing a mere 47 features out of over 9,760 available in the comprehensive CCCS-CIC-AndMal-2020 dataset, our MLP-SVM model achieves an impressive accuracy over 99% in identifying malicious applications. The MLP, enhanced with an attention mechanism, focuses on the most discriminative features and further reduces the 47 features to only 14 components using Linear Discriminant Analysis (LDA). Despite this significant reduction in dimensionality, the SVM component, equipped with an RBF kernel, excels in mapping these components to a high-dimensional space, facilitating precise classification of malware into their respective families. Rigorous evaluations, encompassing accuracy, precision, recall, and F1-score metrics, confirm the superiority of our approach compared to existing state-of-the-art techniques. The proposed framework not only significantly reduces the computational complexity by leveraging a compact feature set but also exhibits resilience against the evolving Android malware landscape.


Shifting from endangerment to rebirth in the Artificial Intelligence Age: An Ensemble Machine Learning Approach for Hawrami Text Classification

arXiv.org Artificial Intelligence

Hawrami, a dialect of Kurdish, is classified as an endangered language as it suffers from the scarcity of data and the gradual loss of its speakers. Natural Language Processing projects can be used to partially compensate for data availability for endangered languages/dialects through a variety of approaches, such as machine translation, language model building, and corpora development. Similarly, NLP projects such as text classification are in language documentation. Several text classification studies have been conducted for Kurdish, but they were mainly dedicated to two particular dialects: Sorani (Central Kurdish) and Kurmanji (Northern Kurdish). In this paper, we introduce various text classification models using a dataset of 6,854 articles in Hawrami labeled into 15 categories by two native speakers. We use K-nearest Neighbor (KNN), Linear Support Vector Machine (Linear SVM), Logistic Regression (LR), and Decision Tree (DT) to evaluate how well those methods perform the classification task. The results indicate that the Linear SVM achieves a 96% of accuracy and outperforms the other approaches.


Efficient Collision Detection Framework for Enhancing Collision-Free Robot Motion

arXiv.org Artificial Intelligence

Fast and efficient collision detection is essential for motion generation in robotics. In this paper, we propose an efficient collision detection framework based on the Signed Distance Field (SDF) of robots, seamlessly integrated with a self-collision detection module. Firstly, we decompose the robot's SDF using forward kinematics and leverage multiple extremely lightweight networks in parallel to efficiently approximate the SDF. Moreover, we introduce support vector machines to integrate the self-collision detection module into the framework, which we refer to as the SDF-SC framework. Using statistical features, our approach unifies the representation of collision distance for both SDF and self-collision detection. During this process, we maintain and utilize the differentiable properties of the framework to optimize collision-free robot trajectories. Finally, we develop a reactive motion controller based on our framework, enabling real-time avoidance of multiple dynamic obstacles. While maintaining high accuracy, our framework achieves inference speeds up to five times faster than previous methods. Experimental results on the Franka robotic arm demonstrate the effectiveness of our approach.